1. 26 Nov, 2008 4 commits
    • fuse: add file kernel handle · acf99433
      Tejun Heo committed
      The file handle, fuse_file->fh, is an opaque value supplied by the userland
      FUSE server, and its uniqueness is not guaranteed.  Add a file kernel handle,
      fuse_file->kh, which is allocated by the kernel on file allocation and is
      guaranteed to be unique.
      
      This will be used by poll to match a notification to the respective file,
      but it can be used for other purposes where a unique file handle is
      necessary.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
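To illustrate the idea of a kernel-allocated unique handle next to the opaque server-supplied one, here is a minimal userspace sketch. It is not the kernel code: `struct conn`, `struct open_file`, `kh_counter`, and `open_file_alloc()` are hypothetical names, and a C11 atomic counter stands in for whatever locking the kernel actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-connection state. */
struct conn {
    atomic_uint_fast64_t kh_counter; /* number of handles handed out so far */
};

/* Hypothetical stand-in for fuse_file: fh is opaque and chosen by the
 * server, kh is assigned locally and never repeats within one connection. */
struct open_file {
    uint64_t fh;
    uint64_t kh;
};

struct open_file *open_file_alloc(struct conn *c, uint64_t server_fh)
{
    assert(c != NULL);

    struct open_file *f = malloc(sizeof(*f));
    if (f == NULL)
        return NULL;

    f->fh = server_fh;
    /* atomic_fetch_add returns the previous value, so handles start at 1. */
    f->kh = (uint64_t)atomic_fetch_add(&c->kh_counter, 1) + 1;
    return f;
}
```

Two files opened with the same server-supplied `fh` still receive different `kh` values, which is the property a notification needs in order to be matched to exactly one file.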
    • fuse: implement ioctl support · 59efec7b
      Tejun Heo committed
      Generic ioctl support is tricky to implement because only the ioctl
      implementation itself knows which memory regions need to be read
      and/or written.  To support this, fuse client can request retry of
      ioctl specifying memory regions to read and write.  Deep copying
      (nested pointers) can be implemented by retrying multiple times
      resolving one depth of dereference at a time.
      
      For security and cleanliness, the ioctl implementation has a restricted
      mode in which the kernel determines data transfer directions and sizes
      using the _IOC_*() macros on the ioctl command.  In this mode, retry is
      not allowed.
      
      For all FUSE servers, restricted mode is enforced.  Unrestricted ioctl
      will be used by CUSE.
      
      Please read the comment on top of fs/fuse/file.c::fuse_file_do_ioctl()
      for more information.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
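The restricted mode described above derives transfer direction and size from the command number alone. A minimal Linux userspace sketch of that decoding follows, using the `_IOC_DIR()` and `_IOC_SIZE()` macros from `<linux/ioctl.h>`. The `struct ioctl_xfer` type and `restricted_xfer()` function are hypothetical names for illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/ioctl.h>  /* _IOC_DIR, _IOC_SIZE, _IOC_READ, _IOC_WRITE */

/* Hypothetical result type: bytes moved in each direction. */
struct ioctl_xfer {
    size_t in_size;   /* bytes copied from the caller to the server */
    size_t out_size;  /* bytes copied from the server back to the caller */
};

/* Decode direction and size from the ioctl command encoding only.
 * _IOC_WRITE means the caller writes data to the implementation;
 * _IOC_READ means the caller reads data back from it. */
struct ioctl_xfer restricted_xfer(unsigned int cmd)
{
    struct ioctl_xfer x = { 0, 0 };

    if (_IOC_DIR(cmd) & _IOC_WRITE)
        x.in_size = _IOC_SIZE(cmd);
    if (_IOC_DIR(cmd) & _IOC_READ)
        x.out_size = _IOC_SIZE(cmd);
    return x;
}
```

For example, a command built with `_IOWR('x', 3, char[16])` decodes to 16 bytes in each direction, while one built with `_IO('x', 4)` transfers nothing. Because everything is known from the command number, no retry round trip is needed in this mode.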
    • fuse: don't let fuse_req->end() put the base reference · e9bb09dd
      Tejun Heo committed
      fuse_req->end() was supposed to put the base reference but there's
      no reason why it should.  It only makes things more complex.  Move it
      out of ->end() and make it the responsibility of request_end().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: style fixes · 1729a16c
      Miklos Szeredi committed
      Fix coding style errors reported by checkpatch and others.  Update
      copyright date to 2008.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  2. 02 Nov, 2008 1 commit
    • saner FASYNC handling on file close · 233e70f4
      Al Viro committed
      As it is, all instances of ->release() for files that have ->fasync()
      need to remember to evict file from fasync lists; forgetting that
      creates a hole and we actually have a bunch that *does* forget.
      
      So let's keep our lives simple - let __fput() check FASYNC in
      file->f_flags and call ->fasync() there if it's been set.  And lose that
      crap in ->release() instances - leaving it there is still valid, but we
      don't have to bother anymore.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 23 Oct, 2008 1 commit
  4. 16 Oct, 2008 4 commits
  5. 14 Oct, 2008 1 commit
  6. 27 Jul, 2008 4 commits
  7. 26 Jul, 2008 5 commits
  8. 18 Jun, 2008 1 commit
  9. 25 May, 2008 1 commit
  10. 13 May, 2008 1 commit
    • fuse: add flag to turn on big writes · 78bb6cb9
      Miklos Szeredi committed
      Prior to 2.6.26 fuse only supported single page write requests.  In theory all
      fuse filesystems should be able to support writes bigger than 4k, as there's
      nothing in the API to prevent it.  Unfortunately there's a known case in
      NTFS-3G where big writes cause filesystem corruption.  There could also be
      other filesystems, where the lack of testing with big write requests would
      result in bugs.
      
      To prevent such problems on a kernel upgrade, disable big writes by default,
      but let filesystems set a flag to turn it on.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Szabolcs Szakacsits <szaka@ntfs-3g.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 01 May, 2008 1 commit
  12. 30 Apr, 2008 9 commits
    • fuse: fix sparse warnings · 4dbf930e
      Miklos Szeredi committed
      fs/fuse/dev.c:306:2: warning: context imbalance in 'wait_answer_interruptible' - unexpected unlock
      fs/fuse/dev.c:361:2: warning: context imbalance in 'request_wait_answer' - unexpected unlock
      fs/fuse/dev.c:1002:4: warning: context imbalance in 'end_io_requests' - unexpected unlock
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: fix race in llseek · 5559b8f4
      Miklos Szeredi committed
      Fuse doesn't use i_mutex to protect setting i_size, and so
      generic_file_llseek() can be racy: it doesn't use i_size_read().
      
      So add a fuse-specific llseek method, which does use i_size_read().
      
      [akpm@linux-foundation.org: make `retval' loff_t]
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
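The shape of a filesystem-specific llseek can be sketched in userspace as follows. This only models the seek arithmetic with the file size taken from a single snapshot. The protection against torn reads that `i_size_read()` provides in the kernel is not modeled, and `struct sketch_file` and `sketch_llseek()` are hypothetical names, not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical, simplified file state for illustration. */
struct sketch_file {
    int64_t pos;       /* current file position */
    int64_t size;      /* stand-in for a snapshot taken via i_size_read() */
    int64_t max_bytes; /* stand-in for the largest offset the fs allows */
};

/* Returns the new position, or -EINVAL if the target is out of range.
 * Overflow of the additions is ignored in this sketch. */
int64_t sketch_llseek(struct sketch_file *f, int64_t offset, int origin)
{
    assert(f != NULL);

    switch (origin) {
    case SEEK_SET:
        break;
    case SEEK_CUR:
        offset += f->pos;
        break;
    case SEEK_END:
        offset += f->size;   /* size is read exactly once */
        break;
    default:
        return -EINVAL;
    }

    if (offset < 0 || offset > f->max_bytes)
        return -EINVAL;

    f->pos = offset;
    return offset;
}
```

The position is left unchanged when the request is rejected, so a failed seek has no side effects.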
    • fuse: fix node ID type · b48badf0
      Miklos Szeredi committed
      Node ID is 64bit but it is passed as unsigned long to some functions.  This
      breakage wasn't noticed, because libfuse uses unsigned long too.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: fix max i/o size calculation · e5d9a0df
      Miklos Szeredi committed
      Fix a bug that Werner Baumann reported: fuse can send a bigger write request
      than the maximum specified.  This only affected direct_io operation.
      
      In addition set a sane minimum for the max_read and max_write tunables, so I/O
      always makes some progress.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: update file size on short read · 5c5c5e51
      Miklos Szeredi committed
      If the READ request returned a short count, then either
      
        - cached size is incorrect
        - filesystem is buggy, as short reads are only allowed on EOF
      
      So assume that the size is wrong and refresh it, so that cached read() doesn't
      zero fill the missing chunk.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: implement perform_write · ea9b9907
      Nick Piggin committed
      Introduce fuse_perform_write.  With fusexmp (a passthrough filesystem), large
      (1MB) writes into a backing tmpfs filesystem are sped up by almost 4 times
      (256MB/s vs 71MB/s).
      
      [mszeredi@suse.cz]:
      
       - split into smaller functions
       - testing
       - duplicate generic_file_aio_write(), so that there's no need to add a
         new ->perform_write() a_op.  Comment from hch.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: clean up setting i_size in write · 854512ec
      Miklos Szeredi committed
      Extract common code for setting i_size in write functions into a common
      helper.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: support writable mmap · 3be5a52b
      Miklos Szeredi committed
      Quoting Linus (3 years ago, FUSE inclusion discussions):
      
        "User-space filesystems are hard to get right. I'd claim that they
         are almost impossible, unless you limit them somehow (shared
         writable mappings are the nastiest part - if you don't have those,
         you can reasonably limit your problems by limiting the number of
         dirty pages you accept through normal "write()" calls)."
      
      Instead of attempting the impossible, I've just waited for the dirty page
      accounting infrastructure to materialize (thanks to Peter Zijlstra and
      others).  This nicely solved the biggest problem: limiting the number of pages
      used for write caching.
      
      Some small details remained, however, which this largish patch attempts to
      address.  It provides a page writeback implementation for fuse, which is
      completely safe against VM related deadlocks.  Performance may not be very
      good for certain usage patterns, but generally it should be acceptable.
      
      It has been tested extensively with fsx-linux and bash-shared-mapping.
      
      Fuse page writeback design
      --------------------------
      
      fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM.
      It copies the contents of the original page, and queues a WRITE request to the
      userspace filesystem using this temp page.
      
      The writeback is finished instantly from the MM's point of view: the page is
      removed from the radix trees, and the PageDirty and PageWriteback flags are
      cleared.
      
      For the duration of the actual write, the NR_WRITEBACK_TEMP counter is
      incremented.  The per-bdi writeback count is not decremented until the actual
      write completes.
      
      On dirtying the page, fuse waits for a previous write to finish before
      proceeding.  This ensures that only one temporary page is in use at a
      time for each cached page.
      
      This approach is wasteful in both memory and CPU bandwidth, so why is this
      complication needed?
      
      The basic problem is that there can be no guarantee about the time in which
      the userspace filesystem will complete a write.  It may be buggy or even
      malicious, and fail to complete WRITE requests.  We don't want unrelated parts
      of the system to grind to a halt in such cases.
      
      Also a filesystem may need additional resources (particularly memory) to
      complete a WRITE request.  There's a great danger of a deadlock if that
      allocation may wait for the writepage to finish.
      
      Currently there are several cases where the kernel can block on page
      writeback:
      
        - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER
        - page migration
        - throttle_vm_writeout (through NR_WRITEBACK)
        - sync(2)
      
      Of course in some cases (fsync, msync) we explicitly want to allow blocking.
      So for these cases new code has to be added to fuse, since the VM is not
      tracking writeback pages for us any more.
      
      As an extra safety measure, the maximum dirty ratio allocated to a single
      fuse filesystem is set to 1% by default.  This way one (or several) buggy or
      malicious fuse filesystems cannot slow down the rest of the system by hogging
      dirty memory.
      
      With appropriate privileges, this limit can be raised through
      '/sys/class/bdi/<bdi>/max_ratio'.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: bdi: expose the BDI object in sysfs for FUSE · b6f2fcbc
      Miklos Szeredi committed
      Register FUSE's backing_dev_info under sysfs with the name "fuse-MAJOR:MINOR"
      
      Make the fuse control filesystem use s_dev instead of a fuse specific ID.
      This makes it easier to match directories under /sys/fs/fuse/connections/ with
      directories under /sys/class/bdi, and with actual mounts.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
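Given the "fuse-MAJOR:MINOR" naming stated in this commit message, the sysfs path of the `max_ratio` tunable for a mount can be derived from its device number. The sketch below is a userspace illustration for Linux with glibc. `bdi_max_ratio_path()` is a hypothetical helper, and the directory name follows the commit message only, so other kernel versions may name the BDI directory differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/sysmacros.h>  /* major(), minor(), makedev() (glibc) */

/* Build "/sys/class/bdi/fuse-MAJOR:MINOR/max_ratio" for a device number.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails.
 * The "fuse-" prefix is taken from the commit message, not verified
 * against any particular kernel version. */
int bdi_max_ratio_path(dev_t dev, char *buf, size_t len)
{
    assert(buf != NULL);

    int n = snprintf(buf, len, "/sys/class/bdi/fuse-%u:%u/max_ratio",
                     major(dev), minor(dev));
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

The device number would typically come from the `st_dev` field returned by `stat()` on a path inside the mount, which matches the commit's use of `s_dev` to tie mounts, control-filesystem directories, and BDI directories together.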
  13. 25 Apr, 2008 1 commit
  14. 24 Feb, 2008 1 commit
  15. 09 Feb, 2008 1 commit
  16. 08 Feb, 2008 2 commits
  17. 07 Feb, 2008 2 commits