1. 17 Jul, 2013 5 commits
  2. 04 Jul, 2013 1 commit
  3. 29 Jun, 2013 2 commits
  4. 18 Jun, 2013 1 commit
    • fuse: hold i_mutex in fuse_file_fallocate() · 14c14414
      Maxim Patlasov committed
      Changing the size of a file on the server and the local update
      (fuse_write_update_size) should always be protected by inode->i_mutex.
      Otherwise a race like this is possible:
      
      1. Process 'A' calls fallocate(2) to extend the file (~FALLOC_FL_KEEP_SIZE).
      fuse_file_fallocate() sends a FUSE_FALLOCATE request to the server.
      2. Process 'B' calls ftruncate(2), shrinking the file. fuse_do_setattr()
      sends a shrinking FUSE_SETATTR request to the server and updates the local
      i_size by i_size_write(inode, outarg.attr.size).
      3. Process 'A' resumes execution of fuse_file_fallocate() and calls
      fuse_write_update_size(inode, offset + length). But 'offset + length' was
      made obsolete by the ftruncate from the previous step.
      
      Changed in v2 (thanks Brian and Anand for suggestions):
       - made relation between mutex_lock() and fuse_set_nowrite(inode) more
         explicit and clear.
       - updated patch description to use ftruncate(2) in example
      Signed-off-by: Maxim V. Patlasov <MPatlasov@parallels.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
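The three-step race in this message can be replayed deterministically in userspace. The sketch below is only an illustration: `size_state`, `replay_unlocked` and `replay_serialized` are hypothetical names, and the grow-only update is an assumption about how the local size update behaves after an extending operation. It is not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two sizes involved in the race. */
struct size_state {
    uint64_t server_size; /* file size as the userspace server sees it */
    uint64_t local_size;  /* cached i_size on the kernel side */
};

/* Assumed grow-only local update, run after an extending operation. */
void update_size_after_extend(struct size_state *s, uint64_t new_end)
{
    if (new_end > s->local_size)
        s->local_size = new_end;
}

/* No lock: the truncate lands between fallocate's request and its local update. */
struct size_state replay_unlocked(uint64_t start, uint64_t falloc_end,
                                  uint64_t trunc_size)
{
    struct size_state s = { start, start };

    /* Step 1: process A's fallocate request extends the file on the server. */
    if (falloc_end > s.server_size)
        s.server_size = falloc_end;

    /* Step 2: process B's truncate shrinks the server file and the local size. */
    s.server_size = trunc_size;
    s.local_size = trunc_size;

    /* Step 3: process A resumes and applies its now stale 'offset + length'. */
    update_size_after_extend(&s, falloc_end);
    return s;
}

/* With a lock held across request and update, each operation runs as one unit. */
struct size_state replay_serialized(uint64_t start, uint64_t falloc_end,
                                    uint64_t trunc_size)
{
    struct size_state s = { start, start };

    /* Process A: request and local update, uninterrupted. */
    if (falloc_end > s.server_size)
        s.server_size = falloc_end;
    update_size_after_extend(&s, falloc_end);

    /* Process B: request and local update, uninterrupted. */
    s.server_size = trunc_size;
    s.local_size = trunc_size;
    return s;
}
```

With a start size of 0, an fallocate to 8192 and a truncate to 100, the unlocked replay ends with the server at 100 and the local size at 8192, while the serialized replay leaves both at 100. The opposite serialized order would leave both at 8192; either way the two sizes agree.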
  5. 03 Jun, 2013 3 commits
    • fuse: fix alignment in short read optimization for async_dio · e5c5f05d
      Maxim Patlasov committed
      The bug was introduced with async_dio feature: trying to optimize short reads,
      we cut number-of-bytes-to-read to i_size boundary. Hence the following example:
      
      	truncate --size=300 /mnt/file
      	dd if=/mnt/file of=/dev/null iflag=direct
      
      led to a FUSE_READ request of 300 bytes. This turned out to be a problem
      for userspace fuse implementations that rely on the assumption that kernel
      fuse does not change the alignment of requests from the client FS.
      
      The patch turns off the optimization if async_dio is disabled. And, if it's
      enabled, the patch fixes adjustment of number-of-bytes-to-read to preserve
      alignment.
      
      Note that we cannot throw out the short read optimization entirely, because
      otherwise a direct read of a huge size issued on a tiny file would generate
      a huge number of fuse requests, most of which would be ACKed by userspace
      with zero bytes read.
      Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
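The difference between cutting a read at i_size and cutting it at an aligned boundary can be shown with a little arithmetic. This is a sketch under assumptions: the function names are hypothetical, and the 512-byte granule in the usage note is chosen only to match a small `dd` block size. The actual granule used by the kernel fix is not stated in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align; align must be a power of two. */
uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Short-read clamp that cuts the request exactly at i_size. */
uint64_t clamp_read_naive(uint64_t offset, uint64_t count, uint64_t i_size)
{
    if (offset >= i_size)
        return 0;
    uint64_t left = i_size - offset;
    return count < left ? count : left;
}

/* Short-read clamp that keeps the request length a multiple of align. */
uint64_t clamp_read_aligned(uint64_t offset, uint64_t count, uint64_t i_size,
                            uint64_t align)
{
    if (offset >= i_size)
        return 0;
    uint64_t left = round_up_pow2(i_size - offset, align);
    return count < left ? count : left;
}
```

For a 300-byte file and a 512-byte direct read at offset 0, the naive clamp yields 300 bytes, which is the unaligned request described above, while the aligned clamp keeps the request at 512. A 1 MiB read of the same file is still cut down to a single 512-byte request, so the optimization is preserved.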
    • fuse: return -EIOCBQUEUED from fuse_direct_IO() for all async requests · c9ecf989
      Brian Foster committed
      If request submission fails for an async request (i.e.,
      get_user_pages() returns -ERESTARTSYS), we currently skip the
      -EIOCBQUEUED return and drop into wait_for_sync_kiocb() forever.
      
      Avoid this by always returning -EIOCBQUEUED for async requests. If
      an error occurs, the error is passed into fuse_aio_complete(),
      returned via aio_complete() and thus propagated to userspace via
      io_getevents().
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Maxim Patlasov <MPatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: fix readdirplus Oops in fuse_dentry_revalidate · 28420dad
      Miklos Szeredi committed
      Fix bug introduced by commit 4582a4ab "FUSE: Adapt readdirplus to application
      usage patterns".
      
      We need to check for a positive dentry; negative dentries are not added by
      readdirplus.  Secondly we need to advise the use of readdirplus on the *parent*,
      otherwise the whole thing is useless.  Thirdly all this is only relevant if
      "readdirplus_auto" mode is selected by the filesystem.
      
      We advise the use of readdirplus only if the dentry was still valid.  If we had
      to redo the lookup then there was no use in doing the -plus version.
      Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      CC: Feng Shuo <steve.shuo.feng@gmail.com>
      CC: stable@vger.kernel.org
  6. 20 May, 2013 2 commits
    • fuse: update inode size and invalidate attributes on fallocate · bee6c307
      Brian Foster committed
      An fallocate request without FALLOC_FL_KEEP_SIZE set can extend the
      size of a file. Update the inode size after a successful fallocate.
      
      Also invalidate the inode attributes after a successful fallocate
      to ensure we pick up the latest attribute values (i.e., i_blocks).
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: truncate pagecache range on hole punch · 3634a632
      Brian Foster committed
      fuse supports hole punch via the fallocate() FALLOC_FL_PUNCH_HOLE
      interface. When a hole punch is passed through, the page cache
      is not cleared and thus allows reading stale data from the cache.
      
      This is easily demonstrable (using FOPEN_KEEP_CACHE) by reading a
      smallish random data file into cache, punching a hole and creating
      a copy of the file. Drop caches or remount and observe that the
      original file no longer matches the file copied after the hole
      punch. The original file contains a zeroed range and the latter
      file contains stale data.
      
      Protect against writepage requests in progress and punch out the
      associated page cache range after a successful client fs hole
      punch.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  7. 15 May, 2013 1 commit
  8. 08 May, 2013 1 commit
  9. 01 May, 2013 1 commit
  10. 18 Apr, 2013 7 commits
    • fuse: truncate file if async dio failed · efb9fa9e
      Maxim Patlasov committed
      The patch improves error handling in fuse_direct_IO(): if we successfully
      submitted several fuse requests on behalf of a synchronous direct write
      extending the file and some of them failed, let's do our best to clean up.
      
      Changed in v2: reuse fuse_do_setattr(). Thanks to Brian for suggestion.
      Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: optimize short direct reads · 439ee5f0
      Maxim Patlasov committed
      If the user requested a direct read beyond EOF, we can skip sending fuse
      requests for positions beyond EOF because userspace would ACK them with zero
      bytes read anyway. We can trust i_size in fuse_direct_IO for such cases
      because it's called from fuse_file_aio_read() and the latter updates fuse
      attributes including i_size.
      Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: enable asynchronous processing direct IO · bcba24cc
      Maxim Patlasov committed
      In case of a synchronous DIO request (i.e. read(2) or write(2) for a file
      opened with O_DIRECT), the patch submits fuse requests asynchronously, but
      waits for their completion before returning from fuse_direct_IO().
      
      In case of an asynchronous DIO request (i.e. libaio io_submit() or a file
      opened with O_DIRECT), the patch submits fuse requests asynchronously and
      returns -EIOCBQUEUED immediately.
      
      The only special case is async DIO extending the file. Here the patch falls
      back to the old behaviour because we can't return -EIOCBQUEUED and update
      i_size later without i_mutex held. And we have no method to wait on real
      async I/O requests.
      
      The patch also cleans up __fuse_direct_write(): it's better to update i_size
      in its callers. Thanks to Brian for the suggestion.
      Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: make fuse_direct_io() aware about AIO · 36cf66ed
      Maxim Patlasov committed
      The patch implements passing "struct fuse_io_priv *io" down the stack to
      fuse_send_read/write, where it is used to submit requests asynchronously.
      io->async==0 designates synchronous processing.
      
      The non-trivial part of the patch is the change in fuse_direct_io():
      resources like fuse requests and user pages cannot be released immediately
      in the async case.
      Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: add support of async IO · 01e9d11a
      Maxim Patlasov committed
      The patch implements a framework to process an IO request asynchronously. The
      idea is to associate several fuse requests with a single kiocb by means of
      fuse_io_priv structure. The structure plays the same role for FUSE as 'struct
      dio' for direct-io.c.
      
      The framework is supposed to be used like this:
       - someone (who wants to process an IO asynchronously) allocates fuse_io_priv
         and initializes it setting 'async' field to non-zero value.
       - as soon as fuse request is filled, it can be submitted (in non-blocking way)
         by fuse_async_req_send()
       - when all submitted requests are ACKed by userspace, io->reqs drops to zero
         triggering aio_complete()
      
      In case of IO initiated by libaio, aio_complete() will finish processing the
      same way as in case of dio_complete() calling aio_complete(). But the
      framework may be also used for internal FUSE use when initial IO request
      was synchronous (from user perspective), but it's beneficial to process it
      asynchronously. Then the caller should wait on kiocb explicitly and
      aio_complete() will wake the caller up.
      Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
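The usage pattern in this message (several requests tied to one IO, completion fired when the outstanding count reaches zero) can be modelled in a few lines of userspace C. Everything here is a hypothetical sketch: `io_priv_sketch` and its helpers are illustrative names, and the extra reference held by the submitter until it has finished submitting is an assumption made so that completion cannot fire early. It is not the kernel's fuse_io_priv.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one IO that is split into several requests. */
struct io_priv_sketch {
    int      async;       /* non-zero: process requests asynchronously */
    unsigned reqs;        /* outstanding references; completion fires at zero */
    long     bytes;       /* bytes acknowledged so far */
    int      err;         /* first error reported, 0 if none */
    int      completions; /* number of times completion has fired */
    long     result;      /* value handed to the completion step */
};

/* Start with one reference owned by the submitter (an assumption). */
void io_init(struct io_priv_sketch *io)
{
    io->async = 1;
    io->reqs = 1;
    io->bytes = 0;
    io->err = 0;
    io->completions = 0;
    io->result = 0;
}

/* Drop one reference; the last drop stands in for aio_complete(). */
static void io_put(struct io_priv_sketch *io)
{
    assert(io->reqs > 0);
    if (--io->reqs == 0) {
        io->result = io->err ? io->err : io->bytes;
        io->completions++;
    }
}

/* Account for one request handed off without blocking. */
void io_submit(struct io_priv_sketch *io)
{
    io->reqs++;
}

/* The submitter has queued everything and releases its own reference. */
void io_submit_done(struct io_priv_sketch *io)
{
    io_put(io);
}

/* Userspace acknowledged one request with an error or a byte count. */
void io_ack(struct io_priv_sketch *io, int err, long nbytes)
{
    if (err && !io->err)
        io->err = err;
    else if (!err)
        io->bytes += nbytes;
    io_put(io);
}
```

Three submitted requests acknowledged with 100 bytes each complete the IO exactly once with a result of 300, and only on the last acknowledgement. If any request reports an error, that error becomes the result instead of the byte count.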
    • fuse: move fuse_release_user_pages() up · 187c5c36
      Maxim Patlasov committed
      fuse_release_user_pages() will be indirectly used by fuse_send_read/write
      in future patches.
      Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • fuse: optimize wake_up · 3c18ef81
      Miklos Szeredi committed
      Normally blocked_waitq will be inactive, so optimize this case.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  11. 17 Apr, 2013 4 commits
  12. 10 Apr, 2013 2 commits
  13. 04 Mar, 2013 1 commit
    • fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman committed
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives, allowing simple, safe,
      well understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as its module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases, the most significant being autofs, which lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regard to the user's permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep,
      which means there is not much attack surface exposed by loading a
      filesystem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
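The naming scheme in this message amounts to prefixing the user-supplied filesystem type with "fs-" before asking for a module. A minimal userspace sketch of that string construction follows. `fs_module_alias` is a hypothetical helper, not a kernel function; it only builds the alias string and loads nothing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the module alias for a filesystem type, e.g. "autofs" -> "fs-autofs".
 * Returns 0 on success, or -1 if the alias does not fit into buf.
 */
int fs_module_alias(char *buf, size_t buflen, const char *fstype)
{
    int n = snprintf(buf, buflen, "fs-%s", fstype);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

Because every lookup carries the prefix, a user-specified string can only ever match modules that declare an "fs-" alias. That is also how a filesystem name such as autofs can resolve to a module with a different name, such as autofs4.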
  14. 28 Feb, 2013 1 commit
  15. 26 Feb, 2013 1 commit
  16. 23 Feb, 2013 1 commit
  17. 07 Feb, 2013 1 commit
    • fuse: allow control of adaptive readdirplus use · 634734b6
      Eric Wong committed
      For some filesystems (e.g. GlusterFS), the costs of performing a
      normal readdir and a readdirplus are identical.  Since adaptively
      using readdirplus has no benefit for those systems, give
      users/filesystems the option to control adaptive readdirplus use.
      
      v2 of this patch incorporates Miklos's suggestion to simplify the code,
      as well as improving consistency of macro names and documentation.
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
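One way to picture the control described here is as two switches: one that enables readdirplus at all, and one that lets the adaptive heuristic decide per call. The sketch below assumes that shape. The struct, field and function names are hypothetical, and the heuristic itself is reduced to a boolean input, so it illustrates the decision only and is not the kernel logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-mount settings negotiated with the filesystem. */
struct rdplus_conf_sketch {
    bool do_readdirplus;   /* filesystem wants readdirplus at all */
    bool readdirplus_auto; /* let the adaptive heuristic decide per call */
};

/*
 * Decide whether to issue a readdirplus for this directory read.
 * adaptive_wants_plus stands in for the outcome of the adaptive heuristic.
 */
bool use_readdirplus_sketch(const struct rdplus_conf_sketch *c,
                            bool adaptive_wants_plus)
{
    if (!c->do_readdirplus)
        return false;              /* feature switched off entirely */
    if (!c->readdirplus_auto)
        return true;               /* always use the -plus version */
    return adaptive_wants_plus;    /* adaptive mode: follow the heuristic */
}
```

A filesystem for which both calls cost the same could then enable readdirplus without the adaptive mode and always get the -plus version, regardless of what the heuristic would have chosen.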
  18. 04 Feb, 2013 3 commits
  19. 01 Feb, 2013 2 commits