1. 30 6月, 2016 2 次提交
    • A
      fuse: improve aio directIO write performance for size extending writes · 7879c4e5
      Ashish Sangwan 提交于
      While sending the blocking directIO in fuse, the write request is broken
      into sub-requests, each of default size 128k and all the requests are sent
      in non-blocking background mode if async_dio mode is supported by libfuse.
      The process which issue the write wait for the completion of all the
      sub-requests. Sending multiple requests parallely gives a chance to perform
      parallel writes in the user space fuse implementation if it is
      multi-threaded and hence improves the performance.
      
      When there is a size extending aio dio write, we switch to blocking mode so
      that we can properly update the size of the file after completion of the
      writes. However, in this situation all the sub-requests are sent in
      serialized manner where the next request is sent only after receiving the
      reply of the current request. Hence the multi-threaded user space
      implementation is not utilized properly.
      
      This patch changes the size extending aio dio behavior to exactly follow
      blocking dio. For multi threaded fuse implementation having 10 threads and
      using buffer size of 64MB to perform async directIO, we are getting double
      the speed.
      Signed-off-by: NAshish Sangwan <ashishsangwan2@gmail.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      7879c4e5
    • M
      fuse: serialize dirops by default · 5c672ab3
      Miklos Szeredi 提交于
      Negotiate with userspace filesystems whether they support parallel readdir
      and lookup.  Disable parallelism by default for fear of breaking fuse
      filesystems.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 9902af79 ("parallel lookups: actual switch to rwsem")
      Fixes: d9b3dbdc ("fuse: switch to ->iterate_shared()")
      5c672ab3
  2. 14 3月, 2016 1 次提交
    • S
      fuse: Add reference counting for fuse_io_priv · 744742d6
      Seth Forshee 提交于
      The 'reqs' member of fuse_io_priv serves two purposes. First is to track
      the number of oustanding async requests to the server and to signal that
      the io request is completed. The second is to be a reference count on the
      structure to know when it can be freed.
      
      For sync io requests these purposes can be at odds.  fuse_direct_IO() wants
      to block until the request is done, and since the signal is sent when
      'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to
      use the object after the userspace server has completed processing
      requests. This leads to some handshaking and special casing that it
      needlessly complicated and responsible for at least one race condition.
      
      It's much cleaner and safer to maintain a separate reference count for the
      object lifecycle and to let 'reqs' just be a count of outstanding requests
      to the userspace server. Then we can know for sure when it is safe to free
      the object without any handshaking or special cases.
      
      The catch here is that most of the time these objects are stack allocated
      and should not be freed. Initializing these objects with a single reference
      that is never released prevents accidental attempts to free the objects.
      
      Fixes: 9d5722b7 ("fuse: handle synchronous iocbs internally")
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: NSeth Forshee <seth.forshee@canonical.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      744742d6
  3. 10 11月, 2015 1 次提交
  4. 01 7月, 2015 12 次提交
  5. 14 3月, 2015 1 次提交
  6. 06 1月, 2015 1 次提交
  7. 12 12月, 2014 4 次提交
    • M
      fuse: introduce fuse_simple_request() helper · 7078187a
      Miklos Szeredi 提交于
      The following pattern is repeated many times:
      
      	req = fuse_get_req_nopages(fc);
      	/* Initialize req->(in|out).args */
      	fuse_request_send(fc, req);
      	err = req->out.h.error;
      	fuse_put_request(req);
      
      Create a new replacement helper:
      
      	/* Initialize args */
      	err = fuse_simple_request(fc, &args);
      
      In addition to reducing the code size, this will ease moving from the
      complex arg-based to a simpler page-based I/O on the fuse device.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      7078187a
    • M
      fuse: reduce max out args · f704dcb5
      Miklos Szeredi 提交于
      The third out-arg is never actually used.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      f704dcb5
    • M
      fuse: hold inode instead of path after release · baebccbe
      Miklos Szeredi 提交于
      path_put() in release could trigger a DESTROY request in fuseblk.  The
      possible deadlock was worked around by doing the path_put() with
      schedule_work().
      
      This complexity isn't needed if we just hold the inode instead of the path.
      Since we now flush all requests before destroying the super block we can be
      sure that all held inodes will be dropped.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      baebccbe
    • M
      fuse: flush requests on umount · 580640ba
      Miklos Szeredi 提交于
      Use fuse_abort_conn() instead of fuse_conn_kill() in fuse_put_super().
      This flushes and aborts requests still on any queues.  But since we've
      already reset fc->connected, those requests would not be useful anyway and
      would be flushed when the fuse device is closed.
      
      Next patches will rely on requests being flushed before the superblock is
      destroyed.
      
      Use fuse_abort_conn() in cuse_process_init_reply() too, since it makes no
      difference there, and we can get rid of fuse_conn_kill().
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      580640ba
  8. 07 5月, 2014 1 次提交
  9. 28 4月, 2014 4 次提交
  10. 02 4月, 2014 3 次提交
  11. 23 1月, 2014 2 次提交
    • A
      fuse: support clients that don't implement 'open' · 7678ac50
      Andrew Gallagher 提交于
      open/release operations require userspace transitions to keep track
      of the open count and to perform any FS-specific setup.  However,
      for some purely read-only FSs which don't need to perform any setup
      at open/release time, we can avoid the performance overhead of
      calling into userspace for open/release calls.
      
      This patch adds the necessary support to the fuse kernel modules to prevent
      open/release operations from hitting in userspace. When the client returns
      ENOSYS, we avoid sending the subsequent release to userspace, and also
      remember this so that future opens also don't trigger a userspace
      operation.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      7678ac50
    • A
      fuse: don't invalidate attrs when not using atime · 451418fc
      Andrew Gallagher 提交于
      Various read operations (e.g. readlink, readdir) invalidate the cached
      attrs for atime changes.  This patch adds a new function
      'fuse_invalidate_atime', which checks for a read-only super block and
      avoids the attr invalidation in that case.
      Signed-off-by: NAndrew Gallagher <andrewjcg@fb.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      451418fc
  12. 25 10月, 2013 2 次提交
  13. 01 10月, 2013 2 次提交
    • M
      fuse: writepages: handle same page rewrites · 8b284dc4
      Miklos Szeredi 提交于
      As Maxim Patlasov pointed out, it's possible to get a dirty page while it's
      copy is still under writeback, despite fuse_page_mkwrite() doing its thing
      (direct IO).
      
      This could result in two concurrent write request for the same offset, with
      data corruption if they get mixed up.
      
      To prevent this, fuse needs to check and delay such writes.  This
      implementation does this by:
      
       1. check if page is still under writeout, if so create a new, single page
          secondary request for it
      
       2. chain this secondary request onto the in-flight request
      
       2/a. if a seconday request for the same offset was already chained to the
          in-flight request, then just copy the contents of the page and discard
          the new secondary request.  This makes sure that for each page will
          have at most two requests associated with it
      
       3. when the in-flight request finished, send off all secondary requests
          chained onto it
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      8b284dc4
    • M
      fuse: readdirplus: fix RCU walk · 6314efee
      Miklos Szeredi 提交于
      Doing dput(parent) is not valid in RCU walk mode.  In RCU mode it would
      probably be okay to update the parent flags, but it's actually not
      necessary most of the time...
      
      So only set the FUSE_I_ADVISE_RDPLUS flag on the parent when the entry was
      recently initialized by READDIRPLUS.
      
      This is achieved by setting FUSE_I_INIT_RDPLUS on entries added by
      READDIRPLUS and only dropping out of RCU mode if this flag is set.
      FUSE_I_INIT_RDPLUS is cleared once the FUSE_I_ADVISE_RDPLUS flag is set in
      the parent.
      Reported-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Cc: stable@vger.kernel.org
      6314efee
  14. 03 9月, 2013 1 次提交
    • M
      fuse: hotfix truncate_pagecache() issue · 06a7c3c2
      Maxim Patlasov 提交于
      The way how fuse calls truncate_pagecache() from fuse_change_attributes()
      is completely wrong. Because, w/o i_mutex held, we never sure whether
      'oldsize' and 'attr->size' are valid by the time of execution of
      truncate_pagecache(inode, oldsize, attr->size). In fact, as soon as we
      released fc->lock in the middle of fuse_change_attributes(), we completely
      loose control of actions which may happen with given inode until we reach
      truncate_pagecache. The list of potentially dangerous actions includes
      mmap-ed reads and writes, ftruncate(2) and write(2) extending file size.
      
      The typical outcome of doing truncate_pagecache() with outdated arguments
      is data corruption from user point of view. This is (in some sense)
      acceptable in cases when the issue is triggered by a change of the file on
      the server (i.e. externally wrt fuse operation), but it is absolutely
      intolerable in scenarios when a single fuse client modifies a file without
      any external intervention. A real life case I discovered by fsx-linux
      looked like this:
      
      1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends
      FUSE_SETATTR to the server synchronously, but before getting fc->lock ...
      2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP
      to the server synchronously, then calls fuse_change_attributes(). The
      latter updates i_size, releases fc->lock, but before comparing oldsize vs
      attr->size..
      3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and
      updating attributes and i_size, but now oldsize is equal to
      outarg.attr.size because i_size has just been updated (step 2). Hence,
      fuse_do_setattr() returns w/o calling truncate_pagecache().
      4. As soon as ftruncate(2) completes, the user extends file size by
      write(2) making a hole in the middle of file, then reads data from the hole
      either by read(2) or mmap-ed read. The user expects to get zero data from
      the hole, but gets stale data because truncate_pagecache() is not executed
      yet.
      
      The scenario above illustrates one side of the problem: not truncating the
      page cache even though we should. Another side corresponds to truncating
      page cache too late, when the state of inode changed significantly.
      Theoretically, the following is possible:
      
      1. As in the previous scenario fuse_dentry_revalidate() discovered that
      i_size changed (due to our own fuse_do_setattr()) and is going to call
      truncate_pagecache() for some 'new_size' it believes valid right now. But
      by the time that particular truncate_pagecache() is called ...
      2. fuse_do_setattr() returns (either having called truncate_pagecache() or
      not -- it doesn't matter).
      3. The file is extended either by write(2) or ftruncate(2) or fallocate(2).
      4. mmap-ed write makes a page in the extended region dirty.
      
      The result will be the lost of data user wrote on the fourth step.
      
      The patch is a hotfix resolving the issue in a simplistic way: let's skip
      dangerous i_size update and truncate_pagecache if an operation changing
      file size is in progress. This simplistic approach looks correct for the
      cases w/o external changes. And to handle them properly, more sophisticated
      and intrusive techniques (e.g. NFS-like one) would be required. I'd like to
      postpone it until the issue is well discussed on the mailing list(s).
      
      Changed in v2:
       - improved patch description to cover both sides of the issue.
      Signed-off-by: NMaxim Patlasov <mpatlasov@parallels.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Cc: stable@vger.kernel.org
      06a7c3c2
  15. 01 5月, 2013 1 次提交
  16. 18 4月, 2013 2 次提交