Commits · 7118883b44b8edfea732aadeb0d4424da3f152b2 · openeuler / Kernel

01 Oct, 2018 · 4 commits

fuse: use mtime for readdir cache verification · 7118883b

Committed by Miklos Szeredi on Oct 01, 2018

Store the modification time of the directory in the cache, obtained before
starting to fill the cache.

When reading the cache, verify that the directory hasn't changed, by
checking if current modification time is the same as the one stored in the
cache.

This only needs to be done when the current file position is at the
beginning of the directory, as mandated by POSIX.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
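
As a rough illustration of the check described in this commit message, the following userspace C sketch models the cache state. The struct rdc_state and the helper rdc_may_use() are hypothetical names chosen for this example, not the kernel's own, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace model of the per-directory readdir cache state. */
struct rdc_state {
	bool cached;            /* cache has been completely filled */
	struct timespec mtime;  /* directory mtime sampled before filling began */
};

static bool timespec_same(const struct timespec *a, const struct timespec *b)
{
	return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/*
 * Decide whether a cached listing may be served.
 * The mtime comparison only happens at position 0; a reader that is already
 * in the middle of the directory keeps using the cache it started with.
 * On a mismatch the cache is marked invalid, so the caller has to refill it.
 */
static bool rdc_may_use(struct rdc_state *rdc, int64_t pos,
			const struct timespec *cur_mtime)
{
	assert(rdc && cur_mtime);
	if (!rdc->cached)
		return false;
	if (pos == 0 && !timespec_same(&rdc->mtime, cur_mtime)) {
		rdc->cached = false;
		return false;
	}
	return true;
}
```

A caller would sample the directory's current mtime, call rdc_may_use() with the stream position, and fall back to an uncached read whenever it returns false.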

fuse: add readdir cache version · 3494927e

Committed by Miklos Szeredi on Oct 01, 2018

Allow the cache to be invalidated when page(s) have gone missing. In this
case increment the version of the cache and reset to an empty state.

Add a version number to the directory stream in struct fuse_file as well,
indicating the version of the cache it's supposed to be reading. If the
cache version doesn't match the stream's version, then reset the stream to
the beginning of the cache.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
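
The version handshake described above can be sketched in plain C. This is a simplified model under stated assumptions: struct dir_cache, struct dir_stream and both helpers are hypothetical names for this example, and the real code also has to deal with locking and page management.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: shared per-directory cache state. */
struct dir_cache {
	uint64_t version;  /* bumped on every invalidation */
	size_t size;       /* bytes of valid cached entries */
};

/* Hypothetical model: per-open-file directory stream state. */
struct dir_stream {
	uint64_t version;  /* cache version this stream was reading */
	size_t cache_off;  /* offset of the stream within the cache */
};

/* Invalidate, for example after noticing that a cached page has gone missing. */
static void cache_reset(struct dir_cache *c)
{
	assert(c);
	c->version++;
	c->size = 0;
}

/* Before reading: a stale stream restarts from the beginning of the cache. */
static void stream_sync(struct dir_stream *s, const struct dir_cache *c)
{
	assert(s && c);
	if (s->version != c->version) {
		s->version = c->version;
		s->cache_off = 0;
	}
}
```

Keeping the version in both places means an invalidation never has to find and update every open stream; each stream notices the mismatch on its next read.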

fuse: allow using readdir cache · 5d7bc7e8

Committed by Miklos Szeredi on Oct 01, 2018

The cache is only used if it's completed, not while it's still being
filled; this constraint could be lifted later, if it turns out to be
useful.

Introduce state in struct fuse_file that indicates the position within the
cache.  After a seek, reset the position to the beginning of the cache and
search the cache for the current position.  If the current position is not
found in the cache, then fall back to uncached readdir.

It can also happen that page(s) disappear from the cache, in which case we
must also fall back to uncached readdir.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: allow caching readdir · 69e34551

Committed by Miklos Szeredi on Oct 01, 2018

This patch just adds the cache filling functions, which are invoked if
FOPEN_CACHE_DIR flag is set in the OPENDIR reply.

Cache reading and cache invalidation are added by subsequent patches.

The directory cache uses the page cache.  Directory entries are packed into
a page in the same format as in the READDIR reply.  A page only contains
whole entries, the space at the end of the page is cleared.  The page is
locked while being modified.

Multiple parallel readdirs on the same directory can fill the cache; the
only constraint is that continuity must be maintained (d_off of last entry
points to position of current entry).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
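
To make the packing and continuity rules concrete, here is a minimal userspace sketch. It assumes a 4096-byte page and a record header modeled on the READDIR reply entry (inode, next offset, name length, type, then the name padded to 8 bytes). The names sketch_dirent, sketch_cache and cache_add() are hypothetical, and page locking is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Fixed-size entry header, modeled on the READDIR reply record. */
struct sketch_dirent {
	uint64_t ino;
	uint64_t off;      /* directory position of the entry after this one */
	uint32_t namelen;
	uint32_t type;
	/* name bytes follow, padded so the whole record is 8-byte aligned */
};

struct sketch_cache {
	uint8_t *pages;    /* npages * SKETCH_PAGE_SIZE bytes, zero-initialised */
	size_t npages;
	size_t size;       /* end of valid data, in bytes from the cache start */
	uint64_t pos;      /* directory position expected for the next entry */
};

static size_t record_size(size_t namelen)
{
	return (sizeof(struct sketch_dirent) + namelen + 7u) & ~(size_t)7u;
}

/*
 * Append the entry found at directory position `pos`.
 * Returns false when the entry would break continuity or the cache is full;
 * the cache is left unchanged in both cases.
 */
static bool cache_add(struct sketch_cache *c, uint64_t pos,
		      const struct sketch_dirent *hdr, const char *name)
{
	size_t reclen = record_size(hdr->namelen);
	size_t start = c->size;
	size_t in_page = start % SKETCH_PAGE_SIZE;

	assert(reclen <= SKETCH_PAGE_SIZE);
	if (pos != c->pos)
		return false;
	/* Never split a record: skip the (already zeroed) tail of the page. */
	if (in_page + reclen > SKETCH_PAGE_SIZE)
		start += SKETCH_PAGE_SIZE - in_page;
	if (start + reclen > c->npages * SKETCH_PAGE_SIZE)
		return false;

	memcpy(c->pages + start, hdr, sizeof(*hdr));
	memcpy(c->pages + start + sizeof(*hdr), name, hdr->namelen);
	c->size = start + reclen;
	c->pos = hdr->off;
	return true;
}
```

Because every filler must present the position the cache expects next, several readers can take turns appending without producing gaps or duplicates; a reader whose entry does not match simply does not contribute.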

28 Sep, 2018 · 6 commits

fuse: split out readdir.c · d123d8e1

Committed by Miklos Szeredi on Sep 28, 2018

Directory reading code is about to grow larger, so split it out from dir.c
into a new source file.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: Use hash table to link processing request · be2ff42c

Committed by Kirill Tkhai on Sep 11, 2018

We noticed a performance bottleneck in FUSE when running our Virtuozzo storage
over RDMA. On some types of workload we observe 20% of the time spent in
request_find() in the profiler.  This function iterates over a long list of
requests, and it scales badly.

The patch introduces a hash table to reduce the number of iterations we do
in this function. The hash generating algorithm is taken from the hash_add()
function, while a 256-line table is used to store pending requests.  This
fixes the problem and improves performance.

Reported-by: Alexey Kuznetsov <kuznet@virtuozzo.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
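
The idea of replacing one long list with 256 short ones can be sketched as follows. This is an illustrative userspace model, not the kernel code: struct req, struct pqueue and the helpers are hypothetical names, and the multiplicative hash constant is an assumption standing in for whatever the kernel's hash helper computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PQ_HASH_BITS 8
#define PQ_HASH_SIZE (1u << PQ_HASH_BITS)  /* 256 buckets */

struct req {
	uint64_t unique;   /* request ID */
	struct req *next;  /* next request in the same bucket */
};

struct pqueue {
	struct req *bucket[PQ_HASH_SIZE];
};

/* Multiplicative hash keeping the top PQ_HASH_BITS bits (illustrative choice). */
static unsigned int req_hash(uint64_t unique)
{
	return (unsigned int)((unique * 0x61C8864680B583EBull)
			      >> (64 - PQ_HASH_BITS));
}

static void pq_add(struct pqueue *pq, struct req *r)
{
	unsigned int h;

	assert(pq && r);
	h = req_hash(r->unique);
	r->next = pq->bucket[h];
	pq->bucket[h] = r;
}

/* Only the requests that share a bucket are visited, not the whole queue. */
static struct req *pq_find(struct pqueue *pq, uint64_t unique)
{
	struct req *r;

	for (r = pq->bucket[req_hash(unique)]; r; r = r->next)
		if (r->unique == unique)
			return r;
	return NULL;
}
```

With requests spread evenly, a lookup walks roughly 1/256 of the pending requests, which is where the reported improvement would come from.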

fuse: kill req->intr_unique · 3a5358d1

Committed by Kirill Tkhai on Sep 11, 2018

This field is not needed after the previous patch, since we can easily
convert request ID to interrupt request ID and vice versa.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: do not take fc->lock in fuse_request_send_background() · 63825b4e

Committed by Kirill Tkhai on Aug 27, 2018

Currently, we take fc->lock there only to check for fc->connected.
But this flag is changed only on connection abort, which is a very
rare operation.

So allow checking fc->connected under just fc->bg_lock and use this lock
(as well as fc->lock) when resetting fc->connected.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: introduce fc->bg_lock · ae2dffa3

Committed by Kirill Tkhai on Aug 27, 2018

To reduce contention of fc->lock, this patch introduces bg_lock for
protection of fields related to background queue. These are:
max_background, congestion_threshold, num_background, active_background,
bg_queue and blocked.

This allows the next patch to make async reads not require fc->lock, so async
reads and writes perform better when executed in parallel.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: add support for copy_file_range() · 88bc7d50

Committed by Niels de Vos on Aug 21, 2018

There are several FUSE filesystems that can implement server-side copy
or other efficient copy/duplication/clone methods. The copy_file_range()
syscall is the standard interface that users have access to while not
depending on external libraries that bypass FUSE.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

26 Jul, 2018 · 2 commits

fuse: fix initial parallel dirops · 63576c13

Committed by Miklos Szeredi on Jul 26, 2018

If parallel dirops are enabled in the FUSE_INIT reply, then the first
operation may leave fi->mutex held.

Reported-by: syzbot <syzbot+3f7b29af1baa9d0a55be@syzkaller.appspotmail.com>
Fixes: 5c672ab3 ("fuse: serialize dirops by default")
Cc: <stable@vger.kernel.org> # v4.7
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: umount should wait for all requests · b8f95e5d

Committed by Miklos Szeredi on Jul 26, 2018

fuse_abort_conn() does not guarantee that all async requests have actually
finished aborting (i.e. their ->end() function is called).  This could
actually result in still used inodes after umount.

Add a helper to wait until all requests are fully done.  This is done by
looking at the "num_waiting" counter.  When this counter drops to zero, we
can be sure that no more requests are outstanding.

Fixes: 0d8e84b0 ("fuse: simplify request abort")
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

31 May, 2018 · 1 commit

fuse: Ensure posix acls are translated outside of init_user_ns · e45b2546

Committed by Eric W. Biederman on May 04, 2018

Ensure the translation happens by failing to read or write
posix acls when the filesystem has not indicated it supports
posix acls.

This ensures that modern cached posix acl support is available
and used when dealing with posix acls.  This is important
because only that path has the code to convert the uids and
gids in posix acls into the user namespace of a fuse filesystem.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

21 Mar, 2018 · 2 commits

fuse: Support fuse filesystems outside of init_user_ns · 8cb08329

Committed by Eric W. Biederman on Feb 21, 2018

In order to support mounts from namespaces other than init_user_ns, fuse
must translate uids and gids to/from the userns of the process servicing
requests on /dev/fuse. This patch does that, with a couple of restrictions
on the namespace:

 - The userns for the fuse connection is fixed to the namespace
   from which /dev/fuse is opened.

 - The namespace must be the same as s_user_ns.

These restrictions simplify the implementation by avoiding the need to pass
around userns references and by allowing fuse to rely on the checks in
setattr_prepare for ownership changes.  Either restriction could be relaxed
in the future if needed.

For cuse the userns used is the opener of /dev/cuse.  Semantically the cuse
support does not appear safe for unprivileged users.  Practically the
permissions on /dev/cuse only make it accessible to the global root user.
If something slips through the cracks in a user namespace the only users
who will be able to use the cuse device are those users mapped into the
user namespace.

Translation in the posix acl is updated to use the user namespace of the
filesystem.  Avoiding cases which might bypass this translation is handled
in a following change.

This change is strongly based on a similar change from Seth Forshee and
Dongsu Park.

Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: return -ECONNABORTED on /dev/fuse read after abort · 3b7008b2

Committed by Szymon Lukasz on Nov 09, 2017

Currently the userspace has no way of knowing whether the fuse
connection ended because of umount or abort via sysfs. It makes it hard
for filesystems to free the mountpoint after abort without worrying
about removing some new mount.

The patch fixes it by returning different errors when userspace reads
from /dev/fuse (-ENODEV for umount and -ECONNABORTED for abort).

Add a new capability flag FUSE_ABORT_ERROR. If set and the connection is
gone because of sysfs abort, reading from the device will return
-ECONNABORTED.

Signed-off-by: Szymon Lukasz <noh4hss@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
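
From the point of view of a userspace filesystem, the distinction could be handled roughly as in the sketch below. The enum conn_end and the helper classify_read_error() are hypothetical names for this example. The sketch assumes the FUSE_ABORT_ERROR capability was negotiated; without it, an abort cannot be told apart from an umount.

```c
#include <assert.h>
#include <errno.h>

enum conn_end {
	CONN_ALIVE,      /* the error does not mean the connection is gone */
	CONN_UNMOUNTED,  /* read failed with ENODEV */
	CONN_ABORTED     /* read failed with ECONNABORTED */
};

/* `err` is the errno value observed after a failed read() on /dev/fuse. */
static enum conn_end classify_read_error(int err)
{
	assert(err > 0);
	switch (err) {
	case ENODEV:
		return CONN_UNMOUNTED;
	case ECONNABORTED:
		return CONN_ABORTED;
	default:
		return CONN_ALIVE;
	}
}
```

A server loop would call this after read() returns -1 and, on CONN_ABORTED, know that the mount is still in place and has to be cleaned up, whereas on CONN_UNMOUNTED it should leave the mountpoint alone.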

28 Nov, 2017 · 1 commit

fs: annotate ->poll() instances · 076ccb76

Committed by Al Viro on Jul 03, 2017

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

12 Sep, 2017 · 2 commits

fuse: getattr cleanup · 5b97eeac

Committed by Miklos Szeredi on Sep 12, 2017

The refreshed argument isn't used by any caller, get rid of it.

Use a helper for just updating the inode (no need to fill in a kstat).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: honor iocb sync flags on write · e1c0eecb

Committed by Miklos Szeredi on Sep 12, 2017

If the IOCB_DSYNC flag is set, fuse_file_write_iter() does not perform a
sync.

Honor IOCB_DSYNC/IOCB_SYNC by setting O_DSYNC/O_SYNC respectively in the
flags field of the write request.

We don't need to sync data or metadata, since fuse_perform_write() does
write-through and the filesystem is responsible for updating file times.

Original patch by Vitaly Zolotusky.

Reported-by: Nate Clark <nate@neworld.us>
Cc: Vitaly Zolotusky <vitaly@unitc.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

03 Aug, 2017 · 1 commit

fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio · 61c12b49

Committed by Ashish Samant on Jul 12, 2017

Commit 8fba54ae ("fuse: direct-io: don't dirty ITER_BVEC pages") fixes
the ITER_BVEC page deadlock for direct io in fuse by checking in
fuse_direct_io(), whether the page is a bvec page or not, before locking
it.  However, this check is missed when the "async_dio" mount option is
enabled.  In this case, set_page_dirty_lock() is called from the req->end
callback in request_end(), when the fuse thread is returning from userspace
to respond to the read request.  This will cause the same deadlock because
the bvec condition is not checked in this path.

Here is the stack of the deadlocked thread, while returning from userspace:

[13706.656686] INFO: task glusterfs:3006 blocked for more than 120 seconds.
[13706.657808] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[13706.658788] glusterfs       D ffffffff816c80f0     0  3006      1
0x00000080
[13706.658797]  ffff8800d6713a58 0000000000000086 ffff8800d9ad7000
ffff8800d9ad5400
[13706.658799]  ffff88011ffd5cc0 ffff8800d6710008 ffff88011fd176c0
7fffffffffffffff
[13706.658801]  0000000000000002 ffffffff816c80f0 ffff8800d6713a78
ffffffff816c790e
[13706.658803] Call Trace:
[13706.658809]  [<ffffffff816c80f0>] ? bit_wait_io_timeout+0x80/0x80
[13706.658811]  [<ffffffff816c790e>] schedule+0x3e/0x90
[13706.658813]  [<ffffffff816ca7e5>] schedule_timeout+0x1b5/0x210
[13706.658816]  [<ffffffff81073ffb>] ? gup_pud_range+0x1db/0x1f0
[13706.658817]  [<ffffffff810668fe>] ? kvm_clock_read+0x1e/0x20
[13706.658819]  [<ffffffff81066909>] ? kvm_clock_get_cycles+0x9/0x10
[13706.658822]  [<ffffffff810f5792>] ? ktime_get+0x52/0xc0
[13706.658824]  [<ffffffff816c6f04>] io_schedule_timeout+0xa4/0x110
[13706.658826]  [<ffffffff816c8126>] bit_wait_io+0x36/0x50
[13706.658828]  [<ffffffff816c7d06>] __wait_on_bit_lock+0x76/0xb0
[13706.658831]  [<ffffffffa0545636>] ? lock_request+0x46/0x70 [fuse]
[13706.658834]  [<ffffffff8118800a>] __lock_page+0xaa/0xb0
[13706.658836]  [<ffffffff810c8500>] ? wake_atomic_t_function+0x40/0x40
[13706.658838]  [<ffffffff81194d08>] set_page_dirty_lock+0x58/0x60
[13706.658841]  [<ffffffffa054d968>] fuse_release_user_pages+0x58/0x70 [fuse]
[13706.658844]  [<ffffffffa0551430>] ? fuse_aio_complete+0x190/0x190 [fuse]
[13706.658847]  [<ffffffffa0551459>] fuse_aio_complete_req+0x29/0x90 [fuse]
[13706.658849]  [<ffffffffa05471e9>] request_end+0xd9/0x190 [fuse]
[13706.658852]  [<ffffffffa0549126>] fuse_dev_do_write+0x336/0x490 [fuse]
[13706.658854]  [<ffffffffa054963e>] fuse_dev_write+0x6e/0xa0 [fuse]
[13706.658857]  [<ffffffff812a9ef3>] ? security_file_permission+0x23/0x90
[13706.658859]  [<ffffffff81205300>] do_iter_readv_writev+0x60/0x90
[13706.658862]  [<ffffffffa05495d0>] ? fuse_dev_splice_write+0x350/0x350
[fuse]
[13706.658863]  [<ffffffff812062a1>] do_readv_writev+0x171/0x1f0
[13706.658866]  [<ffffffff810b3d00>] ? try_to_wake_up+0x210/0x210
[13706.658868]  [<ffffffff81206361>] vfs_writev+0x41/0x50
[13706.658870]  [<ffffffff81206496>] SyS_writev+0x56/0xf0
[13706.658872]  [<ffffffff810257a1>] ? syscall_trace_leave+0xf1/0x160
[13706.658874]  [<ffffffff816cbb2e>] system_call_fastpath+0x12/0x71

Fix this by making should_dirty a fuse_io_priv parameter that can be
checked in fuse_aio_complete_req().

Reported-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

21 Apr, 2017 · 2 commits

fuse: Get rid of bdi_initialized · 7fbbe972

Committed by Jan Kara on Apr 12, 2017

It is not needed anymore since bdi is initialized whenever superblock
exists.

CC: Miklos Szeredi <miklos@szeredi.hu>
CC: linux-fsdevel@vger.kernel.org
Suggested-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>

fuse: Convert to separately allocated bdi · 5f7f7543

Committed by Jan Kara on Apr 12, 2017

Allocate struct backing_dev_info separately instead of embedding it
inside the superblock. This unifies handling of bdi among users.

CC: Miklos Szeredi <miklos@szeredi.hu>
CC: linux-fsdevel@vger.kernel.org
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>

18 Apr, 2017 · 4 commits

fuse: Add support for pid namespaces · 0b6e9ea0

Committed by Seth Forshee on Jul 02, 2014

When the userspace process servicing fuse requests is running in
a pid namespace then pids passed via the fuse fd are not being
translated into that process' namespace. Translation is necessary
for the pid to be useful to that process.

Since no use case currently exists for changing namespaces all
translations can be done relative to the pid namespace in use
when fuse_conn_init() is called. For fuse this translates to
mount time, and for cuse this is when /dev/cuse is opened. IO for
this connection from another namespace will return errors.

Requests from processes whose pid cannot be translated into the
target namespace will have a value of 0 for in.h.pid.

File locking changes based on previous work done by Eric
Biederman.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: convert fuse_conn.count from atomic_t to refcount_t · 095fc40a

Committed by Elena Reshetova on Mar 03, 2017

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
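
Why a dedicated reference-count type helps can be seen in a small single-threaded model. The sketch below is not the kernel's refcount_t: struct sketch_refcount and its helpers are hypothetical, and atomicity is ignored. It only illustrates the saturating behaviour, where a counter that hits its maximum sticks there instead of wrapping to zero.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_SATURATED UINT_MAX

struct sketch_refcount {
	unsigned int refs;
};

static void sketch_ref_inc(struct sketch_refcount *r)
{
	assert(r->refs != 0);  /* taking a reference on a dead object is a bug */
	if (r->refs != SKETCH_SATURATED)
		r->refs++;     /* may reach SKETCH_SATURATED and then sticks */
}

/* Returns true when the last reference is dropped and the object may be freed. */
static bool sketch_ref_dec_and_test(struct sketch_refcount *r)
{
	assert(r->refs != 0);
	if (r->refs == SKETCH_SATURATED)
		return false;  /* leak the object rather than free it while in use */
	return --r->refs == 0;
}
```

A plain wrapping counter that overflowed would reach zero while references were still outstanding, and the next put would free a live object; the saturating variant trades that use-after-free for a memory leak.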

fuse: convert fuse_req.count from atomic_t to refcount_t · ec99f6d3

Committed by Elena Reshetova on Mar 03, 2017

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: convert fuse_file.count from atomic_t to refcount_t · 4e8c2eb5

Committed by Elena Reshetova on Mar 03, 2017

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

23 Feb, 2017 · 1 commit

fuse: cleanup fuse_file refcounting · 267d8444

Committed by Miklos Szeredi on Feb 22, 2017

struct fuse_file is stored in file->private_data.  Make this always be a
counting reference for consistency.

This also allows fuse_sync_release() to call fuse_file_put() instead of
partially duplicating its functionality.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

14 Jan, 2017 · 1 commit

locking/atomic, kref: Add KREF_INIT() · 1e24edca

Committed by Peter Zijlstra on Nov 14, 2016

Since we need to change the implementation, stop exposing internals.

Provide KREF_INIT() to allow static initialization of struct kref.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

18 Oct, 2016 · 1 commit

fuse: fix root dentry initialization · 0ce267ff

Committed by Miklos Szeredi on Oct 18, 2016

Add missing dentry initialization to root dentry.

Fixes: f75fdf22 ("fuse: don't use ->d_time")
Reported-by: Andreas Reis <andreas.reis@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

01 Oct, 2016 · 4 commits

fuse: Use generic xattr ops · 703c7362

Committed by Seth Forshee on Aug 29, 2016

In preparation for posix acl support, rework fuse to use xattr handlers and
the generic setxattr/getxattr/listxattr callbacks.  Split the xattr code
out into its own file, and promote symbols to module-global scope as
needed.

Functionally these changes have no impact, as fuse still uses a single
handler for all xattrs which uses the old callbacks.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: get rid of fc->flags · 29433a29

Committed by Miklos Szeredi on Oct 01, 2016

Only two flags: "default_permissions" and "allow_other". All other flags
are handled via bitfields. So convert these two as well. They don't
change during the lifetime of the filesystem, so this is quite safe.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: Add posix ACL support · 60bcc88a

Committed by Seth Forshee on Aug 29, 2016

Add a new INIT flag, FUSE_POSIX_ACL, for negotiating ACL support with
userspace.  When it is set in the INIT response, ACL support will be
enabled.  ACL support also implies "default_permissions".

When ACL support is enabled, the kernel will cache and have responsibility
for enforcing ACLs.  ACL xattrs will be passed to userspace, which is
responsible for updating the ACLs in the filesystem, keeping the file mode
in sync, and inheritance of default ACLs when new filesystem nodes are
created.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: handle killpriv in userspace fs · 5e940c1d

Committed by Miklos Szeredi on Oct 01, 2016

Only the userspace filesystem can do the killing of suid/sgid without races.
So introduce an INIT flag and negotiate support for this.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

22 Sep, 2016 · 1 commit

fuse: Propagate dentry down to inode_change_ok() · 62490330

Committed by Jan Kara on May 26, 2016

To avoid clearing of capabilities or security related extended
attributes too early, inode_change_ok() will need to take dentry instead
of inode. Propagate it down to fuse_do_setattr().

Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>

31 Jul, 2016 · 1 commit

qstr: constify instances in fuse · 13983d06

Committed by Al Viro on Jul 20, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

30 Jun, 2016 · 2 commits

fuse: improve aio directIO write performance for size extending writes · 7879c4e5

Committed by Ashish Sangwan on Apr 07, 2016

While sending the blocking directIO in fuse, the write request is broken
into sub-requests, each of default size 128k, and all the requests are sent
in non-blocking background mode if async_dio mode is supported by libfuse.
The process which issues the write waits for the completion of all the
sub-requests. Sending multiple requests in parallel gives a chance to perform
parallel writes in the user space fuse implementation if it is
multi-threaded and hence improves the performance.

When there is a size extending aio dio write, we switch to blocking mode so
that we can properly update the size of the file after completion of the
writes. However, in this situation all the sub-requests are sent in a
serialized manner where the next request is sent only after receiving the
reply of the current request. Hence the multi-threaded user space
implementation is not utilized properly.

This patch changes the size extending aio dio behavior to exactly follow
blocking dio. For a multi-threaded fuse implementation with 10 threads and
a buffer size of 64MB for async directIO, we are getting double the speed.

Signed-off-by: Ashish Sangwan <ashishsangwan2@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: serialize dirops by default · 5c672ab3

Committed by Miklos Szeredi on Jun 30, 2016

Negotiate with userspace filesystems whether they support parallel readdir
and lookup.  Disable parallelism by default for fear of breaking fuse
filesystems.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 9902af79 ("parallel lookups: actual switch to rwsem")
Fixes: d9b3dbdc ("fuse: switch to ->iterate_shared()")

14 Mar, 2016 · 1 commit

fuse: Add reference counting for fuse_io_priv · 744742d6

Committed by Seth Forshee on Mar 11, 2016

The 'reqs' member of fuse_io_priv serves two purposes. The first is to track
the number of outstanding async requests to the server and to signal that
the io request is completed. The second is to be a reference count on the
structure to know when it can be freed.

For sync io requests these purposes can be at odds. fuse_direct_IO() wants
to block until the request is done, and since the signal is sent when
'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to
use the object after the userspace server has completed processing
requests. This leads to some handshaking and special casing that is
needlessly complicated and responsible for at least one race condition.

It's much cleaner and safer to maintain a separate reference count for the
object lifecycle and to let 'reqs' just be a count of outstanding requests
to the userspace server. Then we can know for sure when it is safe to free
the object without any handshaking or special cases.

The catch here is that most of the time these objects are stack allocated
and should not be freed. Initializing these objects with a single reference
that is never released prevents accidental attempts to free the objects.

Fixes: 9d5722b7 ("fuse: handle synchronous iocbs internally")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

10 Nov, 2015 · 1 commit

fuse: add support for SEEK_HOLE and SEEK_DATA in lseek · 0b5da8db

Committed by Ravishankar N on Jun 30, 2015

A useful performance improvement for accessing virtual machine images
via FUSE mount.

See https://bugzilla.redhat.com/show_bug.cgi?id=1220173 for a use-case
for glusterFS.

Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>

01 Jul, 2015 · 2 commits

fuse: separate pqueue for clones · c3696046

Committed by Miklos Szeredi on Jul 01, 2015

Make each fuse device clone refer to a separate processing queue.  The only
constraint on userspace code is that the request answer must be written to
the same device clone as it was read off.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: introduce per-instance fuse_dev structure · cc080e9e

Committed by Miklos Szeredi on Jul 01, 2015

Allow fuse device clones to be distinguished.  This patch just
adds the infrastructure by associating a separate "struct fuse_dev" with
each clone.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

openeuler / Kernel · synced successfully more than 1 year ago
