Commits · cc080e9e9be16ccf26135d366d7d2b65209f1d56 · openeuler / Kernel

01 Jul, 2015 · 11 commits

fuse: introduce per-instance fuse_dev structure · cc080e9e

Committed by Miklos Szeredi on Jul 01, 2015

Allow fuse device clones to be distinguished.  This patch just
adds the infrastructure by associating a separate "struct fuse_dev" with
each clone.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: add req flag for private list · 77cd9d48

Committed by Miklos Szeredi on Jul 01, 2015

When an unlocked request is aborted, it is moved from fpq->io to a private
list.  Then, after unlocking fpq->lock, the private list is processed and
the requests are finished off.

To protect the private list, we need to mark the request with a flag, so
that the list is not corrupted if the request is unlocked in the meantime.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: pqueue locking · 45a91cb1

Committed by Miklos Szeredi on Jul 01, 2015

Add a fpq->lock for protecting the members of struct fuse_pqueue and the
FR_LOCKED request flag.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: duplicate ->connected in pqueue · e96edd94

Committed by Miklos Szeredi on Jul 01, 2015

This will allow checking ->connected just with the processing queue lock.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: separate out processing queue · 3a2b5b9c

Committed by Miklos Szeredi on Jul 01, 2015

This is just two fields: fc->io and fc->processing.

This patch just rearranges the fields, no functional change.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: duplicate ->connected in iqueue · e16714d8

Committed by Miklos Szeredi on Jul 01, 2015

This will allow checking ->connected just with the input queue lock.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: separate out input queue · f88996a9

Committed by Miklos Szeredi on Jul 01, 2015

The input queue contains normal requests (fc->pending), forgets
(fc->forget_*) and interrupts (fc->interrupts).  There's also fc->waitq and
fc->fasync for waking up the readers of the fuse device when a request is
available.

The fc->reqctr is also moved to the input queue (assigned to the request
when the request is added to the input queue).

This patch just rearranges the fields, no functional change.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: req state use flags · 33e14b4d

Committed by Miklos Szeredi on Jul 01, 2015

Use flags for representing the state in fuse_req.  This is needed since
req->list will be protected by different locks in different states, hence
we'll want the state itself to be split into distinct bits, each protected
with the relevant lock in that state.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: simplify req states · 7a3b2c75

Committed by Miklos Szeredi on Jul 01, 2015

FUSE_REQ_INIT is actually the same state as FUSE_REQ_PENDING, and
FUSE_REQ_READING and FUSE_REQ_WRITING can be merged into a common
FUSE_REQ_IO state.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: use per req lock for lock/unlock_request() · dc00809a

Committed by Miklos Szeredi on Jul 01, 2015

Reuse req->waitq.lock for protecting FR_ABORTED and FR_LOCKED flags.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>

fuse: req use bitops · 825d6d33

Committed by Miklos Szeredi on Jul 01, 2015

Finer grained locking will mean there's no single lock to protect
modification of bitfields in fuse_req.

So move to using bitops.  Can use the non-atomic variants for those which
happen while the request definitely has only one reference.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
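
The move from bitfields to bit operations described in this commit can be illustrated in userspace. The following is a minimal sketch using C11 atomics as a stand-in for the kernel's bitops; the struct, the helper names and the bit numbers are hypothetical, and only the flag names FR_LOCKED and FR_ABORTED come from the log above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers, named after the FR_* flags in this series. */
enum { REQ_LOCKED = 0, REQ_ABORTED = 1 };

struct req {
	atomic_ulong flags;	/* one word, each bit updated independently */
};

/* Atomically set one bit without disturbing its neighbours. */
static void req_set_bit(struct req *r, int bit)
{
	atomic_fetch_or(&r->flags, 1UL << bit);
}

/* Atomically clear one bit. */
static void req_clear_bit(struct req *r, int bit)
{
	atomic_fetch_and(&r->flags, ~(1UL << bit));
}

/* Read the current value of one bit. */
static bool req_test_bit(struct req *r, int bit)
{
	return (atomic_load(&r->flags) >> bit) & 1UL;
}
```

With plain C bitfields, two threads updating neighbouring fields under different locks could overwrite each other's update, because the compiler reads and writes the whole containing word. Atomic read-modify-write on a single word avoids that.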

14 Mar, 2015 · 1 commit

fuse: handle synchronous iocbs internally · 9d5722b7

Committed by Christoph Hellwig on Feb 02, 2015

Based on a patch from Maxim Patlasov <MPatlasov@parallels.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

06 Jan, 2015 · 1 commit

fuse: add memory barrier to INIT · 9759bd51

Committed by Miklos Szeredi on Jan 06, 2015

Theoretically we need to order setting of various fields in fc with
fc->initialized.

No known bug reports related to this yet.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

12 Dec, 2014 · 4 commits

fuse: introduce fuse_simple_request() helper · 7078187a

Committed by Miklos Szeredi on Dec 12, 2014

The following pattern is repeated many times:

	req = fuse_get_req_nopages(fc);
	/* Initialize req->(in|out).args */
	fuse_request_send(fc, req);
	err = req->out.h.error;
	fuse_put_request(req);

Create a new replacement helper:

	/* Initialize args */
	err = fuse_simple_request(fc, &args);

In addition to reducing the code size, this will ease moving from the
complex arg-based to a simpler page-based I/O on the fuse device.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
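
The refactoring in this commit folds a repeated get/initialize/send/read-error/put sequence into one call. The following is a minimal userspace sketch of that shape. Every mock_* name and both structs are hypothetical stand-ins, not the kernel API, and the "server" is simulated by copying a preset error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* All mock_* names are hypothetical stand-ins, not the kernel API. */
struct mock_args {
	int opcode;
	int reply_error;	/* error the pretend server will report */
};

struct mock_req {
	struct mock_args args;
	int error;
};

static struct mock_req *mock_get_req(void)
{
	return calloc(1, sizeof(struct mock_req));
}

/* Pretend to send the request and wait for the reply. */
static void mock_request_send(struct mock_req *req)
{
	req->error = req->args.reply_error;
}

static void mock_put_request(struct mock_req *req)
{
	free(req);
}

/* One call replaces the get/init/send/read-error/put sequence. */
static int mock_simple_request(const struct mock_args *args)
{
	struct mock_req *req = mock_get_req();
	int err;

	if (!req)
		return -ENOMEM;
	req->args = *args;
	mock_request_send(req);
	err = req->error;
	mock_put_request(req);
	return err;
}
```

Callers then only fill in an argument struct and check one return value, so the request lifetime is handled in a single place.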

fuse: reduce max out args · f704dcb5

Committed by Miklos Szeredi on Dec 12, 2014

The third out-arg is never actually used.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: hold inode instead of path after release · baebccbe

Committed by Miklos Szeredi on Dec 12, 2014

path_put() in release could trigger a DESTROY request in fuseblk.  The
possible deadlock was worked around by doing the path_put() with
schedule_work().

This complexity isn't needed if we just hold the inode instead of the path.
Since we now flush all requests before destroying the super block we can be
sure that all held inodes will be dropped.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: flush requests on umount · 580640ba

Committed by Miklos Szeredi on Dec 12, 2014

Use fuse_abort_conn() instead of fuse_conn_kill() in fuse_put_super().
This flushes and aborts requests still on any queues.  But since we've
already reset fc->connected, those requests would not be useful anyway and
would be flushed when the fuse device is closed.

Next patches will rely on requests being flushed before the superblock is
destroyed.

Use fuse_abort_conn() in cuse_process_init_reply() too, since it makes no
difference there, and we can get rid of fuse_conn_kill().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

07 May, 2014 · 1 commit

fuse: pull iov_iter initializations up · d22a943f

Committed by Al Viro on Mar 16, 2014

... to fuse_direct_{read,write}().  The ->direct_IO() path uses the
iov_iter passed by the caller instead.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

28 Apr, 2014 · 4 commits

fuse: add renameat2 support · 1560c974

Committed by Miklos Szeredi on Apr 28, 2014

Support RENAME_EXCHANGE and RENAME_NOREPLACE flags on the userspace ABI.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: allow ctime flushing to userspace · ab9e13f7

Committed by Maxim Patlasov on Apr 28, 2014

The patch extends fuse_setattr_in, and extends the flush procedure
(fuse_flush_times()) called on ->write_inode() to send the ctime as well as
mtime.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: add .write_inode · 1e18bda8

Committed by Miklos Szeredi on Apr 28, 2014

...and flush mtime from this.  This allows us to use the kernel
infrastructure for writing out dirty metadata (mtime at this point, but
ctime in the next patches and also maybe atime).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: add __exit to fuse_ctl_cleanup · 7736e8cc

Committed by Fabian Frederick on Apr 23, 2014

fuse_ctl_cleanup is only called by __exit fuse_exit.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

02 Apr, 2014 · 3 commits

fuse: Fix O_DIRECT operations vs cached writeback misorder · ea8cd333

Committed by Pavel Emelyanov on Oct 10, 2013

The problem is:

1. write cached data to a file
2. read directly from the same file (via another fd)

The 2nd operation may read stale data, i.e. the data that was in the file
before the 1st op. The problem is in how fuse manages writeback.

When a direct op occurs, the core kernel code calls filemap_write_and_wait
to flush all the cached ops in flight. But fuse acks the writeback right
after the ->writepages callback exits w/o waiting for the real write to
happen. Thus the subsequent direct op proceeds while the real writeback
is still in flight. This is a problem for backends that reorder operations.

Fix this by making the fuse direct IO callback explicitly wait on the
in-flight writeback to finish.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: Trust kernel i_mtime only · b0aa7606

Committed by Maxim Patlasov on Dec 26, 2013

Let the kernel maintain i_mtime locally:
 - clear S_NOCMTIME
 - implement i_op->update_time()
 - flush mtime on fsync and last close
 - update i_mtime explicitly on truncate and fallocate

Fuse inode flag FUSE_I_MTIME_DIRTY serves as an indication that local i_mtime
should be flushed to the server eventually.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: Connection bit for enabling writeback · d5cd66c5

Committed by Pavel Emelyanov on Oct 10, 2013

Off (0) by default. Will be used in the next patches and will be turned
on at the very end.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

23 Jan, 2014 · 2 commits

fuse: support clients that don't implement 'open' · 7678ac50

Committed by Andrew Gallagher on Nov 05, 2013

open/release operations require userspace transitions to keep track
of the open count and to perform any FS-specific setup.  However,
for some purely read-only FSs which don't need to perform any setup
at open/release time, we can avoid the performance overhead of
calling into userspace for open/release calls.

This patch adds the necessary support to the fuse kernel modules to prevent
open/release operations from reaching userspace. When the client returns
ENOSYS, we avoid sending the subsequent release to userspace, and also
remember this so that future opens also don't trigger a userspace
operation.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
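
The "remember ENOSYS" behaviour described in this commit can be sketched in a few lines. This is a simplified userspace illustration, not the kernel code: toy_conn, toy_open and the callback type are hypothetical names, and the userspace round trip is modelled as a plain function call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_conn {
	bool no_open;	/* remembered: the server does not implement open */
	int calls;	/* how many times we crossed into "userspace" */
};

/* Hypothetical server callback: returns 0 or a negative errno. */
typedef int (*toy_open_fn)(void);

static int toy_open(struct toy_conn *fc, toy_open_fn server_open)
{
	int err;

	if (fc->no_open)
		return 0;	/* skip the round trip entirely */
	fc->calls++;
	err = server_open();
	if (err == -ENOSYS) {
		fc->no_open = true;
		return 0;	/* treat "not implemented" as success */
	}
	return err;
}

/* A server that never implemented open. */
static int toy_server_without_open(void)
{
	return -ENOSYS;
}
```

Only the first open pays for the round trip. Every later open on the same connection returns immediately.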

fuse: don't invalidate attrs when not using atime · 451418fc

Committed by Andrew Gallagher on Nov 05, 2013

Various read operations (e.g. readlink, readdir) invalidate the cached
attrs for atime changes.  This patch adds a new function
'fuse_invalidate_atime', which checks for a read-only super block and
avoids the attr invalidation in that case.
Signed-off-by: Andrew Gallagher <andrewjcg@fb.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

25 Oct, 2013 · 2 commits

fuse: rcu-delay freeing fuse_conn · dd3e2c55

Committed by Al Viro on Oct 03, 2013

makes ->permission() and ->d_revalidate() safety in RCU mode independent
from vfsmount_lock.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: introduce d_instantiate_no_diralias() · b70a80e7

Committed by Miklos Szeredi on Oct 01, 2013

...which just returns -EBUSY if a directory alias would be created.

This is to be used by fuse mkdir to make sure that a buggy or malicious
userspace filesystem doesn't do anything nasty.  Previously fuse used a
private mutex for this purpose, which can now go away.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

01 Oct, 2013 · 2 commits

fuse: writepages: handle same page rewrites · 8b284dc4

Committed by Miklos Szeredi on Oct 01, 2013

As Maxim Patlasov pointed out, it's possible to get a dirty page while its
copy is still under writeback, despite fuse_page_mkwrite() doing its thing
(direct IO).

This could result in two concurrent write requests for the same offset, with
data corruption if they get mixed up.

To prevent this, fuse needs to check and delay such writes.  This
implementation does this by:

 1. check if page is still under writeout, if so create a new, single page
    secondary request for it

 2. chain this secondary request onto the in-flight request

 2/a. if a secondary request for the same offset was already chained to the
    in-flight request, then just copy the contents of the page and discard
    the new secondary request.  This makes sure that each page will
    have at most two requests associated with it

 3. when the in-flight request finishes, send off all secondary requests
    chained onto it
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
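
The chaining steps listed in this commit can be sketched with a singly linked list. This is a simplified userspace illustration under stated assumptions: the toy_* names and the tiny page size are hypothetical, there is no locking, and the sketch updates an existing secondary request in place instead of first allocating a new one and then discarding it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 8	/* assumption: tiny page so the example stays readable */

struct toy_wreq {
	long offset;
	char data[TOY_PAGE_SIZE];
	struct toy_wreq *next;	/* secondary requests chained onto this one */
};

/*
 * Called when a page is dirtied again while `inflight` still covers it.
 * Returns 1 if a new secondary request was chained, 0 if an existing
 * secondary for the same offset was updated in place, -1 on allocation
 * failure.
 */
static int toy_chain_secondary(struct toy_wreq *inflight, long offset,
			       const char *page)
{
	struct toy_wreq *sec;

	for (sec = inflight->next; sec; sec = sec->next) {
		if (sec->offset == offset) {
			/* Step 2/a: keep one secondary per offset. */
			memcpy(sec->data, page, TOY_PAGE_SIZE);
			return 0;
		}
	}
	sec = calloc(1, sizeof(*sec));
	if (!sec)
		return -1;
	sec->offset = offset;
	memcpy(sec->data, page, TOY_PAGE_SIZE);
	sec->next = inflight->next;	/* step 2: chain onto in-flight request */
	inflight->next = sec;
	return 1;
}

/* Step 3: when the in-flight request finishes, detach its secondaries. */
static struct toy_wreq *toy_finish(struct toy_wreq *inflight)
{
	struct toy_wreq *chain = inflight->next;

	inflight->next = NULL;
	return chain;
}
```

Because a second rewrite of the same offset only overwrites the data of the already chained secondary, each page has at most two requests at any time: the in-flight one and one waiting.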

fuse: readdirplus: fix RCU walk · 6314efee

Committed by Miklos Szeredi on Oct 01, 2013

Doing dput(parent) is not valid in RCU walk mode.  In RCU mode it would
probably be okay to update the parent flags, but it's actually not
necessary most of the time...

So only set the FUSE_I_ADVISE_RDPLUS flag on the parent when the entry was
recently initialized by READDIRPLUS.

This is achieved by setting FUSE_I_INIT_RDPLUS on entries added by
READDIRPLUS and only dropping out of RCU mode if this flag is set.
FUSE_I_INIT_RDPLUS is cleared once the FUSE_I_ADVISE_RDPLUS flag is set in
the parent.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: stable@vger.kernel.org

03 Sep, 2013 · 1 commit

fuse: hotfix truncate_pagecache() issue · 06a7c3c2

Committed by Maxim Patlasov on Aug 30, 2013

The way fuse calls truncate_pagecache() from fuse_change_attributes()
is completely wrong because, w/o i_mutex held, we are never sure whether
'oldsize' and 'attr->size' are still valid by the time
truncate_pagecache(inode, oldsize, attr->size) executes. In fact, as soon as
we release fc->lock in the middle of fuse_change_attributes(), we completely
lose control of the actions which may happen to the given inode until we
reach truncate_pagecache(). The list of potentially dangerous actions includes
mmap-ed reads and writes, ftruncate(2) and write(2) extending file size.

The typical outcome of doing truncate_pagecache() with outdated arguments
is data corruption from the user's point of view. This is (in some sense)
acceptable in cases when the issue is triggered by a change of the file on
the server (i.e. externally wrt fuse operation), but it is absolutely
intolerable in scenarios when a single fuse client modifies a file without
any external intervention. A real life case I discovered with fsx-linux
looked like this:

1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends
FUSE_SETATTR to the server synchronously, but before getting fc->lock ...
2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP
to the server synchronously, then calls fuse_change_attributes(). The
latter updates i_size, releases fc->lock, but before comparing oldsize vs
attr->size..
3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and
updating attributes and i_size, but now oldsize is equal to
outarg.attr.size because i_size has just been updated (step 2). Hence,
fuse_do_setattr() returns w/o calling truncate_pagecache().
4. As soon as ftruncate(2) completes, the user extends file size by
write(2) making a hole in the middle of file, then reads data from the hole
either by read(2) or mmap-ed read. The user expects to get zero data from
the hole, but gets stale data because truncate_pagecache() is not executed
yet.

The scenario above illustrates one side of the problem: not truncating the
page cache even though we should. Another side corresponds to truncating
page cache too late, when the state of inode changed significantly.
Theoretically, the following is possible:

1. As in the previous scenario fuse_dentry_revalidate() discovered that
i_size changed (due to our own fuse_do_setattr()) and is going to call
truncate_pagecache() for some 'new_size' it believes valid right now. But
by the time that particular truncate_pagecache() is called ...
2. fuse_do_setattr() returns (either having called truncate_pagecache() or
not -- it doesn't matter).
3. The file is extended either by write(2) or ftruncate(2) or fallocate(2).
4. mmap-ed write makes a page in the extended region dirty.

The result will be the loss of the data the user wrote in the fourth step.

The patch is a hotfix resolving the issue in a simplistic way: skip the
dangerous i_size update and truncate_pagecache() if an operation changing
file size is in progress. This simplistic approach looks correct for the
cases w/o external changes. To handle external changes properly, more
sophisticated and intrusive techniques (e.g. an NFS-like one) would be
required. I'd like to postpone that until the issue is well discussed on the
mailing list(s).

Changed in v2:
- improved patch description to cover both sides of the issue.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: stable@vger.kernel.org

01 May, 2013 · 1 commit

fuse: add flag to turn on async direct IO · 60b9df7a

Committed by Miklos Szeredi on May 01, 2013

Without async DIO, write requests to a single file were always serialized.
With async DIO that's no longer the case.

So don't turn on async DIO by default for fear of breaking backward
compatibility.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

18 Apr, 2013 · 3 commits

fuse: truncate file if async dio failed · efb9fa9e

Committed by Maxim Patlasov on Dec 18, 2012

The patch improves error handling in fuse_direct_IO(): if we successfully
submitted several fuse requests on behalf of a synchronous direct write
extending the file and some of them failed, let's try to do our best to
clean up.

Changed in v2: reuse fuse_do_setattr(). Thanks to Brian for the suggestion.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: make fuse_direct_io() aware about AIO · 36cf66ed

Committed by Maxim Patlasov on Dec 14, 2012

The patch implements passing "struct fuse_io_priv *io" down the stack up to
fuse_send_read/write where it is used to submit the request asynchronously.
io->async==0 designates synchronous processing.

The non-trivial part of the patch is the changes in fuse_direct_io():
resources like fuse requests and user pages cannot be released immediately
in the async case.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: add support of async IO · 01e9d11a

Committed by Maxim Patlasov on Dec 14, 2012

The patch implements a framework to process an IO request asynchronously. The
idea is to associate several fuse requests with a single kiocb by means of
fuse_io_priv structure. The structure plays the same role for FUSE as 'struct
dio' for direct-io.c.

The framework is supposed to be used like this:
 - someone (who wants to process an IO asynchronously) allocates fuse_io_priv
   and initializes it setting 'async' field to non-zero value.
 - as soon as a fuse request is filled, it can be submitted (in a non-blocking
   way) by fuse_async_req_send()
 - when all submitted requests are ACKed by userspace, io->reqs drops to zero
   triggering aio_complete()

In case of IO initiated by libaio, aio_complete() will finish processing the
same way as in case of dio_complete() calling aio_complete(). But the
framework may also be used internally by FUSE when the initial IO request
was synchronous (from the user's perspective), but it's beneficial to process
it asynchronously. Then the caller should wait on kiocb explicitly and
aio_complete() will wake the caller up.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
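
The completion rule in this commit (io->reqs dropping to zero triggers completion) is a reference-counting pattern. The following is a minimal single-threaded sketch of it. The toy_io struct and its helpers are hypothetical, a `done` flag stands in for the point where aio_complete() would be called, and the extra reference held by the submitter is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for the fuse_io_priv idea. */
struct toy_io {
	int reqs;	/* outstanding requests plus the submitter's reference */
	int err;	/* first error reported by any request */
	long bytes;	/* bytes transferred by successful requests */
	bool done;	/* set where the real code would call aio_complete() */
};

static void toy_io_init(struct toy_io *io)
{
	io->reqs = 1;	/* submitter's reference: prevents early completion */
	io->err = 0;
	io->bytes = 0;
	io->done = false;
}

/* Account for one more request handed to the server. */
static void toy_io_submit(struct toy_io *io)
{
	io->reqs++;
}

/* Drop one reference; the last drop completes the whole IO. */
static void toy_io_put(struct toy_io *io, int err, long bytes)
{
	if (err && !io->err)
		io->err = err;
	else if (!err)
		io->bytes += bytes;
	if (--io->reqs == 0)
		io->done = true;
}
```

The submitter calls toy_io_put(io, 0, 0) once it has queued everything, so the IO cannot complete while requests are still being submitted, whatever order the replies arrive in.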

17 Apr, 2013 · 2 commits

fuse: add flag fc->initialized · 796523fb

Committed by Maxim Patlasov on Mar 21, 2013

The existing flag fc->blocked is used to suspend request allocation both when
many background requests have been submitted and during the period before
init_reply arrives from userspace. The next patch will skip blocking
allocations of synchronous requests (disregarding fc->blocked). This is mostly
OK, but we still need to suspend allocations if init_reply has not arrived
yet. The patch introduces the flag fc->initialized, which will serve this
purpose.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

fuse: make request allocations for background processing explicit · 8b41e671

Committed by Maxim Patlasov on Mar 21, 2013

There are two types of request processing in FUSE: synchronous (via
fuse_request_send()) and asynchronous (via adding to fc->bg_queue).

Fortunately, the type of processing is always known in advance, at the time
of request allocation. This preparatory patch utilizes this fact, making
fuse_get_req() aware of the type. The next patches will use it.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

07 Feb, 2013 · 1 commit

fuse: allow control of adaptive readdirplus use · 634734b6

Committed by Eric Wong on Feb 06, 2013

For some filesystems (e.g. GlusterFS), the costs of performing a
normal readdir and a readdirplus are identical.  Since adaptively
using readdirplus has no benefit for those systems, give
users/filesystems the option to control adaptive readdirplus use.

v2 of this patch incorporates Miklos's suggestion to simplify the code,
as well as improving consistency of macro names and documentation.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

01 Feb, 2013 · 1 commit

FUSE: Adapt readdirplus to application usage patterns · 4582a4ab

Committed by Feng Shuo on Jan 15, 2013

Use the same adaptive readdirplus mechanism as NFS:

http://permalink.gmane.org/gmane.linux.nfs/49299

If the user space implementation wants to disable readdirplus
temporarily, it can just return ENOTSUPP. The kernel will then
call it again with readdir.
Signed-off-by: Feng Shuo <steve.shuo.feng@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

openeuler / Kernel · synced successfully more than 1 year ago