Commits · 43a2898631a8beee66c1d64c1e860f43d96b2e91 · openeuler / Kernel

27 Nov, 2019 2 commits

fuse: fix Kconfig indentation · 8d66fcb7

Committed by Krzysztof Kozlowski on Nov 20, 2019

Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
	$ sed -e 's/^        /\t/' -i */Kconfig
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: fix leak of fuse_io_priv · f1ebdeff

Committed by Miklos Szeredi on Nov 25, 2019

exit_aio() is sometimes stuck in wait_for_completion() after aio is issued
with direct IO and the task receives a signal.

The reason is failure to call ->ki_complete() due to a leaked reference to
fuse_io_priv.  This happens in fuse_async_req_send() if
fuse_simple_background() returns an error (e.g. -EINTR).

In this case the error value is propagated via io->err, so return success
to not confuse callers.

This issue is tracked as a virtio-fs issue:
https://gitlab.com/virtio-fs/qemu/issues/14
Reported-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Fixes: 45ac96ed ("fuse: convert direct_io to simple api")
Cc: <stable@vger.kernel.org> # v5.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

22 Nov, 2019 3 commits

virtiofs: Use completions while waiting for queue to be drained · 724c15a4

Committed by Vivek Goyal on Oct 30, 2019

While we wait for the queue to finish draining, use completions instead of
usleep_range(). This is a better way of waiting for an event.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

virtiofs: Do not send forget request "struct list_head" element · 1efcf39e

Committed by Vivek Goyal on Oct 30, 2019

We are sending the whole virtio_fs_forget struct to the other end over the
virtqueue. The other end does not need to see elements like "struct list_head".
That's an internal detail of the guest kernel. Fix it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

virtiofs: Use a common function to send forget · 58ada94f

Committed by Vivek Goyal on Oct 30, 2019

Currently we are duplicating logic to send forgets at two
places. Consolidate the code by calling one helper function.

This also uses virtqueue_add_outbuf() instead of
virtqueue_add_sgs(). The former is simpler to call.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

16 Nov, 2019 1 commit

pipe: Allow pipes to have kernel-reserved slots · 6718b6f8

Committed by David Howells on Oct 16, 2019

Split pipe->ring_size into two numbers:

 (1) pipe->ring_size - indicates the hard size of the pipe ring.

 (2) pipe->max_usage - indicates the maximum number of pipe ring slots that
     userspace orchestrated events can fill.

This allows for a pipe that is both writable by the general kernel
notification facility and by userspace, allowing plenty of ring space for
notifications to be added whilst preventing userspace from being able to
pin too much unswappable kernel space.
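As a rough illustration of the split, the two limits can be pictured as two separate "is it full?" checks against the same count of used slots. This is only a sketch under stated assumptions: `struct ring_limits`, `full_for_user()` and `full_for_kernel()` are hypothetical names invented here, not the kernel's API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the two fields named above. */
struct ring_limits {
	unsigned int ring_size;	/* hard size of the ring */
	unsigned int max_usage;	/* slots userspace-driven events may fill */
};

/* Would a userspace-driven write find the ring full? */
static inline bool full_for_user(const struct ring_limits *r,
				 unsigned int used)
{
	return used >= r->max_usage;
}

/* Would a kernel notification find the ring full? */
static inline bool full_for_kernel(const struct ring_limits *r,
				   unsigned int used)
{
	return used >= r->ring_size;
}
```

With `ring_size = 16` and `max_usage = 8`, for example, userspace is cut off once 8 slots are used while kernel-originated entries can still use the remaining 8.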
Signed-off-by: David Howells <dhowells@redhat.com>

12 Nov, 2019 4 commits

virtiofs: Fix old-style declaration · 00929447

Committed by YueHaibing on Nov 11, 2019

The compiler expects the 'static' keyword to come first in a declaration, and we
get warnings like this with "make W=1":

fs/fuse/virtio_fs.c:687:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
fs/fuse/virtio_fs.c:692:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
fs/fuse/virtio_fs.c:1029:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: verify nlink · c634da71

Committed by Miklos Szeredi on Nov 12, 2019

When adding a new hard link, make sure that i_nlink doesn't overflow.

Fixes: ac45d613 ("fuse: fix nlink after unlink")
Cc: <stable@vger.kernel.org> # v3.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: verify write return · 8aab336b

Committed by Miklos Szeredi on Nov 12, 2019

Make sure filesystem is not returning a bogus number of bytes written.

Fixes: ea9b9907 ("fuse: implement perform_write")
Cc: <stable@vger.kernel.org> # v2.6.26
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: verify attributes · eb59bd17

Committed by Miklos Szeredi on Nov 12, 2019

If a filesystem returns negative inode sizes, future reads on the file were
causing the cpu to spin on truncate_pagecache.

Create a helper to validate the attributes.  This now does two things:

 - check the file mode
 - check if the file size fits in i_size without overflowing
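The two checks listed above could look roughly like the following user-space sketch. It is an assumption-laden illustration rather than the patch itself: `attr_is_invalid()` is a hypothetical name, the set of accepted file types is assumed to be the standard POSIX ones, and the size limit assumes a signed 64-bit file size.

```c
#define _XOPEN_SOURCE 700	/* for S_IF* constants, S_ISLNK and S_ISSOCK */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical helper: true if the attributes should be rejected. */
static bool attr_is_invalid(uint32_t mode, uint64_t size)
{
	bool known_type = S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode) ||
			  S_ISCHR(mode) || S_ISBLK(mode) || S_ISFIFO(mode) ||
			  S_ISSOCK(mode);

	/* A size above LLONG_MAX would turn negative once stored in a
	 * signed 64-bit file size. */
	return !known_type || size > (uint64_t)LLONG_MAX;
}
```

Rejecting such replies up front means later code never sees a negative size, which is the condition the commit message ties to the spin in truncate_pagecache.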
Reported-by: Arijit Banerjee <arijit@rubrik.com>
Fixes: d8a5ba45 ("[PATCH] FUSE - core")
Cc: <stable@vger.kernel.org> # v2.6.14
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

31 Oct, 2019 1 commit

pipe: Use head and tail pointers for the ring, not cursor and length · 8cefc107

Committed by David Howells on Nov 15, 2019

Convert pipes to use head and tail pointers for the buffer ring rather than
pointer and length as the latter requires two atomic ops to update (or a
combined op) whereas the former only requires one.

 (1) The head pointer is the point at which production occurs and points to
     the slot in which the next buffer will be placed.  This is equivalent
     to pipe->curbuf + pipe->nrbufs.

     The head pointer belongs to the write-side.

 (2) The tail pointer is the point at which consumption occurs.  It points
     to the next slot to be consumed.  This is equivalent to pipe->curbuf.

     The tail pointer belongs to the read-side.

 (3) head and tail are allowed to run to UINT_MAX and wrap naturally.  They
     are only masked off when the array is being accessed, e.g.:

	pipe->bufs[head & mask]

     This means that it is not necessary to have a dead slot in the ring as
     head == tail isn't ambiguous.

 (4) The ring is empty if "head == tail".

     A helper, pipe_empty(), is provided for this.

 (5) The occupancy of the ring is "head - tail".

     A helper, pipe_occupancy(), is provided for this.

 (6) The number of free slots in the ring is "pipe->ring_size - occupancy".

     A helper, pipe_space_for_user() is provided to indicate how many slots
     userspace may use.

 (7) The ring is full if "head - tail >= pipe->ring_size".

     A helper, pipe_full(), is provided for this.
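The ring arithmetic in points (1) to (7) can be sketched in user-space C roughly as follows. This is a minimal illustration, not the kernel code: the `ring_*` names are hypothetical stand-ins for the `pipe_*` helpers named above, and the sketch assumes the ring size is a power of two so that the mask is `ring_size - 1`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the helpers described above; ring_size is
 * assumed to be a power of two. */
static inline bool ring_empty(unsigned int head, unsigned int tail)
{
	return head == tail;
}

static inline unsigned int ring_occupancy(unsigned int head, unsigned int tail)
{
	/* Unsigned subtraction stays correct after head wraps past UINT_MAX. */
	return head - tail;
}

static inline bool ring_full(unsigned int head, unsigned int tail,
			     unsigned int ring_size)
{
	return ring_occupancy(head, tail) >= ring_size;
}

static inline unsigned int ring_slot(unsigned int index, unsigned int ring_size)
{
	/* Mask only when indexing the array. */
	return index & (ring_size - 1);
}
```

Because the indices are free-running and only masked on access, `head == tail` always means "empty" and a completely full ring is still distinguishable, which is why no dead slot is needed.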
Signed-off-by: David Howells <dhowells@redhat.com>

23 Oct, 2019 5 commits

compat_ioctl: move more drivers to compat_ptr_ioctl · 1832f2d8

Committed by Arnd Bergmann on Sep 11, 2018

The .ioctl and .compat_ioctl file operations have the same prototype so
they can both point to the same function, which works great almost all
the time when all the commands are compatible.

One exception is the s390 architecture, where a compat pointer is only
31 bit wide, and converting it into a 64-bit pointer requires calling
compat_ptr(). Most drivers here will never run in s390, but since we now
have a generic helper for it, it's easy enough to use it consistently.
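To make the s390 exception concrete, here is a small user-space model of the conversion. It is a sketch only: the function names are hypothetical, and it assumes (as the text above implies) that the top bit of a 31-bit compat pointer is not an address bit and must be cleared, whereas elsewhere the value is simply zero-extended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an s390-style compat pointer conversion:
 * only the low 31 bits of the 32-bit value are address bits. */
static inline uint64_t compat_ptr_s390_model(uint32_t uptr)
{
	return (uint64_t)(uptr & 0x7fffffffU);
}

/* Hypothetical model for other architectures: plain zero-extension. */
static inline uint64_t compat_ptr_generic_model(uint32_t uptr)
{
	return (uint64_t)uptr;
}
```

A handler that passes the raw argument through would behave like the generic model, so it is correct everywhere except where the top bit needs masking.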

I double-checked all these drivers to ensure that all ioctl arguments
are used as pointers or are ignored, but are not interpreted as integer
values.
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

fuse: redundant get_fuse_inode() calls in fuse_writepages_fill() · 091d1a72

Committed by Vasily Averin on Aug 19, 2019

Currently fuse_writepages_fill() calls get_fuse_inode() a few times with
the same argument.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: truncate pending writes on O_TRUNC · e4648309

Committed by Miklos Szeredi on Oct 23, 2019

Make sure cached writes are not reordered around open(..., O_TRUNC), with
the obvious wrong results.

Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: flush dirty data/metadata before non-truncate setattr · b24e7598

Committed by Miklos Szeredi on Oct 23, 2019

If writeback cache is enabled, then writes might get reordered with
chmod/chown/utimes.  The problem with this is that performing the write in
the fuse daemon might itself change some of these attributes.  In such case
the following sequence of operations will result in file ending up with the
wrong mode, for example:

  int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL);
  write (fd, "1", 1);
  fchown (fd, 0, 0);
  fchmod (fd, 04755);
  close (fd);

This patch fixes this by flushing pending writes before performing
chown/chmod/utimes.
Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

virtiofs: Remove set but not used variable 'fc' · 80da5a80

Committed by zhengbin on Oct 23, 2019

Fixes gcc '-Wunused-but-set-variable' warning:

fs/fuse/virtio_fs.c: In function virtio_fs_wake_pending_and_unlock:
fs/fuse/virtio_fs.c:983:20: warning: variable fc set but not used [-Wunused-but-set-variable]

It is not used since commit 7ee1e2e6 ("virtiofs: No need to check
fpq->connected state")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

21 Oct, 2019 7 commits

virtiofs: Retry request submission from worker context · a9bfd9dd

Committed by Vivek Goyal on Oct 15, 2019

If the regular request queue gets full, we currently sleep for a bit and
retry submission in the submitter's context. This assumes the submitter is not
holding any spin lock. But this assumption is not true for background
requests. For background requests, we are called with fc->bg_lock held.

This can lead to deadlock where one thread is trying submission with
fc->bg_lock held while request completion thread has called
fuse_request_end() which tries to acquire fc->bg_lock and gets blocked. As
request completion thread gets blocked, it does not make further progress
and that means queue does not get empty and submitter can't submit more
requests.

To solve this issue, retry submission with the help of a worker, instead of
retrying in submitter's context. We already do this for hiprio/forget
requests.
Reported-by: Chirantan Ekbote <chirantan@chromium.org>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

virtiofs: Count pending forgets as in_flight forgets · c17ea009

Committed by Vivek Goyal on Oct 15, 2019

If virtqueue is full, we put forget requests on a list and these forgets
are dispatched later using a worker. As of now we don't count these forgets
in fsvq->in_flight variable. This means when queue is being drained, we
have to have special logic to first drain these pending requests and then
wait for fsvq->in_flight to go to zero.

By counting pending forgets in fsvq->in_flight, we can get rid of special
logic and just wait for in_flight to go to zero. Worker thread will kick
and drain all the forgets anyway, leading in_flight to zero.

I also need similar logic for normal request queue in next patch where I am
about to defer request submission in the worker context if queue is full.

This simplifies the code a bit.

Also add two helper functions to inc/dec in_flight. The decrement helper will
later be used to signal a completion when in_flight reaches zero.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

virtiofs: Set FR_SENT flag only after request has been sent · 5dbe190f

Committed by Vivek Goyal on Oct 15, 2019

The FR_SENT flag should be set only after the request has been sent successfully
over the virtqueue. This is used by the interrupt logic to figure out whether an
interrupt request should be sent or not.

Also add the request to the fpq->processing list after sending it successfully.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

virtiofs: No need to check fpq->connected state · 7ee1e2e6

Committed by Vivek Goyal on Oct 15, 2019

In virtiofs we keep per queue connected state in virtio_fs_vq->connected
and use that to end request if queue is not connected. And virtiofs does
not even touch fpq->connected state.

We probably need to merge these two at some point of time. For now,
simplify the code a bit and do not worry about checking state of
fpq->connected.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

virtiofs: Do not end request in submission context · 51fecdd2

Committed by Vivek Goyal on Oct 15, 2019

Submission context can hold some locks which end request code tries to hold
again and deadlock can occur. For example, fc->bg_lock. If a background
request is being submitted, it might hold fc->bg_lock and if we could not
submit request (because device went away) and tried to end request, then
deadlock happens. During testing, I also got a warning from deadlock
detection code.

So put requests on a list and end requests from a worker thread.

I got following warning from deadlock detector.

[  603.137138] WARNING: possible recursive locking detected
[  603.137142] --------------------------------------------
[  603.137144] blogbench/2036 is trying to acquire lock:
[  603.137149] 00000000f0f51107 (&(&fc->bg_lock)->rlock){+.+.}, at: fuse_request_end+0xdf/0x1c0 [fuse]
[  603.140701]
[  603.140701] but task is already holding lock:
[  603.140703] 00000000f0f51107 (&(&fc->bg_lock)->rlock){+.+.}, at: fuse_simple_background+0x92/0x1d0 [fuse]
[  603.140713]
[  603.140713] other info that might help us debug this:
[  603.140714]  Possible unsafe locking scenario:
[  603.140714]
[  603.140715]        CPU0
[  603.140716]        ----
[  603.140716]   lock(&(&fc->bg_lock)->rlock);
[  603.140718]   lock(&(&fc->bg_lock)->rlock);
[  603.140719]
[  603.140719]  *** DEADLOCK ***
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: don't advise readdirplus for negative lookup · 6c26f717

Committed by Miklos Szeredi on Oct 21, 2019

If the FUSE_READDIRPLUS_AUTO feature is enabled, then lookups on a
directory before/during readdir are used as an indication that READDIRPLUS
should be used instead of READDIR. However if the lookup turns out to be
negative, then selecting READDIRPLUS makes no sense.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: don't dereference req->args on finished request · 2b319d1f

Committed by Miklos Szeredi on Oct 21, 2019

Move the check for async request after check for the request being already
finished and done with.

Reported-by: syzbot+ae0bb7aae3de6b4594e2@syzkaller.appspotmail.com
Fixes: d4993774 ("fuse: stop copying args to fuse_req")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

15 Oct, 2019 1 commit

virtio-fs: don't show mount options · 3f22c746

Committed by Miklos Szeredi on Oct 15, 2019

Virtio-fs does not accept any mount options, so it's confusing and wrong to
show any in /proc/mounts.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

14 Oct, 2019 1 commit

virtio-fs: Change module name to virtiofs.ko · 112e7237

Committed by Vivek Goyal on Oct 11, 2019

We have been calling it virtio_fs and even file name is virtio_fs.c. Module
name is virtio_fs.ko but when registering file system user is supposed to
specify filesystem type as "virtiofs".

Masayoshi Mizuma reported that he specified filesystem type as "virtio_fs"
and got this warning on console.

  ------------[ cut here ]------------
  request_module fs-virtio_fs succeeded, but still no fs?
  WARNING: CPU: 1 PID: 1234 at fs/filesystems.c:274 get_fs_type+0x12c/0x138
  Modules linked in: ... virtio_fs fuse virtio_net net_failover ...
  CPU: 1 PID: 1234 Comm: mount Not tainted 5.4.0-rc1 #1

So looks like kernel could find the module virtio_fs.ko but could not find
filesystem type after that.

It probably is better to rename module name to virtiofs.ko so that above
warning goes away in case user ends up specifying wrong fs name.
Reported-by: Masayoshi Mizuma <msys.mizuma@gmail.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

24 Sep, 2019 7 commits

fuse: Make fuse_args_to_req static · 5addcd5d

Committed by YueHaibing on Sep 23, 2019

Fix sparse warning:

fs/fuse/dev.c:468:6: warning: symbol 'fuse_args_to_req' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Fixes: 68583165 ("fuse: add pages to fuse_args")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: fix memleak in cuse_channel_open · 9ad09b19

Committed by zhengbin on Aug 14, 2019

If cuse_send_init fails, need to fuse_conn_put cc->fc.

cuse_channel_open->fuse_conn_init->refcount_set(&fc->count, 1)
                 ->fuse_dev_alloc->fuse_conn_get
                 ->fuse_dev_free->fuse_conn_put

Fixes: cc080e9e ("fuse: introduce per-instance fuse_dev structure")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: fix beyond-end-of-page access in fuse_parse_cache() · e5854b1c

Committed by Tejun Heo on Sep 22, 2019

With DEBUG_PAGEALLOC on, the following triggers.

  BUG: unable to handle page fault for address: ffff88859367c000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 3001067 P4D 3001067 PUD 406d3a8067 PMD 406d30c067 PTE 800ffffa6c983060
  Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
  CPU: 38 PID: 3110657 Comm: python2.7
  RIP: 0010:fuse_readdir+0x88f/0xe7a [fuse]
  Code: 49 8b 4d 08 49 39 4e 60 0f 84 44 04 00 00 48 8b 43 08 43 8d 1c 3c 4d 01 7e 68 49 89 dc 48 03 5c 24 38 49 89 46 60 8b 44 24 30 <8b> 4b 10 44 29 e0 48 89 ca 48 83 c1 1f 48 83 e1 f8 83 f8 17 49 89
  RSP: 0018:ffffc90035edbde0 EFLAGS: 00010286
  RAX: 0000000000001000 RBX: ffff88859367bff0 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffff88859367bfed RDI: 0000000000920907
  RBP: ffffc90035edbe90 R08: 000000000000014b R09: 0000000000000004
  R10: ffff88859367b000 R11: 0000000000000000 R12: 0000000000000ff0
  R13: ffffc90035edbee0 R14: ffff889fb8546180 R15: 0000000000000020
  FS:  00007f80b5f4a740(0000) GS:ffff889fffa00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff88859367c000 CR3: 0000001c170c2001 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   iterate_dir+0x122/0x180
   __x64_sys_getdents+0xa6/0x140
   do_syscall_64+0x42/0x100
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

It's in fuse_parse_cache().  %rbx (ffff88859367bff0) is fuse_dirent
pointer - addr + offset.  FUSE_DIRENT_SIZE() is trying to dereference
namelen off of it but that derefs into the next page which is disabled
by pagealloc debug causing a PF.

This is caused by dirent->namelen being accessed before ensuring that
there's enough bytes in the page for the dirent.  Fix it by pushing
down reclen calculation.
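The ordering issue can be shown with a simplified user-space parser: first check that the fixed-size header fits in the remaining bytes, and only then read the name length and compute the record length. This is a sketch under assumptions, not the actual fix; `struct dirent_hdr` and `next_reclen()` are hypothetical, and the 8-byte record alignment is assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a directory entry header; layout is illustrative. */
struct dirent_hdr {
	uint64_t ino;
	uint64_t off;
	uint32_t namelen;
	uint32_t type;
};

/* Returns the record length of the entry at `offset`, or 0 if the entry
 * (header or name) does not fit in the `size` valid bytes of `buf`. */
static size_t next_reclen(const unsigned char *buf, size_t size, size_t offset)
{
	struct dirent_hdr hdr;
	size_t avail, reclen;

	if (offset > size)
		return 0;
	avail = size - offset;

	/* Check that the fixed header fits before touching namelen. */
	if (avail < sizeof(hdr))
		return 0;
	memcpy(&hdr, buf + offset, sizeof(hdr));

	/* Only now is it safe to look at namelen and compute the padded
	 * record length. */
	if (hdr.namelen > avail - sizeof(hdr))
		return 0;
	reclen = (sizeof(hdr) + hdr.namelen + 7) & ~(size_t)7;
	if (reclen > avail)
		return 0;
	return reclen;
}
```

In the crash above, the entry header started 16 bytes before the end of the page (offset 0xff0), so reading the name length first stepped into the next page; with the header check first, such an entry is rejected without any out-of-bounds read.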
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 5d7bc7e8 ("fuse: allow using readdir cache")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: unexport fuse_put_request · 0ed40593

Committed by Arnd Bergmann on Sep 18, 2019

This function has been made static, which now causes a compile-time
warning:

WARNING: "fuse_put_request" [vmlinux] is a static EXPORT_SYMBOL_GPL

Remove the unneeded export.

Fixes: 66abc359 ("fuse: unexport request ops")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: kmemcg account fs data · dc69e98c

Committed by Khazhismel Kumykov on Sep 17, 2019

Account per-file, dentry, and inode data.

Blockdev/superblock and temporary per-request data were left alone, as
this usually isn't accounted.
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: on 64-bit store time in d_fsdata directly · 30c6a23d

Committed by Khazhismel Kumykov on Sep 16, 2019

Implements the optimization noted in commit f75fdf22 ("fuse: don't
use ->d_time"), as the additional memory can be significant.  (In
particular, on SLAB configurations this 8-byte alloc becomes 32 bytes).
Per-dentry, this can consume significant memory.
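The idea can be sketched in user-space C as follows: when a pointer is at least 64 bits wide, the 64-bit timestamp is stored in the pointer-sized field itself instead of in a separate allocation. This is an illustration only; `struct fake_dentry` and the `dentry_*_time()` helpers are hypothetical names, and the 32-bit branch assumes the caller has already pointed `d_fsdata` at 64-bit storage.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a dentry with a pointer-sized private field. */
struct fake_dentry {
	void *d_fsdata;
};

#if UINTPTR_MAX >= UINT64_MAX
/* 64-bit: the timestamp fits in the pointer field itself, no allocation. */
static inline void dentry_set_time(struct fake_dentry *d, uint64_t time)
{
	d->d_fsdata = (void *)(uintptr_t)time;
}

static inline uint64_t dentry_get_time(const struct fake_dentry *d)
{
	return (uint64_t)(uintptr_t)d->d_fsdata;
}
#else
/* 32-bit: d_fsdata is assumed to point at separately allocated storage. */
static inline void dentry_set_time(struct fake_dentry *d, uint64_t time)
{
	*(uint64_t *)d->d_fsdata = time;
}

static inline uint64_t dentry_get_time(const struct fake_dentry *d)
{
	return *(const uint64_t *)d->d_fsdata;
}
#endif
```

On the 64-bit path no per-dentry allocation is made at all, which is where the memory saving described above comes from.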
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: fix missing unlock_page in fuse_writepage() · d5880c7a

Committed by Vasily Averin on Sep 13, 2019

unlock_page() was missing in case of an already in-flight write against the
same page.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Fixes: ff17be08 ("fuse: writepage: skip already in flight")
Cc: <stable@vger.kernel.org> # v3.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

19 Sep, 2019 1 commit

virtio-fs: add virtiofs filesystem · a62a8ef9

Committed by Stefan Hajnoczi on Jun 12, 2018

Add a basic file system module for virtio-fs.  This does not yet contain
shared data support between host and guest or metadata coherency speedups.
However it is already significantly faster than virtio-9p.

Design Overview
===============

With the goal of designing something with better performance and local file
system semantics, a bunch of ideas were proposed.

 - Use fuse protocol (instead of 9p) for communication between guest and
   host.  Guest kernel will be fuse client and a fuse server will run on
   host to serve the requests.

 - For data access inside guest, mmap portion of file in QEMU address space
   and guest accesses this memory using dax.  That way guest page cache is
   bypassed and there is only one copy of data (on host).  This will also
   enable mmap(MAP_SHARED) between guests.

 - For metadata coherency, there is a shared memory region which contains
   version number associated with metadata and any guest changing metadata
   updates version number and other guests refresh metadata on next access.
   This is yet to be implemented.

How virtio-fs differs from existing approaches
==============================================

The unique idea behind virtio-fs is to take advantage of the co-location of
the virtual machine and hypervisor to avoid communication (vmexits).

DAX allows file contents to be accessed without communication with the
hypervisor.  The shared memory region for metadata avoids communication in
the common case where metadata is unchanged.

By replacing expensive communication with cheaper shared memory accesses,
we expect to achieve better performance than approaches based on network
file system protocols.  In addition, this also makes it easier to achieve
local file system semantics (coherency).

These techniques are not applicable to network file system protocols since
the communications channel is bypassed by taking advantage of shared memory
on a local machine.  This is why we decided to build virtio-fs rather than
focus on 9P or NFS.

Caching Modes
=============

Like virtio-9p, different caching modes are supported which determine the
coherency level as well.  The “cache=FOO” and “writeback” options control
the level of coherence between the guest and host filesystems.

 - cache=none
   metadata, data and pathname lookup are not cached in guest.  They are
   always fetched from host and any changes are immediately pushed to host.

 - cache=always
   metadata, data and pathname lookup are cached in guest and never expire.

 - cache=auto
   metadata and pathname lookup cache expires after a configured amount of
   time (default is 1 second).  Data is cached while the file is open
   (close to open consistency).

 - writeback/no_writeback
   These options control the writeback strategy.  If writeback is disabled,
   then normal writes will immediately be synchronized with the host fs.
   If writeback is enabled, then writes may be cached in the guest until
   the file is closed or an fsync(2) performed.  This option has no effect
   on mmap-ed writes or writes going through the DAX mechanism.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

12 Sep, 2019 7 commits

fuse: allow skipping control interface and forced unmount · 15c8e72e

Committed by Vivek Goyal on May 06, 2019

virtio-fs does not support aborting requests which are being
processed, that is, requests which have been sent to the fuse daemon on the host.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: dissociate DESTROY from fuseblk · 783863d6

Committed by Miklos Szeredi on Aug 29, 2019

Allow virtio-fs to also send DESTROY request.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: delete dentry if timeout is zero · 8fab0106

Committed by Miklos Szeredi on Aug 15, 2018

Don't hold onto a dentry in the lru list if it needs to be looked up again
anyway at the next access.  Only do this if explicitly enabled, otherwise it
could result in a performance regression.

A more advanced version of this patch would periodically flush out dentries
from the lru which have gone stale.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: separate fuse device allocation and installation in fuse_conn · 0cd1eb9a

Committed by Vivek Goyal on Mar 06, 2019

As of now fuse_dev_alloc() both allocates a fuse device and installs it in
fuse_conn list. fuse_dev_alloc() can fail if fuse_device allocation fails.

virtio-fs needs to initialize multiple fuse devices (one per virtio queue).
It initializes one fuse device as part of call to fuse_fill_super_common()
and rest of the devices are allocated and installed after that.

But, we can't afford to fail after calling fuse_fill_super_common() as we
don't have a way to undo all the actions done by fuse_fill_super_common().
So to avoid failures after the call to fuse_fill_super_common(),
pre-allocate all fuse devices early and install them into fuse connection
later.

This patch provides two separate helpers for fuse device allocation and
fuse device installation in fuse_conn.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: add fuse_iqueue_ops callbacks · ae3aad77

Committed by Stefan Hajnoczi on Jun 18, 2018

The /dev/fuse device uses fiq->waitq and fasync to signal that requests are
available.  These mechanisms do not apply to virtio-fs.  This patch
introduces callbacks so alternative behavior can be used.

Note that queue_interrupt() changes along these lines:

  spin_lock(&fiq->waitq.lock);
  wake_up_locked(&fiq->waitq);
+ kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
  spin_unlock(&fiq->waitq.lock);
- kill_fasync(&fiq->fasync, SIGIO, POLL_IN);

Since queue_request() and queue_forget() also call kill_fasync() inside
the spinlock this should be safe.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: extract fuse_fill_super_common() · 0cc2656c

Committed by Stefan Hajnoczi on Jun 13, 2018

fuse_fill_super() includes code to process the fd= option and link the
struct fuse_dev to the fd's struct file.  In virtio-fs there is no file
descriptor because /dev/fuse is not used.

This patch extracts fuse_fill_super_common() so that both classic fuse and
virtio-fs can share the code to initialize a mount.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fuse: export fuse_dequeue_forget() function · 4388c5aa

Committed by Vivek Goyal on Jun 05, 2019

File systems like virtio-fs do not need to play directly with the forget
list data structures. There is a helper function; use that instead.

Rename dequeue_forget() to fuse_dequeue_forget() and export it so that
stacked filesystems can use it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
