Commits · 8e4f3e15175ffab5d2126dc8e7c8cfcc1654a5aa · openeuler / Kernel

22 Jun, 2021 3 commits

fuse: fix illegal access to inode with reused nodeid · 15db1683

Committed by Amir Goldstein on Jun 21, 2021

Server responds to LOOKUP and other ops (READDIRPLUS/CREATE/MKNOD/...)
with outarg containing nodeid and generation.

If a fuse inode is found in inode cache with the same nodeid but different
generation, the existing fuse inode should be unhashed and marked "bad" and
a new inode with the new generation should be hashed instead.

This can happen, for example, with a passthrough fuse filesystem that returns
the real filesystem ino/generation on lookup and where real inode numbers
can get recycled due to real files being unlinked not via the fuse
passthrough filesystem.

With current code, this situation will not be detected and an old fuse
dentry that used to point to an older generation real inode, can be used to
access a completely new inode, which should be accessed only via the new
dentry.

Note that because the FORGET message carries the nodeid w/o generation, the
server should wait to get FORGET counts for the nlookup counts of the old
and reused inodes combined, before it can free the resources associated to
that nodeid.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

15db1683
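
The generation check described in 15db1683 can be sketched in user-space C as follows. This is a minimal illustration under stated assumptions, not the kernel code: sketch_inode, sketch_cache, sketch_find() and sketch_iget() are hypothetical names, and a small fixed array stands in for the inode hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a cached fuse inode. */
struct sketch_inode {
	uint64_t nodeid;
	uint64_t generation;
	bool hashed;	/* still reachable through the cache */
	bool bad;	/* stale: must no longer be used */
};

#define SKETCH_SLOTS 8

/* Hypothetical cache: a tiny array instead of the real inode hash. */
struct sketch_cache {
	struct sketch_inode *slot[SKETCH_SLOTS];
};

/* Return the hashed inode with this nodeid, or NULL if there is none. */
static struct sketch_inode *sketch_find(struct sketch_cache *c, uint64_t nodeid)
{
	for (size_t i = 0; i < SKETCH_SLOTS; i++) {
		struct sketch_inode *in = c->slot[i];

		if (in && in->hashed && in->nodeid == nodeid)
			return in;
	}
	return NULL;
}

/*
 * Resolve the (nodeid, generation) pair reported by the server.
 * 'fresh' is caller-provided storage used when a new inode is needed.
 * Returns NULL only if the toy cache is full.
 */
static struct sketch_inode *sketch_iget(struct sketch_cache *c, uint64_t nodeid,
					uint64_t generation,
					struct sketch_inode *fresh)
{
	struct sketch_inode *old = sketch_find(c, nodeid);

	assert(fresh != NULL);
	if (old && old->generation == generation)
		return old;		/* same object: reuse it */
	if (old) {
		old->hashed = false;	/* nodeid was recycled: unhash ... */
		old->bad = true;	/* ... and mark the stale inode bad */
	}
	for (size_t i = 0; i < SKETCH_SLOTS; i++) {
		if (!c->slot[i] || !c->slot[i]->hashed) {
			fresh->nodeid = nodeid;
			fresh->generation = generation;
			fresh->hashed = true;
			fresh->bad = false;
			c->slot[i] = fresh;
			return fresh;
		}
	}
	return NULL;
}
```

A dentry that still points at the old object would see bad set and could refuse to use it, while new lookups only ever find the inode with the new generation.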

fuse: Switch to fc_mount() for submounts · 29e0e4df

Committed by Greg Kurz on Jun 04, 2021

fc_mount() already handles the vfs_get_tree(), sb->s_umount
unlocking and vfs_create_mount() sequence. Using it greatly
simplifies fuse_dentry_automount().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

29e0e4df

fuse: Call vfs_get_tree() for submounts · 266eb3f2

Committed by Greg Kurz on Jun 04, 2021

We recently fixed an infinite loop by setting the SB_BORN flag on
submounts along with the write barrier needed by super_cache_count().
This is the job of vfs_get_tree() and FUSE shouldn't have to care
about the barrier at all.

Split out some code from fuse_dentry_automount() to the dedicated
fuse_get_tree_submount() handler for submounts and call vfs_get_tree().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

266eb3f2

09 Jun, 2021 3 commits

fuse: Fix infinite loop in sget_fc() · e4a9ccdd

Committed by Greg Kurz on Jun 04, 2021

We don't set the SB_BORN flag on submounts. This is wrong as these
superblocks are then considered as partially constructed or dying
in the rest of the code and can break some assumptions.

One such case is when you have a virtiofs filesystem with submounts
and you try to mount it again: virtio_fs_get_tree() tries to obtain
a superblock with sget_fc(). The logic in sget_fc() is to loop until
it has either found an existing matching superblock with SB_BORN set
or to create a brand new one. It is assumed that a superblock without
SB_BORN is transient and the loop is restarted. Forgetting to set
SB_BORN on submounts hence causes sget_fc() to retry forever.

Setting SB_BORN requires special care, i.e. a write barrier for
super_cache_count() which can check SB_BORN without taking any lock.
We should call vfs_get_tree() to deal with that but this requires
to have a proper ->get_tree() implementation for submounts, which
is a bigger piece of work. Go for a simple bug fix in the meantime.

Fixes: bf109c64 ("fuse: implement crossmounts")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

e4a9ccdd
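
The retry logic that e4a9ccdd describes can be modelled roughly as below. This is a sketch under assumptions, not sget_fc() itself: sketch_sb and sketch_sget() are hypothetical, and the max_passes bound exists only so the model terminates, whereas the loop described in the commit message has no such bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical superblock: a matching key plus the "born" state. */
struct sketch_sb {
	int key;
	bool born;
};

/*
 * Returns the pass on which a usable superblock is obtained (an existing
 * born one, or a brand new one when nothing matches), or -1 when
 * max_passes is exhausted.
 */
static int sketch_sget(const struct sketch_sb *sbs, size_t n, int key,
		       int max_passes)
{
	assert(max_passes > 0);
	for (int pass = 1; pass <= max_passes; pass++) {
		const struct sketch_sb *match = NULL;

		for (size_t i = 0; i < n; i++) {
			if (sbs[i].key == key) {
				match = &sbs[i];
				break;
			}
		}
		if (!match)
			return pass;	/* nothing matches: create a new one */
		if (match->born)
			return pass;	/* fully constructed: reuse it */
		/* Not born: assumed transient, so look again. */
	}
	return -1;
}
```

With a matching superblock that never becomes born, every pass takes the retry branch, which is the endless loop the fix avoids by setting SB_BORN on submounts.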

fuse: Fix crash if superblock of submount gets killed early · e3a43f2a

Committed by Greg Kurz on Jun 04, 2021

As soon as fuse_dentry_automount() does up_write(&sb->s_umount), the
superblock can theoretically be killed. If this happens before the
submount was added to the &fc->mounts list, fuse_mount_remove() later
crashes in list_del_init() because it assumes the submount to be
already there.

Add the submount before dropping sb->s_umount to fix the inconsistency.
It is okay to nest fc->killsb under sb->s_umount, we already do this
on the ->kill_sb() path.
Signed-off-by: Greg Kurz <groug@kaod.org>
Fixes: bf109c64 ("fuse: implement crossmounts")
Cc: stable@vger.kernel.org # v5.10+
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

e3a43f2a

fuse: Fix crash in fuse_dentry_automount() error path · d92d88f0

Committed by Greg Kurz on Jun 04, 2021

If fuse_fill_super_submount() returns an error, the error path
triggers a crash:

[   26.206673] BUG: kernel NULL pointer dereference, address: 0000000000000000
[...]
[   26.226362] RIP: 0010:__list_del_entry_valid+0x25/0x90
[...]
[   26.247938] Call Trace:
[   26.248300]  fuse_mount_remove+0x2c/0x70 [fuse]
[   26.248892]  virtio_kill_sb+0x22/0x160 [virtiofs]
[   26.249487]  deactivate_locked_super+0x36/0xa0
[   26.250077]  fuse_dentry_automount+0x178/0x1a0 [fuse]

The crash happens because fuse_mount_remove() assumes that the FUSE
mount was already added to the list under the FUSE connection, but this is
only done after fuse_fill_super_submount() has returned success.

This means that until fuse_fill_super_submount() has returned success,
the FUSE mount isn't actually owned by the superblock. We should thus
reclaim ownership by clearing sb->s_fs_info, which will skip the call
to fuse_mount_remove(), and perform rollback, like virtio_fs_get_tree()
already does for the root sb.

Fixes: bf109c64 ("fuse: implement crossmounts")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

d92d88f0
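
The ownership rule behind d92d88f0 can be illustrated with a small user-space model. All names here (sketch_list, sketch_mount, sketch_sb, sketch_kill_sb(), sketch_reclaim_mount()) are hypothetical; the sketch only shows why clearing s_fs_info on the error path lets teardown skip the list removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked list, in the spirit of the kernel's list_head. */
struct sketch_list {
	struct sketch_list *prev, *next;
};

struct sketch_mount {
	struct sketch_list entry;	/* link in the connection's mount list */
};

struct sketch_sb {
	struct sketch_mount *s_fs_info;	/* non-NULL: the sb owns the mount */
};

static void sketch_list_init(struct sketch_list *h)
{
	h->prev = h->next = h;
}

static void sketch_list_add(struct sketch_list *n, struct sketch_list *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void sketch_list_del_init(struct sketch_list *n)
{
	/* An entry that was never added (zeroed pointers) would fail here. */
	assert(n->prev && n->next);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	sketch_list_init(n);
}

/* Teardown: unlink the mount only if the superblock owns one. */
static bool sketch_kill_sb(struct sketch_sb *sb)
{
	if (!sb->s_fs_info)
		return false;
	sketch_list_del_init(&sb->s_fs_info->entry);
	sb->s_fs_info = NULL;
	return true;
}

/* Error path of a failed fill: take the mount back from the superblock. */
static struct sketch_mount *sketch_reclaim_mount(struct sketch_sb *sb)
{
	struct sketch_mount *fm = sb->s_fs_info;

	sb->s_fs_info = NULL;
	return fm;
}
```

The caller that reclaimed the mount is then responsible for rolling it back itself, as the commit message says virtio_fs_get_tree() already does for the root sb.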

12 Apr, 2021 2 commits

fuse: convert to fileattr · 72227eac

Committed by Miklos Szeredi on Apr 08, 2021

Since fuse just passes ioctl args through to/from server, converting to the
fileattr API is more involved than for most other filesystems.

Both .fileattr_set() and .fileattr_get() need to obtain an open file to
operate on.  The simplest way is with the following sequence:

  FUSE_OPEN
  FUSE_IOCTL
  FUSE_RELEASE

If this turns out to be a performance problem, it could be optimized for
the case when there's already a file (any file) open for the inode.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

72227eac
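
The OPEN, IOCTL, RELEASE sequence from 72227eac can be sketched with a stub server that merely records what it is sent. Everything here is hypothetical (sketch_server, sketch_send(), sketch_fileattr_get()); releasing the temporary file even when the ioctl step fails is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_op { SKETCH_OPEN, SKETCH_IOCTL, SKETCH_RELEASE };

/* Stub server: logs each request and can fail the ioctl step. */
struct sketch_server {
	enum sketch_op log[8];
	size_t nlog;
	int ioctl_err;	/* value the stub returns for SKETCH_IOCTL */
};

static int sketch_send(struct sketch_server *s, enum sketch_op op)
{
	assert(s->nlog < 8);
	s->log[s->nlog++] = op;
	return op == SKETCH_IOCTL ? s->ioctl_err : 0;
}

/* Obtain a temporary open file, run the ioctl on it, then release it. */
static int sketch_fileattr_get(struct sketch_server *s)
{
	int err = sketch_send(s, SKETCH_OPEN);

	if (err)
		return err;
	err = sketch_send(s, SKETCH_IOCTL);
	/* Assumption: the temporary file is released in either case. */
	sketch_send(s, SKETCH_RELEASE);
	return err;
}
```

Three round trips per attribute query is what the commit message flags as a possible performance problem.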

fuse: unsigned open flags · 54d601cb

Committed by Miklos Szeredi on Apr 07, 2021

Release helpers used signed int.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

54d601cb

08 Mar, 2021 1 commit

new helper: inode_wrong_type() · 6e3e2c43

Committed by Al Viro on Mar 01, 2021

inode_wrong_type(inode, mode) returns true if setting inode->i_mode
to given value would've changed the inode type.  We have enough of
those checks open-coded to make a helper worthwhile.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6e3e2c43
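
The check that 6e3e2c43 describes compares only the file-type bits of two modes. A user-space sketch, assuming the usual Linux octal encoding of the type mask and using hypothetical SKETCH_* names so that no system header is needed:

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the Linux file-type bits (illustrative names). */
#define SKETCH_IFMT	0170000u
#define SKETCH_IFDIR	0040000u
#define SKETCH_IFREG	0100000u

/* True if replacing current_mode with new_mode would change the type. */
static bool sketch_wrong_type(unsigned int current_mode, unsigned int new_mode)
{
	return ((current_mode ^ new_mode) & SKETCH_IFMT) != 0;
}

/* Usage: permission bits may differ freely, the type bits may not. */
static void sketch_wrong_type_demo(void)
{
	assert(!sketch_wrong_type(SKETCH_IFREG | 0644, SKETCH_IFREG | 0755));
	assert(sketch_wrong_type(SKETCH_IFREG | 0644, SKETCH_IFDIR | 0755));
}
```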

24 Jan, 2021 4 commits

fs: make helpers idmap mount aware · 549c7297

Committed by Christian Brauner on Jan 21, 2021

Extend some inode methods with an additional user namespace argument. A
filesystem that is aware of idmapped mounts will receive the user
namespace the mount has been marked with. This can be used for
additional permission checking and also to enable filesystems to
translate between uids and gids if they need to. We have implemented all
relevant helpers in earlier patches.

As requested we simply extend the existing inode method instead of
introducing new ones. This is a little more code churn but it's mostly
mechanical and doesn't leave us with additional inode methods.

Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

549c7297

stat: handle idmapped mounts · 0d56a451

Committed by Christian Brauner on Jan 21, 2021

The generic_fillattr() helper fills in the basic attributes associated
with an inode. Enable it to handle idmapped mounts. If the inode is
accessed through an idmapped mount map it into the mount's user
namespace before we store the uid and gid. If the initial user namespace
is passed nothing changes so non-idmapped mounts will see identical
behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-12-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

0d56a451

attr: handle idmapped mounts · 2f221d6f

Committed by Christian Brauner on Jan 21, 2021

When file attributes are changed most filesystems rely on the
setattr_prepare(), setattr_copy(), and notify_change() helpers for
initialization and permission checking. Let them handle idmapped mounts.
If the inode is accessed through an idmapped mount map it into the
mount's user namespace. Afterwards the checks are identical to
non-idmapped mounts. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.

Helpers that perform checks on the ia_uid and ia_gid fields in struct
iattr assume that ia_uid and ia_gid are intended values and have already
been mapped correctly at the userspace-kernelspace boundary as we
already do today. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

2f221d6f

namei: make permission helpers idmapped mount aware · 47291baa

Committed by Christian Brauner on Jan 21, 2021

The two helpers inode_permission() and generic_permission() are used by
the vfs to perform basic permission checking by verifying that the
caller is privileged over an inode. In order to handle idmapped mounts
we extend the two helpers with an additional user namespace argument.
On idmapped mounts the two helpers will make sure to map the inode
according to the mount's user namespace and then perform identical
permission checks to inode_permission() and generic_permission(). If the
initial user namespace is passed nothing changes so non-idmapped mounts
will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

47291baa
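
The four idmapped-mount commits above all rely on one idea: an inode's owner is mapped into the mount's user namespace before it is reported or checked, and the initial namespace leaves IDs unchanged. A toy model of that mapping follows. The extent layout is loosely inspired by uid_map, and every name (sketch_extent, sketch_userns, sketch_map_into_mnt()) is hypothetical rather than a kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_ID UINT32_MAX	/* "no mapping" marker */

/* One contiguous range of a hypothetical ID map. */
struct sketch_extent {
	uint32_t first;		/* first ID as seen through the mount */
	uint32_t lower_first;	/* first ID as stored in the inode */
	uint32_t count;		/* length of the range */
};

/* extents == NULL models the initial namespace: identity mapping. */
struct sketch_userns {
	const struct sketch_extent *extents;
	size_t nr;
};

/* Map an inode's stored ID into the mount's namespace. */
static uint32_t sketch_map_into_mnt(const struct sketch_userns *ns, uint32_t id)
{
	assert(ns != NULL);
	if (!ns->extents)
		return id;	/* non-idmapped mount: nothing changes */
	for (size_t i = 0; i < ns->nr; i++) {
		const struct sketch_extent *e = &ns->extents[i];

		if (id >= e->lower_first && id - e->lower_first < e->count)
			return e->first + (id - e->lower_first);
	}
	return SKETCH_INVALID_ID;
}
```

A stat-like helper would store the mapped value, and a permission helper would compare the caller against it, so both see the same view of ownership.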

10 Dec, 2020 1 commit

fuse: fix bad inode · 5d069dbe

Committed by Miklos Szeredi on Dec 10, 2020

Jan Kara's analysis of the syzbot report (edited):

  The reproducer opens a directory on FUSE filesystem, it then attaches
  dnotify mark to the open directory.  After that a fuse_do_getattr() call
  finds that attributes returned by the server are inconsistent, and calls
  make_bad_inode() which, among other things does:

          inode->i_mode = S_IFREG;

  This then confuses dnotify which doesn't tear down its structures
  properly and eventually crashes.

Avoid calling make_bad_inode() on a live inode: switch to a private flag on
the fuse inode.  Also add the test to ops which the bad_inode_ops would
have caught.

This bug goes back to the initial merge of fuse in 2.6.14...

Reported-by: syzbot+f427adf9324b92652ccc@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>

5d069dbe

12 Nov, 2020 5 commits

fuse: add a flag FUSE_OPEN_KILL_SUIDGID for open() request · 643a666a

Committed by Vivek Goyal on Oct 09, 2020

With FUSE_HANDLE_KILLPRIV_V2 support, server will need to kill suid/sgid/
security.capability on open(O_TRUNC), if server supports
FUSE_ATOMIC_O_TRUNC.

But server needs to kill suid/sgid only if caller does not have CAP_FSETID.
Given server does not have this information, client needs to send this info
to server.

So add a flag FUSE_OPEN_KILL_SUIDGID to fuse_open_in request which tells
server to kill suid/sgid (only if group execute is set).

This flag is added to the FUSE_OPEN request, as well as the FUSE_CREATE
request if the create was non-exclusive, since that might result in an
existing file being opened/truncated.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

643a666a

fuse: don't send ATTR_MODE to kill suid/sgid for handle_killpriv_v2 · 8981bdfd

Committed by Vivek Goyal on Oct 09, 2020

If client does a write() on a suid/sgid file, VFS will first call
fuse_setattr() with ATTR_KILL_S[UG]ID set. This requires sending setattr
to file server with ATTR_MODE set to kill suid/sgid. But to do that client
needs to know latest mode otherwise it is racy.

To reduce the race window, the current code first calls fuse_do_getattr() to get
latest ->i_mode and then resets suid/sgid bits and sends rest to server
with setattr(ATTR_MODE). This does not reduce the race completely but
narrows race window significantly.

With fc->handle_killpriv_v2 enabled, it should be possible to remove this
race completely. Do not kill suid/sgid with ATTR_MODE at all. It will be
killed by server when WRITE request is sent to server soon. This is
similar to fc->handle_killpriv logic. V2 is just more refined version of
protocol. Hence this patch does not send ATTR_MODE to kill suid/sgid if
fc->handle_killpriv_v2 is enabled.

This creates an issue if fc->writeback_cache is enabled. In that case
WRITE can be cached in guest and server might not see WRITE request and
hence will not kill suid/sgid. Miklos suggested that in such cases, we
should fallback to a writethrough WRITE instead and that will generate
WRITE request and kill suid/sgid. This patch implements that too.

But this relies on client seeing the suid/sgid set. If another client sets
suid/sgid and this client does not see it immediately, then we will not
fall back to writethrough WRITE. So this is one limitation with both
fc->handle_killpriv_v2 and fc->writeback_cache enabled. The two options
are not fully compatible, but this might be good enough for many use cases.

Note: This patch is not checking whether security.capability is set or not
when falling back to writethrough path. If suid/sgid is not set and
only security.capability is set, that will be taken care of by
file_remove_privs() call in ->writeback_cache path.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

8981bdfd

fuse: setattr should set FATTR_KILL_SUIDGID · 31792161

Committed by Vivek Goyal on Oct 09, 2020

If fc->handle_killpriv_v2 is enabled, we expect file server to clear
suid/sgid/security.capability upon chown/truncate/write as appropriate.

Upon truncate (ATTR_SIZE), suid/sgid are cleared only if caller does not
have CAP_FSETID.  File server does not know whether caller has CAP_FSETID
or not.  Hence set FATTR_KILL_SUIDGID upon truncate to let file server know
that caller does not have CAP_FSETID and it should kill suid/sgid as
appropriate.

On chown (ATTR_UID/ATTR_GID) suid/sgid need to be cleared irrespective of
capabilities of calling process, so set FATTR_KILL_SUIDGID unconditionally
in that case.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

31792161
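
The rules spelled out in 643a666a and 31792161 can be condensed into two small functions, one for the client's decision and one for what the server does with the hint. This is an illustrative sketch: the SKETCH_ATTR_* bit values and both function names are hypothetical, while the mode bits are local copies of the usual Linux octal values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical "what is being changed" bits (values illustrative only). */
#define SKETCH_ATTR_UID		(1u << 0)
#define SKETCH_ATTR_GID		(1u << 1)
#define SKETCH_ATTR_SIZE	(1u << 2)

/* Local copies of the Linux mode bits involved. */
#define SKETCH_ISUID	04000u
#define SKETCH_ISGID	02000u
#define SKETCH_IXGRP	00010u

/* Client side: should this setattr carry the "kill suid/sgid" hint? */
static bool sketch_want_kill_suidgid(unsigned int ia_valid, bool has_cap_fsetid)
{
	if (ia_valid & (SKETCH_ATTR_UID | SKETCH_ATTR_GID))
		return true;		/* chown: unconditional */
	if (ia_valid & SKETCH_ATTR_SIZE)
		return !has_cap_fsetid;	/* truncate: only without CAP_FSETID */
	return false;
}

/* Server side: drop suid; drop sgid only if group execute is set. */
static unsigned int sketch_kill_suidgid(unsigned int mode)
{
	mode &= ~SKETCH_ISUID;
	if (mode & SKETCH_IXGRP)
		mode &= ~SKETCH_ISGID;
	return mode;
}
```

The client has to make the capability decision because, as the commit messages note, the server cannot see whether the caller holds CAP_FSETID.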

fuse: always revalidate if exclusive create · df8629af

Committed by Miklos Szeredi on Nov 11, 2020

Failure to do so may result in EEXIST even if the file only exists in the
cache and not in the filesystem.

The atomic nature of O_EXCL mandates that the cached state should be
ignored and existence verified anew.
Reported-by: Ken Schalk <kschalk@nvidia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

df8629af
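
The rule from df8629af reduces to one extra condition in the revalidation decision. A sketch, assuming a hypothetical SKETCH_LOOKUP_EXCL flag and a simple "valid until" timestamp for the cached entry:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LOOKUP_EXCL (1u << 0)	/* hypothetical "exclusive create" flag */

/* Must the cached entry be checked against the server again? */
static bool sketch_must_revalidate(uint64_t valid_until, uint64_t now,
				   unsigned int flags)
{
	if (flags & SKETCH_LOOKUP_EXCL)
		return true;		/* O_EXCL: never trust the cache */
	return valid_until < now;	/* otherwise only once it has expired */
}

/* Usage: an unexpired entry is trusted unless the create is exclusive. */
static void sketch_revalidate_demo(void)
{
	assert(!sketch_must_revalidate(100, 50, 0));
	assert(sketch_must_revalidate(100, 50, SKETCH_LOOKUP_EXCL));
	assert(sketch_must_revalidate(100, 150, 0));
}
```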

fuse: get rid of fuse_mount refcount · 514b5e3f

Committed by Miklos Szeredi on Nov 11, 2020

Fuse mount now only ever has a refcount of one (before being freed) so the
count field is unnecessary.

Remove the refcounting and fold fuse_mount_put() into callers.  The only
caller of fuse_mount_put() where fm->fc was NULL is fuse_dentry_automount()
and here the fuse_conn_put() can simply be omitted.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

514b5e3f

09 Oct, 2020 1 commit

fuse: implement crossmounts · bf109c64

Committed by Max Reitz on Apr 21, 2020

FUSE servers can indicate crossmount points by setting FUSE_ATTR_SUBMOUNT
in fuse_attr.flags.  The inode will then be marked as S_AUTOMOUNT, and the
.d_automount implementation creates a new submount at that location, so
that the submount gets a distinct st_dev value.

Note that all submounts get a distinct superblock and a distinct st_dev
value, so for virtio-fs, even if the same filesystem is mounted more than
once on the host, none of its mount points will have the same st_dev.  We
need distinct superblocks because the superblock points to the root node,
but the different host mounts may show different trees (e.g. due to
submounts in some of them, but not in others).

Right now, this behavior is only enabled when fuse_conn.auto_submounts is
set, which is the case only for virtio-fs.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

bf109c64

18 Sep, 2020 1 commit

fuse: split fuse_mount off of fuse_conn · fcee216b

Committed by Max Reitz on May 06, 2020

We want to allow submounts for the same fuse_conn, but with different
superblocks so that each of the submounts has its own device ID.  To do
so, we need to split all mount-specific information off of fuse_conn
into a new fuse_mount structure, so that multiple mounts can share a
single fuse_conn.

We need to take care only to perform connection-level actions once (i.e.
when the fuse_conn and thus the first fuse_mount are established, or
when the last fuse_mount and thus the fuse_conn are destroyed).  For
example, fuse_sb_destroy() must not invoke fuse_send_destroy() until the
last superblock is released.

To do so, we keep track of which fuse_mount is the root mount and
perform all fuse_conn-level actions only when this fuse_mount is
involved.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fcee216b

10 Sep, 2020 1 commit

virtiofs: serialize truncate/punch_hole and dax fault path · 6ae330ca

Committed by Vivek Goyal on Aug 19, 2020

Currently in fuse we don't seem to have any lock which can serialize the fault
path with the truncate/punch_hole path. With dax support I need one for
the following reasons.

1. Dax requirement

  DAX fault code relies on inode size being stable for the duration of
  fault and want to serialize with truncate/punch_hole and they explicitly
  mention it.

  static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
                               const struct iomap_ops *ops)
        /*
         * Check whether offset isn't beyond end of file now. Caller is
         * supposed to hold locks serializing us with truncate / punch hole so
         * this is a reliable test.
         */
        max_pgoff = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);

2. Make sure there are no users of pages being truncated/punch_hole

  get_user_pages() might take references to page and then do some DMA
  to said pages. Filesystem might truncate those pages without knowing
  that a DMA is in progress or some I/O is in progress. So use
  dax_layout_busy_page() to make sure there are no such references
  and I/O is not in progress on said pages before moving ahead with
  truncation.

3. Limitation of kvm page fault error reporting

  If we are truncating file on host first and then removing mappings in
  guest later (truncate page cache etc), then this could lead to a
  problem with KVM. Say a mapping is in place in guest and truncation
  happens on host. Now if guest accesses that mapping, then host will
  take a fault and kvm will either exit to qemu or spin infinitely.

  IOW, before we do truncation on host, we need to make sure that guest
  inode does not have any mapping in that region or whole file.

4. virtiofs memory range reclaim

 Soon I will introduce the notion of being able to reclaim dax memory
 ranges from a fuse dax inode. There also I need to make sure that
 no I/O or fault is going on in the reclaimed range and nobody is using
 it so that range can be reclaimed without issues.

Currently if we take inode lock, that serializes read/write. But it does
not do anything for faults. So I add another semaphore fuse_inode->i_mmap_sem
for this purpose.  It can be used to serialize with faults.

As of now, I am taking this semaphore only in the dax fault path and
not in the regular fault path because existing code does not have one. Maybe
existing code can benefit from it as well to take care of some
races, but that we can fix later if need be. For now, I am focusing
only on the DAX path, which is a new path.

Also added logic to take fuse_inode->i_mmap_sem in
truncate/punch_hole/open(O_TRUNC) path to make sure file truncation and
fuse dax fault are mutually exclusive and avoid all the above problems.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

6ae330ca

19 May, 2020 1 commit

fuse: always allow query of st_dev · 5157da2c

Committed by Miklos Szeredi on May 19, 2020

Fuse mounts without "allow_other" are off-limits to all non-owners.  Yet it
makes sense to allow querying st_dev on the root, since this value is
provided by the kernel, not the userspace filesystem.

Allow statx(2) with a zero request mask to succeed on a fuse mount for all
users.
Reported-by: Nikolaus Rath <Nikolaus@rath.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

5157da2c

06 Feb, 2020 1 commit

fuse: Support RENAME_WHITEOUT flag · 519525fa

Committed by Vivek Goyal on Feb 05, 2020

Allow fuse to pass RENAME_WHITEOUT to fuse server.  Overlayfs on top of
virtiofs uses RENAME_WHITEOUT.

Without this patch renaming a directory in overlayfs (dir is on lower)
fails with -EINVAL. With this patch it works.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

519525fa

12 Nov, 2019 2 commits

fuse: verify nlink · c634da71

Committed by Miklos Szeredi on Nov 12, 2019

When adding a new hard link, make sure that i_nlink doesn't overflow.

Fixes: ac45d613 ("fuse: fix nlink after unlink")
Cc: <stable@vger.kernel.org> # v3.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

c634da71

fuse: verify attributes · eb59bd17

Committed by Miklos Szeredi on Nov 12, 2019

If a filesystem returns negative inode sizes, future reads on the file were
causing the cpu to spin on truncate_pagecache.

Create a helper to validate the attributes.  This now does two things:

 - check the file mode
 - check if the file size fits in i_size without overflowing
Reported-by: Arijit Banerjee <arijit@rubrik.com>
Fixes: d8a5ba45 ("[PATCH] FUSE - core")
Cc: <stable@vger.kernel.org> # v2.6.14
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

eb59bd17
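
The two checks listed in eb59bd17 can be sketched as a single validation helper. The names are hypothetical and the type constants are local copies of the usual Linux octal values; the size test assumes the in-kernel file size is a signed 64-bit quantity, so any server-supplied value above INT64_MAX would read back as negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the Linux file-type bits (illustrative names). */
#define SKETCH_IFMT	0170000u
#define SKETCH_IFIFO	0010000u
#define SKETCH_IFCHR	0020000u
#define SKETCH_IFDIR	0040000u
#define SKETCH_IFBLK	0060000u
#define SKETCH_IFREG	0100000u
#define SKETCH_IFLNK	0120000u
#define SKETCH_IFSOCK	0140000u

/* Is the type part of the mode one of the known file types? */
static bool sketch_valid_type(uint32_t mode)
{
	switch (mode & SKETCH_IFMT) {
	case SKETCH_IFIFO:
	case SKETCH_IFCHR:
	case SKETCH_IFDIR:
	case SKETCH_IFBLK:
	case SKETCH_IFREG:
	case SKETCH_IFLNK:
	case SKETCH_IFSOCK:
		return true;
	default:
		return false;
	}
}

/* Does the unsigned size from the server fit a signed 64-bit size? */
static bool sketch_valid_size(uint64_t size)
{
	return size <= (uint64_t)INT64_MAX;
}

/* Reject attributes that fail either check. */
static bool sketch_invalid_attr(uint32_t mode, uint64_t size)
{
	return !sketch_valid_type(mode) || !sketch_valid_size(size);
}
```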

23 Oct, 2019 1 commit

fuse: flush dirty data/metadata before non-truncate setattr · b24e7598

Committed by Miklos Szeredi on Oct 23, 2019

If writeback cache is enabled, then writes might get reordered with
chmod/chown/utimes.  The problem with this is that performing the write in
the fuse daemon might itself change some of these attributes.  In such case
the following sequence of operations will result in file ending up with the
wrong mode, for example:

  int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL);
  write (fd, "1", 1);
  fchown (fd, 0, 0);
  fchmod (fd, 04755);
  close (fd);

This patch fixes this by flushing pending writes before performing
chown/chmod/utimes.
Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

b24e7598

21 Oct, 2019 1 commit

fuse: don't advise readdirplus for negative lookup · 6c26f717

Committed by Miklos Szeredi on Oct 21, 2019

If the FUSE_READDIRPLUS_AUTO feature is enabled, then lookups on a
directory before/during readdir are used as an indication that READDIRPLUS
should be used instead of READDIR. However if the lookup turns out to be
negative, then selecting READDIRPLUS makes no sense.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

6c26f717

24 Sep, 2019 2 commits

fuse: kmemcg account fs data · dc69e98c

Committed by Khazhismel Kumykov on Sep 17, 2019

Account per-file, dentry, and inode data.

Blockdev/superblock and temporary per-request data were left alone, as
this usually isn't accounted.
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

dc69e98c

fuse: on 64-bit store time in d_fsdata directly · 30c6a23d

Committed by Khazhismel Kumykov on Sep 16, 2019

Implements the optimization noted in commit f75fdf22 ("fuse: don't
use ->d_time"), as the additional memory can be significant.  (In
particular, on SLAB configurations this 8-byte alloc becomes 32 bytes).
Per-dentry, this can consume significant memory.
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

30c6a23d
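
The optimization in 30c6a23d stores a 64-bit timestamp directly in a pointer-sized field when pointers are wide enough, avoiding a separate small allocation per dentry. A user-space sketch, with hypothetical names (sketch_dentry and the sketch_*_time helpers) and with the narrow-pointer fallback assumed to keep allocating:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical dentry with one pointer-sized private field. */
struct sketch_dentry {
	void *d_fsdata;
};

#if UINTPTR_MAX >= UINT64_MAX
/* 64-bit pointers: keep the value in the pointer field itself. */
static int sketch_init_time(struct sketch_dentry *d)
{
	d->d_fsdata = NULL;
	return 0;
}

static void sketch_set_time(struct sketch_dentry *d, uint64_t t)
{
	d->d_fsdata = (void *)(uintptr_t)t;
}

static uint64_t sketch_get_time(const struct sketch_dentry *d)
{
	return (uint64_t)(uintptr_t)d->d_fsdata;
}

static void sketch_release_time(struct sketch_dentry *d)
{
	(void)d;	/* nothing was allocated */
}
#else
/* Narrower pointers: fall back to a separately allocated 8-byte slot. */
static int sketch_init_time(struct sketch_dentry *d)
{
	d->d_fsdata = calloc(1, sizeof(uint64_t));
	return d->d_fsdata ? 0 : -1;
}

static void sketch_set_time(struct sketch_dentry *d, uint64_t t)
{
	*(uint64_t *)d->d_fsdata = t;
}

static uint64_t sketch_get_time(const struct sketch_dentry *d)
{
	return *(const uint64_t *)d->d_fsdata;
}

static void sketch_release_time(struct sketch_dentry *d)
{
	free(d->d_fsdata);
	d->d_fsdata = NULL;
}
#endif
```

Callers use the same four helpers on either configuration, so the storage choice stays invisible to the rest of the code.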

12 Sep, 2019 1 commit

fuse: delete dentry if timeout is zero · 8fab0106

Committed by Miklos Szeredi on Aug 15, 2018

Don't hold onto a dentry in the lru list if it needs to be looked up again
anyway at the next access.  Only do this if explicitly enabled, otherwise it
could result in a performance regression.

A more advanced version of this patch would periodically flush out dentries
from the lru which have gone stale.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

8fab0106

10 Sep, 2019 2 commits

fuse: convert readlink to simple api · 4c29afec

Committed by Miklos Szeredi on Sep 10, 2019

Also turn BUG_ON into gracefully recovered WARN_ON.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

4c29afec

fuse: flatten 'struct fuse_args' · d5b48543

Committed by Miklos Szeredi on Sep 10, 2019

...to make future expansion simpler.  The hierarchical structure is a
historical thing that does not serve any practical purpose.

The generated code is exactly the same before and after the patch.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

d5b48543

13 Feb, 2019 4 commits

fuse: Protect fi->nlookup with fi->lock · c9d8f5f0

Committed by Kirill Tkhai on Nov 09, 2018

This continues the previous patch and introduces the same protection for
the nlookup field.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

c9d8f5f0

fuse: Introduce fi->lock to protect write related fields · f15ecfef

Committed by Kirill Tkhai on Nov 09, 2018

To minimize contention of fc->lock, this patch introduces a new spinlock
to protect fuse_inode metadata:

fuse_inode:
	writectr
	writepages
	write_files
	queued_writes
	attr_version

inode:
	i_size
	i_nlink
	i_mtime
	i_ctime

Also, it protects the fields changed in fuse_change_attributes_common()
(too many to list).
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

f15ecfef

fuse: Convert fc->attr_version into atomic64_t · 4510d86f

Committed by Kirill Tkhai on Nov 09, 2018

This patch makes fc->attr_version of atomic64_t type, so fc->lock won't be
needed to read or modify it anymore.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

4510d86f
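
The idea in 4510d86f, a version counter that is read and bumped without taking a lock, can be sketched in user space with C11 atomics. This is an analogy rather than the kernel API: sketch_conn and its helpers are hypothetical, stdatomic.h stands in for atomic64_t, and the starting value of 1 is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical connection holding a lock-free version counter. */
struct sketch_conn {
	atomic_uint_fast64_t attr_version;
};

static void sketch_conn_init(struct sketch_conn *c)
{
	atomic_init(&c->attr_version, 1);
}

/* Read the current version; no lock is needed. */
static uint64_t sketch_attr_version_read(struct sketch_conn *c)
{
	return atomic_load(&c->attr_version);
}

/* Atomically advance the version and return the new value. */
static uint64_t sketch_attr_version_bump(struct sketch_conn *c)
{
	return atomic_fetch_add(&c->attr_version, 1) + 1;
}
```

Concurrent callers of sketch_attr_version_bump() each get a distinct value, which is the property the spinlock provided before.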

fuse: Add fuse_inode argument to fuse_prepare_release() · ebf84d0c

Committed by Kirill Tkhai on Nov 09, 2018

This is preparation for the next patches, which introduce a new fi->lock for
protection of ff->write_entry linked into fi->write_files.

This patch just passes the new argument to the function.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ebf84d0c

12 Dec, 2018 1 commit

fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS · 2e64ff15

Committed by Chad Austin on Dec 10, 2018

When FUSE_OPEN returns ENOSYS, the no_open bit is set on the connection.

Because the FUSE_RELEASE and FUSE_RELEASEDIR paths share code, this
incorrectly caused the FUSE_RELEASEDIR request to be dropped and never sent
to userspace.

Pass an isdir bool to distinguish between FUSE_RELEASE and FUSE_RELEASEDIR
inside of fuse_file_put.

Fixes: 7678ac50 ("fuse: support clients that don't implement 'open'")
Cc: <stable@vger.kernel.org> # v3.14
Signed-off-by: Chad Austin <chadaustin@fb.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

2e64ff15
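
The distinction introduced by 2e64ff15 comes down to one predicate: the no_open shortcut applies to regular files only. A sketch with a hypothetical function name, modelling only the decision and not the request itself:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Should a release request be sent to userspace?
 * no_open: the server answered FUSE_OPEN with ENOSYS earlier.
 * isdir:   the file being released is a directory.
 */
static bool sketch_should_send_release(bool no_open, bool isdir)
{
	if (no_open && !isdir)
		return false;	/* regular file that was never really opened */
	return true;
}

/* Usage: directories are always released, whatever no_open says. */
static void sketch_release_demo(void)
{
	assert(sketch_should_send_release(true, true));
	assert(!sketch_should_send_release(true, false));
	assert(sketch_should_send_release(false, false));
}
```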

03 Dec, 2018 2 commits

fuse: fix revalidation of attributes for permission check · d233c7dd

Committed by Miklos Szeredi on Dec 03, 2018

fuse_invalidate_attr() now sets fi->inval_mask instead of fi->i_time, hence
we need to check the inval mask in fuse_permission() as well.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 2f1e8196 ("fuse: allow fine grained attr cache invaldation")

d233c7dd

fuse: fix fsync on directory · a9c2d1e8

Committed by Miklos Szeredi on Dec 03, 2018

Commit ab2257e9 ("fuse: reduce size of struct fuse_inode") moved parts
of fields related to writeback on regular file and to directory caching
into a union.  However fuse_fsync_common() called from fuse_dir_fsync()
touches some writeback related fields, resulting in a crash.

Move writeback related parts from fuse_fsync_common() to fuse_fsync().
Reported-by: Brett Girton <btgirton@gmail.com>
Tested-by: Brett Girton <btgirton@gmail.com>
Fixes: ab2257e9 ("fuse: reduce size of struct fuse_inode")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

a9c2d1e8
