Commits · 8d0980704842e8a68df2c3164c1c165e5c7ebc08 · openeuler / Kernel

23 Oct, 2019 · 27 commits

gfs2: add compat_ioctl support · 8d098070

Committed by Arnd Bergmann on Jun 03, 2019

Out of the four ioctl commands supported on gfs2, only FITRIM
works in compat mode.

Add a proper handler based on the ext4 implementation.

Fixes: 6ddc5c3d ("gfs2: getlabel support")
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

8d098070
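
The ext4-style approach mentioned in this commit can be sketched in userspace as a command translation step. This is only an illustration: the helper name is hypothetical, the return value 0 stands in for the kernel's -ENOIOCTLCMD, and the choice of the four commands (get/set flags, FITRIM, get label) is an assumption based on the uapi headers rather than on the patch itself.

```c
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC32_GETFLAGS, FITRIM, FS_IOC_GETFSLABEL */

/*
 * Hypothetical helper, not the kernel function: map a command issued by a
 * 32-bit task to the native command the regular ioctl handler understands.
 * Returns 0 for commands that are not on the supported list.
 */
unsigned long sketch_gfs2_compat_cmd(unsigned long cmd)
{
	/* The 32-bit flag commands encode sizeof(int) instead of sizeof(long). */
	if (cmd == FS_IOC32_GETFLAGS)
		return FS_IOC_GETFLAGS;
	if (cmd == FS_IOC32_SETFLAGS)
		return FS_IOC_SETFLAGS;
	/* These have the same number and layout for 32-bit and 64-bit callers. */
	if (cmd == FITRIM || cmd == FS_IOC_GETFSLABEL)
		return cmd;
	return 0;
}
```

After the translation, a real handler would pass the argument through compat_ptr() and call the native ioctl function.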

compat_ioctl: remove unused convert_in_user macro · 0581f186

Committed by Arnd Bergmann on Mar 14, 2019

The last users are all gone, so let's remove the macro as well.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

0581f186

compat_ioctl: remove last RAID handling code · caca7d10

Committed by Arnd Bergmann on Sep 07, 2018

Commit aa98aa31 ("md: move compat_ioctl handling into md.c")
already removed the COMPATIBLE_IOCTL() table entries and added
a complete implementation, but a few lines got left behind and
should also be removed here.

Cc: linux-raid@vger.kernel.org
Cc: Song Liu <song@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

caca7d10

compat_ioctl: remove /dev/raw ioctl translation · 50a2e74b

Committed by Arnd Bergmann on Sep 07, 2018

The /dev/rawX implementation already handles these just fine, so
the entries in the table are not needed any more.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

50a2e74b

compat_ioctl: remove PCI ioctl translation · a92d4f10

Committed by Arnd Bergmann on Sep 07, 2018

The /proc/pci/ implementation already handles these just fine, so
the entries in the table are not needed any more.

Cc: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

a92d4f10

compat_ioctl: remove joystick ioctl translation · aca94226

Committed by Arnd Bergmann on Sep 07, 2018

The joystick driver already handles these just fine, so
the entries in the table are not needed any more.

Cc: linux-input@vger.kernel.org
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

aca94226

compat_ioctl: remove /dev/random commands · 507e4e2b

Committed by Arnd Bergmann on Sep 07, 2018

These are all handled by the random driver, so instead of listing
each ioctl, we can use the generic compat_ptr_ioctl() helper.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

507e4e2b

compat_ioctl: remove IGNORE_IOCTL() · eede0b85

Committed by Arnd Bergmann on Sep 07, 2018

Since commit 07d106d0 ("vfs: fix up ENOIOCTLCMD error handling"),
we don't warn about unhandled compat-ioctl command codes any more, but
just return the same error that a native file descriptor returns when
there is no handler.

This means the IGNORE_IOCTL() annotations are completely useless and
can all be removed. TIOCSTART/TIOCSTOP and KDGHWCLK/KDSHWCLK fall into
the same category, but for some reason were listed as COMPATIBLE_IOCTL().
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

eede0b85

compat_ioctl: remove translation for sound ioctls · 2022ca0a

Committed by Arnd Bergmann on Sep 06, 2018

The SNDCTL_* and SOUND_* commands are the old OSS user interface.

I checked all the sound ioctl commands listed in fs/compat_ioctl.c
to see if we still need the translation handlers. Here is what I
found:

- sound/oss/ is (almost) gone from the kernel, this is what actually
  needed all the translations
- The ALSA emulation for OSS correctly handles all compat_ioctl
  commands already.
- sound/oss/dmasound/ is the last holdout of the original OSS code,
  this is only used on arch/m68k, which has no 64-bit mode and
  hence needs no compat handlers
- arch/um/drivers/hostaudio_kern.c may run in 64-bit mode with
  32-bit x86 user space underneath it. This rare corner case is
  the only one that still needs the compat handlers.

By adding a simple redirect of .compat_ioctl to .unlocked_ioctl in the
UML driver, we can remove all the COMPATIBLE_IOCTL() annotations without
a change in functionality. For completeness, I'm adding the same thing
to the dmasound file, knowing that it makes no difference.

The compat_ioctl list contains one comment about SNDCTL_DSP_MAPINBUF and
SNDCTL_DSP_MAPOUTBUF, which actually would need a translation handler
if implemented. However, the native implementation just returns -EINVAL,
so we don't care.
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

2022ca0a

compat_ioctl: remove HIDIO translation · 54b5b60a

Committed by Arnd Bergmann on Sep 06, 2018

The two drivers implementing these both gained proper compat_ioctl()
handlers a long time ago with commits bb6c8d8f ("HID: hiddev:
Add 32bit ioctl compatibilty") and ae5e49c7 ("HID: hidraw: add
compatibility ioctl() for 32-bit applications."), so the lists in
fs/compat_ioctl.c are no longer used.

It appears that the lists were also incomplete, so the translation
didn't actually work correctly when it was still in use.

Remove them as cleanup.

Cc: linux-bluetooth@vger.kernel.org
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

54b5b60a

compat_ioctl: remove HCIUART handling · 61798109

Committed by Arnd Bergmann on Mar 14, 2019

As of commit f0193d3e ("change semantics of ldisc ->compat_ioctl()"),
all hciuart ioctl commands are handled correctly in the driver, and we
never need to go through the table here.

Cc: linux-bluetooth@vger.kernel.org
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

61798109

compat_ioctl: move hci_sock handlers into driver · 7a6038b3

Committed by Arnd Bergmann on Mar 14, 2019

All these ioctl commands are compatible, so we can handle
them with a trivial wrapper in hci_sock.c and remove
the listing in fs/compat_ioctl.c.

A few of the commands pass integer arguments instead of
pointers, so for correctness skip the compat_ptr() conversion
for those.
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

7a6038b3

compat_ioctl: move rfcomm handlers into driver · 7d60a7a6

Committed by Arnd Bergmann on Mar 14, 2019

All these ioctl commands are compatible, so we can handle
them with a trivial wrapper in rfcomm/sock.c and remove
the listing in fs/compat_ioctl.c.
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

7d60a7a6

compat_ioctl: move isdn/capi ioctl translation into driver · 5565a3ca

Committed by Arnd Bergmann on Sep 06, 2018

Neither the old isdn4linux interface nor the newer mISDN stack
ever had working 32-bit compat mode as far as I can tell.

However, the CAPI stack has some ioctl commands that are
correctly listed in fs/compat_ioctl.c.

We can trivially move all of those into the corresponding
file that implements the native handlers by adding a compat_ioctl
redirect to that.

I did notice that treating CAPI_MANUFACTURER_CMD() as compatible
is broken, so I'm also adding a handler for that, realizing that
in all likelihood, nobody is ever going to call it.

Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: netdev@vger.kernel.org
Cc: isdn4linux@listserv.isdn4linux.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

5565a3ca

compat_ioctl: move ATYFB_CLK handling to atyfb driver · 0ba9841a

Committed by Arnd Bergmann on Mar 14, 2019

These are two obscure ioctl commands, in a driver that only
has compatible commands, so just let the driver handle this
itself.
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

0ba9841a

compat_ioctl: move tape handling into drivers · 1207045d

Committed by Arnd Bergmann on Sep 07, 2018

MTIOCPOS and MTIOCGET are incompatible between 32-bit and 64-bit user
space, and traditionally have been translated in fs/compat_ioctl.c.

To get rid of that translation handler, move a corresponding
implementation into each of the four drivers implementing those commands.

The interesting part of that is now in a new linux/mtio.h header that
wraps the existing uapi/linux/mtio.h header and provides an abstraction
to let drivers handle both cases easily. Using an in_compat_syscall()
check, the caller does not have to keep track of whether this was
called through .unlocked_ioctl() or .compat_ioctl().
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "Kai Mäkisara" <Kai.Makisara@kolumbus.fi>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

1207045d
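
The MTIOCPOS incompatibility described in this commit comes from a `long` member in the result structure. A rough userspace sketch of the idea of picking the layout at copy-out time follows; the `sketch_` names are hypothetical, the `in_compat` flag stands in for the kernel's in_compat_syscall() check, and memcpy() stands in for copy_to_user().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_mtpos   { long    mt_blkno; };  /* native layout */
struct sketch_mtpos32 { int32_t mt_blkno; };  /* layout seen by 32-bit user space */

/*
 * Write the tape position into ubuf using the layout the caller expects.
 * Returns the number of bytes written.
 */
size_t sketch_put_mtpos(void *ubuf, long blkno, int in_compat)
{
	if (in_compat) {
		struct sketch_mtpos32 pos = { .mt_blkno = (int32_t)blkno };

		memcpy(ubuf, &pos, sizeof(pos));
		return sizeof(pos);
	} else {
		struct sketch_mtpos pos = { .mt_blkno = blkno };

		memcpy(ubuf, &pos, sizeof(pos));
		return sizeof(pos);
	}
}
```

With a helper of this shape, a driver does not need to know whether it was entered through .unlocked_ioctl() or .compat_ioctl().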

compat_ioctl: move more drivers to compat_ptr_ioctl · 1832f2d8

Committed by Arnd Bergmann on Sep 11, 2018

The .ioctl and .compat_ioctl file operations have the same prototype so
they can both point to the same function, which works great almost all
the time when all the commands are compatible.

One exception is the s390 architecture, where a compat pointer is only
31 bit wide, and converting it into a 64-bit pointer requires calling
compat_ptr(). Most drivers here will never run in s390, but since we now
have a generic helper for it, it's easy enough to use it consistently.

I double-checked all these drivers to ensure that all ioctl arguments
are used as pointers or are ignored, but are not interpreted as integer
values.
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

1832f2d8

compat_ioctl: move drivers to compat_ptr_ioctl · 407e9ef7

Committed by Arnd Bergmann on Sep 11, 2018

Each of these drivers has a copy of the same trivial helper function to
convert the pointer argument and then call the native ioctl handler.

We now have a generic implementation of that, so use it.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

407e9ef7

compat_ioctl: move rtc handling into drivers/rtc/dev.c · 076ff658

Committed by Arnd Bergmann on Aug 24, 2018

We no longer need the rtc compat handling to be in common code, now that
all drivers are either moved to the rtc-class framework, or (rarely)
exist in drivers/char for architectures without compat mode (m68k,
alpha and ia64, respectively).

I checked the list of ioctl commands in drivers, and the ones that are
not already handled are all compatible, again with the one exception of
the m68k driver, which implements RTC_PLL_GET and RTC_PLL_SET, but has no
compat mode.

Unlike earlier versions of this patch, I'm now adding a separate
compat_ioctl handler that takes care of RTC_IRQP_READ32/RTC_IRQP_SET32
and treats all other commands as compatible, leaving the native
behavior unchanged.

The old conversion handler also deals with RTC_EPOCH_READ and
RTC_EPOCH_SET, which are not handled in rtc-dev.c but only in a single
device driver (rtc-vr41xx), so I'm adding the compat version in the same
place. I don't expect other drivers to need those commands in the future.
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
v4: handle RTC_EPOCH_SET32 in rtc_dev_compat_ioctl
v3: handle RTC_IRQP_READ32/RTC_IRQP_SET32 in rtc_dev_compat_ioctl
v2: merge compat handler into ioctl function to avoid the
    compat_alloc_user_space() roundtrip, based on feedback
    from Al Viro.

076ff658
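
Why RTC_IRQP_READ needs a separate 32-bit variant can be seen from how ioctl command numbers encode the argument size. The following is a minimal userspace sketch, assuming the native command is defined as _IOR('p', 0x0b, unsigned long) as in the uapi rtc header; the SKETCH_ macros and the helper are made up for illustration.

```c
#include <stdint.h>
#include <linux/ioctl.h>   /* _IOR(), _IOC_SIZE() */

/* Same type and number, but different argument types. */
#define SKETCH_RTC_IRQP_READ   _IOR('p', 0x0b, unsigned long)
#define SKETCH_RTC_IRQP_READ32 _IOR('p', 0x0b, uint32_t)

/* Extract the argument size that is encoded in a command number. */
unsigned int sketch_cmd_arg_size(unsigned long cmd)
{
	return _IOC_SIZE(cmd);
}
```

On a 64-bit kernel the two macros expand to different numbers, so a command sent by a 32-bit task never matches the native case label and needs its own handling in the compat path.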

ceph: fix compat_ioctl for ceph_dir_operations · 18bd6caa

Committed by Arnd Bergmann on Sep 11, 2018

The ceph_ioctl function is used both for files and directories, but only
the files support doing that in 32-bit compat mode.

On the s390 architecture, there is also a problem with invalid 31-bit
pointers that need to be passed through compat_ptr().

Use the new compat_ptr_ioctl() to address both issues.

Note: When backporting this patch to stable kernels, "compat_ioctl:
add compat_ptr_ioctl()" is needed as well.
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

18bd6caa

compat_sys_ioctl(): make parallel to do_vfs_ioctl() · 37ecf8b2

Committed by Al Viro on Apr 21, 2019

Handle ioctls that might be handled without reaching ->ioctl() in
native case on the top level there.  The counterpart of vfs_ioctl()
(i.e. calling ->unlocked_ioctl(), etc.) is left as-is; eventually
that would turn simply into the call of ->compat_ioctl(), but
that'll take more work.  Once that is done, we can move the
remains of compat_sys_ioctl() into fs/ioctl.c and finally bury
fs/compat_ioctl.c.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

37ecf8b2

compat: move FS_IOC_RESVSP_32 handling to fs/ioctl.c · 011da44b

Committed by Al Viro on Apr 21, 2019

... and lose the ridiculous games with compat_alloc_user_space()
there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

011da44b

do_vfs_ioctl(): use saner types · 34d3d0e6

Committed by Al Viro on Apr 21, 2019

casting to pointer to int, only to pass that to function that
takes pointer to void and uses it as pointer to structure is
really asking for trouble.

"Some pointer, I'm not sure what to" is spelled "void *",
not "int *"; use that.

And declare the functions we are passing that pointer to
as taking the pointer to what they really want to access.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

34d3d0e6

compat: itanic doesn't have one · bf0a199b

Committed by Al Viro on Apr 21, 2019

... and hadn't for a long time.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

bf0a199b

FIGETBSZ: fix compat · ee26025f

Committed by Al Viro on Apr 21, 2019

it takes a pointer argument, regular file or no regular file
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

ee26025f

fix compat handling of FICLONERANGE, FIDEDUPERANGE and FS_IOC_FIEMAP · 6b2daec1

Committed by Al Viro on Apr 21, 2019

Unlike FICLONE, all of those take a pointer argument; they do need
compat_ptr() applied to arg.

Fixes: d79bdd52 ("vfs: wire up compat ioctl for CLONE/CLONE_RANGE")
Fixes: 54dbc151 ("vfs: hoist the btrfs deduplication ioctl to the vfs")
Fixes: ceac204e ("fs: make fiemap work from compat_ioctl")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

6b2daec1

compat_ioctl: add compat_ptr_ioctl() · 2952db0f

Committed by Arnd Bergmann on Sep 11, 2018

Many drivers have ioctl() handlers that are completely compatible between
32-bit and 64-bit architectures, except for the argument that is passed
down from user space and may have to be passed through compat_ptr()
in order to become a valid 64-bit pointer.

Using ".compat_ioctl = compat_ptr_ioctl" in file operations should let
us simplify a lot of those drivers to avoid #ifdef checks, and convert
additional drivers that don't have proper compat handling yet.

On most architectures, the compat_ptr_ioctl() just passes all arguments
to the corresponding ->ioctl handler. The exception is arch/s390, where
compat_ptr() clears the top bit of a 32-bit pointer value, so user space
pointers to the second 2GB alias the first 2GB, as is the case for native
32-bit s390 user space.

The compat_ptr_ioctl() function must therefore be used only with
ioctl functions that either ignore the argument or pass a pointer to a
compatible data type.

If any ioctl command handled by fops->unlocked_ioctl passes a plain
integer instead of a pointer, or any of the passed data types is
incompatible between 32-bit and 64-bit architectures, a proper handler
is required instead of compat_ptr_ioctl.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
v3: add a better description
v2: use compat_ptr_ioctl instead of generic_compat_ioctl_ptrarg,
as suggested by Al Viro

2952db0f
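
The behaviour described in this commit can be modelled in userspace. The sketch below is an assumption-laden illustration, not the kernel implementation: every `sketch_` name is hypothetical, the handler type is simplified, and -1 stands in for -ENOIOCTLCMD.

```c
#include <stdint.h>

/* Simplified stand-in for a driver's native ioctl handler. */
typedef long (*sketch_ioctl_fn)(unsigned int cmd, uint64_t arg);

/* Models compat_ptr() on s390: only the low 31 bits form the address. */
uint64_t sketch_compat_ptr_s390(uint32_t uptr)
{
	return (uint64_t)(uptr & 0x7fffffffu);
}

/* Models the generic helper: convert the argument, then call the native handler. */
long sketch_compat_ptr_ioctl(sketch_ioctl_fn native, unsigned int cmd, uint32_t arg)
{
	if (!native)
		return -1; /* the kernel would return -ENOIOCTLCMD here */
	return native(cmd, sketch_compat_ptr_s390(arg));
}

/* Demo handler that simply reports the argument it received. */
long sketch_echo_handler(unsigned int cmd, uint64_t arg)
{
	(void)cmd;
	return (long)arg;
}
```

The model also shows why plain integer arguments are unsafe with this helper: an integer value with the top bit set would be silently changed before the handler sees it.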

07 Oct, 2019 · 1 commit

elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings · b212921b

Committed by Linus Torvalds on Oct 06, 2019

In commit 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") we
changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the
executable mappings.

Then, people reported that it broke some binaries that had overlapping
segments from the same file, and commit ad55eac7 ("elf: enforce
MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some
overlaying elf segment cases.  But only some - despite the summary line
of that commit, it only did it when it also does a temporary brk vma for
one obvious overlapping case.

Now Russell King reports another overlapping case with old 32-bit x86
binaries, which doesn't trigger that limited case.  End result: we had
better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED.

Yes, it's a sign of old binaries generated with old tool-chains, but we
do pride ourselves on not breaking existing setups.

This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp()
and the old load_elf_library() use-cases, because nobody has reported
breakage for those. Yet.

Note that in all the cases seen so far, the overlapping elf sections
seem to be just re-mapping of the same executable with different section
attributes.  We could possibly introduce a new MAP_FIXED_NOFILECHANGE
flag or similar, which acts like NOREPLACE, but allows just remapping
the same executable file using different protection flags.

It's not clear that would make a huge difference to anything, but if
people really hate that "elf remaps over previous maps" behavior, maybe
at least a more limited form of remapping would alleviate some concerns.

Alternatively, we should take a look at our elf_map() logic to see if we
end up not mapping things properly the first time.

In the meantime, this is the minimal "don't do that then" patch while
people hopefully think about it more.
Reported-by: Russell King <linux@armlinux.org.uk>
Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map")
Fixes: ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments")
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b212921b
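
The difference between the two flags discussed in this commit can be observed from user space with plain mmap(). This is a small sketch under stated assumptions: the helper name is hypothetical, anonymous mappings stand in for ELF segments, and the fallback value of MAP_FIXED_NOREPLACE is the one used by most architectures.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000 /* assumption: value on most architectures */
#endif

/*
 * Create one mapping, then request a second mapping at the same address with
 * extra_flag (MAP_FIXED or MAP_FIXED_NOREPLACE).
 * Returns 1 if the second mapping replaced the first, 0 if it did not,
 * and -1 if the first mapping could not be created.
 */
int sketch_remap_over_existing(int extra_flag)
{
	size_t len = 4096;
	void *first = mmap(NULL, len, PROT_READ,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	void *second;
	int replaced;

	if (first == MAP_FAILED)
		return -1;

	second = mmap(first, len, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS | extra_flag, -1, 0);
	replaced = (second == first);

	/* Clean up a mapping that landed somewhere else (older kernels). */
	if (second != MAP_FAILED && second != first)
		munmap(second, len);
	munmap(first, len);
	return replaced;
}
```

With MAP_FIXED the overlapping request succeeds and replaces the old mapping, which is what overlapping ELF segments rely on; with MAP_FIXED_NOREPLACE the request is refused on kernels that support the flag.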

06 Oct, 2019 · 2 commits

Make filldir[64]() verify the directory entry filename is valid · 8a23eb80

Committed by Linus Torvalds on Oct 05, 2019

This has been discussed several times, and now filesystem people are
talking about doing it individually at the filesystem layer, so head
that off at the pass and just do it in getdents{64}().

This is partially based on a patch by Jann Horn, but checks for NUL
bytes as well, and somewhat simplified.

There's also commentary about how it might be better if invalid names
due to filesystem corruption don't cause an immediate failure, but only
an error at the end of the readdir(), so that people can still see the
filenames that are ok.

There's also been discussion about just how much POSIX strictly speaking
requires this since it's about filesystem corruption.  It's really more
"protect user space from bad behavior" as pointed out by Jann.  But
since Eric Biederman looked up the POSIX wording, here it is for context:

 "From readdir:

   The readdir() function shall return a pointer to a structure
   representing the directory entry at the current position in the
   directory stream specified by the argument dirp, and position the
   directory stream at the next entry. It shall return a null pointer
   upon reaching the end of the directory stream. The structure dirent
   defined in the <dirent.h> header describes a directory entry.

  From definitions:

   3.129 Directory Entry (or Link)

   An object that associates a filename with a file. Several directory
   entries can associate names with the same file.

  ...

   3.169 Filename

   A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
   characters composing the name may be selected from the set of all
   character values excluding the slash character and the null byte. The
   filenames dot and dot-dot have special meaning. A filename is
   sometimes referred to as a 'pathname component'."

Note that I didn't bother adding the checks to any legacy interfaces
that nobody uses.

Also note that if this ends up being noticeable as a performance
regression, we can fix that to do a much more optimized model that
checks for both NUL and '/' at the same time one word at a time.

We haven't really tended to optimize 'memchr()', and it only checks for
one pattern at a time anyway, and we really _should_ check for NUL too
(but see the comment about "soft errors" in the code about why it
currently only checks for '/')

See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
lookup code looks for pathname terminating characters in parallel.
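
A word-at-a-time check of the kind described above could look roughly like this. It is a userspace sketch using the well-known "has zero byte" bit trick, not the hash_name() code; the `sketch_`/`SKETCH_` names are hypothetical and the word size is fixed at 64 bits for simplicity.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_ONES  0x0101010101010101ULL
#define SKETCH_HIGHS 0x8080808080808080ULL

/* Nonzero if and only if at least one byte of v is zero. */
uint64_t sketch_has_zero_byte(uint64_t v)
{
	return (v - SKETCH_ONES) & ~v & SKETCH_HIGHS;
}

/* Returns 1 if the 8 bytes starting at p contain a NUL or a '/'. */
int sketch_word_has_nul_or_slash(const char *p)
{
	uint64_t w;

	memcpy(&w, p, sizeof(w));
	/* XOR with a word of '/' bytes turns every '/' into a zero byte. */
	return (sketch_has_zero_byte(w) |
		sketch_has_zero_byte(w ^ (SKETCH_ONES * '/'))) != 0;
}
```

A full implementation would also have to deal with names whose length is not a multiple of the word size.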

Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8a23eb80
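
The check this commit describes is small. A minimal userspace sketch, assuming the check rejects empty names and names containing '/', and reports the problem as -EIO; the function name is made up and the real kernel helper may differ in detail:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 0 if name[0..len) is acceptable as a directory entry name,
 * -EIO otherwise. NUL bytes are deliberately not rejected here, matching
 * the "soft errors" remark in the commit message.
 */
int sketch_verify_dirent_name(const char *name, size_t len)
{
	if (len == 0)
		return -EIO;
	if (memchr(name, '/', len))
		return -EIO;
	return 0;
}
```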

Convert filldir[64]() from __put_user() to unsafe_put_user() · 9f79b78e

Committed by Linus Torvalds on May 21, 2016

We really should avoid the "__{get,put}_user()" functions entirely,
because they can easily be mis-used and the original intent of being
used for simple direct user accesses no longer holds in a post-SMAP/PAN
world.

Manually optimizing away the user access range check makes no sense any
more, when the range check is generally much cheaper than the "enable
user accesses" code that the __{get,put}_user() functions still need.

So instead of __put_user(), use the unsafe_put_user() interface with
user_access_{begin,end}() that really does generate better code these
days, and which is generally a nicer interface. Under some loads, the
multiple user writes that filldir() does are actually quite noticeable.

This also makes the dirent name copy use unsafe_put_user() with a couple
of macros. We do not want to make function calls with SMAP/PAN
disabled, and the code this generates is quite good when the
architecture uses "asm goto" for unsafe_put_user() like x86 does.

Note that this doesn't bother with the legacy cases. Nobody should use
them anyway, so performance doesn't really matter there.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9f79b78e

04 Oct, 2019 · 1 commit

vfs: Fix EOVERFLOW testing in put_compat_statfs64 · cc3a7bfe

Committed by Eric Sandeen on Oct 02, 2019

Today, put_compat_statfs64() disallows nearly any field value over
2^32 if f_bsize is only 32 bits, but that makes no sense.
compat_statfs64 is there for the explicit purpose of providing 64-bit
fields for f_files, f_ffree, etc.  And f_bsize is always only 32 bits.

As a result, 32-bit userspace gets -EOVERFLOW for e.g. large file
counts even with -D_FILE_OFFSET_BITS=64 set.

In reality, only f_bsize and f_frsize can legitimately overflow
(fields like f_type and f_namelen should never be large), so test
only those fields.

This bug was discussed at length some time ago, and this is the proposal
Al suggested at https://lkml.org/lkml/2018/8/6/640.  It seemed to get
dropped amid the discussion of other related changes, but this
part seems obviously correct on its own, so I've picked it up and
sent it, for expediency.

Fixes: 64d2ab32 ("vfs: fix put_compat_statfs64() does not handle errors")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cc3a7bfe
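
The narrowed test this commit argues for can be sketched as follows. This is a userspace illustration with a hypothetical function name and scalar parameters instead of the kernel's statfs structures; it assumes that only f_bsize and f_frsize are stored in 32-bit fields of the compat structure.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Returns -EOVERFLOW only when one of the fields that is 32 bits wide in
 * the compat structure does not fit; f_files is 64 bits wide there and is
 * therefore not tested.
 */
int sketch_statfs64_check(uint64_t f_bsize, uint64_t f_frsize, uint64_t f_files)
{
	(void)f_files; /* large file counts are fine */

	if ((f_bsize | f_frsize) & 0xffffffff00000000ULL)
		return -EOVERFLOW;
	return 0;
}
```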

01 Oct, 2019 · 4 commits

io_uring: use __kernel_timespec in timeout ABI · bdf20073

Committed by Arnd Bergmann on Oct 01, 2019

All system calls use struct __kernel_timespec instead of the old struct
timespec, but this one was just added with the old-style ABI. Change it
now to enforce the use of __kernel_timespec, avoiding ABI confusion and
the need for compat handlers on 32-bit architectures.

Any user space caller will have to use __kernel_timespec now, but this
is unambiguous and works for any C library regardless of the time_t
definition. A nicer way to specify the timeout would have been a less
ambiguous 64-bit nanosecond value, but I suppose it's too late now to
change that as this would impact both 32-bit and 64-bit users.

Fixes: 5262f567 ("io_uring: IORING_OP_TIMEOUT support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bdf20073
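
The ABI point made in this commit is that a timespec with two 64-bit members has the same layout for 32-bit and 64-bit callers. A small sketch, assuming a structure shaped like the uapi __kernel_timespec (a 64-bit seconds field followed by a long long nanoseconds field); the `sketch_` names are hypothetical:

```c
#include <stdint.h>

/* Same size on 32-bit and 64-bit ABIs, unlike a struct built from time_t and long. */
struct sketch_kernel_timespec {
	int64_t   tv_sec;
	long long tv_nsec;
};

_Static_assert(sizeof(struct sketch_kernel_timespec) == 16,
	       "layout must not depend on the word size");

/* Convert a plain nanosecond count into the two-field form. */
struct sketch_kernel_timespec sketch_ts_from_ns(uint64_t ns)
{
	struct sketch_kernel_timespec ts = {
		.tv_sec  = (int64_t)(ns / 1000000000ULL),
		.tv_nsec = (long long)(ns % 1000000000ULL),
	};

	return ts;
}
```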

erofs: fix mis-inplace determination related with noio chain · dc76ea8c

Committed by Gao Xiang on Sep 22, 2019

Fix a recent cleanup patch. The noio (bypass) chain is
handled asynchronously with respect to the submit chain, so
in-place I/O and pagevec cannot be applied to such pages.
Add a detailed comment about this as well.

Fixes: 97e86a85 ("staging: erofs: tidy up decompression frontend")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20190922100434.229340-1-gaoxiang25@huawei.com
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>

dc76ea8c

erofs: fix erofs_get_meta_page locking due to a cleanup · 55252ab7

Committed by Gao Xiang on Sep 22, 2019

While running more drop_caches stress tests on
our products, I found a mistake introduced by
a very recent cleanup [1].

The current rule is that erofs_get_meta_page()
must return with the page locked (although the
lock is mostly unnecessary on a read-only fs once
pages are PG_uptodate), so fix the cleanup to
honor that rule.

[1] https://lore.kernel.org/r/20190904020912.63925-26-gaoxiang25@huawei.com
Fixes: 618f40ea ("erofs: use read_cache_page_gfp for erofs_get_meta_page")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20190921184355.149928-1-gaoxiang25@huawei.com
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>

55252ab7

erofs: fix return value check in erofs_read_superblock() · 517d6b9c

Committed by Wei Yongjun on Sep 18, 2019

In case of error, the function read_mapping_page() returns
ERR_PTR(), not NULL. The NULL test in the return value check
should be replaced with IS_ERR().

Fixes: fe7c2423 ("erofs: use read_mapping_page instead of sb_bread")
Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20190918083033.47780-1-weiyongjun1@huawei.com
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>

517d6b9c
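
Why a NULL test misses this kind of failure can be shown with a userspace model of the error-pointer convention. The helpers below only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(); the `sketch_` names and the 4095 limit are assumptions for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer near the top of the address space. */
void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr falls in the range reserved for encoded errors. */
int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an encoded error pointer. */
long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

An encoded error is never NULL, so a check of the form "if (!page)" lets the error pointer through to code that then dereferences it.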

30 Sep, 2019 · 1 commit

Revert "Revert "ext4: make __ext4_get_inode_loc plug"" · 02f03c42

Committed by Linus Torvalds on Sep 29, 2019

This reverts commit 72dbcf72.

Instead of waiting forever for entropy that may just not happen, we now
try to actively generate entropy when required, and are thus hopefully
avoiding the problem that caused the nice ext4 IO pattern fix to be
reverted.

So revert the revert.

Cc: Ahmed S. Darwish <darwish.07@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Alexander E. Patrakov <patrakov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

02f03c42

27 Sep, 2019 · 4 commits

btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls · d4e20494

Committed by Qu Wenruo on Sep 16, 2019

[BUG]
The following script can cause btrfs qgroup data space leak:

  mkfs.btrfs -f $dev
  mount $dev -o nospace_cache $mnt

  btrfs subv create $mnt/subv
  btrfs quota en $mnt
  btrfs quota rescan -w $mnt
  btrfs qgroup limit 128m $mnt/subv

  for (( i = 0; i < 3; i++)); do
          # Create 3 64M holes for latter fallocate to fail
          truncate -s 192m $mnt/subv/file
          xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
          xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
          sync

          # it's supposed to fail, and each failure will leak at least 64M
          # data space
          xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
          rm $mnt/subv/file
          sync
  done

  # Shouldn't fail after we removed the file
  xfs_io -f -c "falloc 0 64m" $mnt/subv/file

[CAUSE]
Btrfs qgroup data reserve code allows multiple reservations to happen on
a single extent_changeset:
E.g.:
	btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_1M);
	btrfs_qgroup_reserve_data(inode, &data_reserved, SZ_1M, SZ_2M);
	btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_4M);

Btrfs qgroup code has its internal tracking to make sure we don't
double-reserve in above example.

The only pattern utilizing this feature is in the main while loop of
btrfs_fallocate() function.

However btrfs_qgroup_reserve_data()'s error handling has a bug in that
on error it clears all ranges in the io_tree with EXTENT_QGROUP_RESERVED
flag but doesn't free previously reserved bytes.

This bug has a twofold effect:
- Clearing EXTENT_QGROUP_RESERVED ranges
  This is the correct behavior, but it prevents
  btrfs_qgroup_check_reserved_leak() from catching the leakage as the
  detector is purely EXTENT_QGROUP_RESERVED flag based.

- Leak the previously reserved data bytes.

The bug manifests when N calls to btrfs_qgroup_reserve_data are made and
the last one fails, leaking space reserved in the previous ones.

[FIX]
Also free previously reserved data bytes when btrfs_qgroup_reserve_data
fails.

Fixes: 52472553 ("btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d4e20494
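
The leak pattern from this commit (N reservation calls, the last one fails) can be illustrated with a toy model. Everything here is hypothetical: the structure, the function names and the accounting are a deliberately simplified stand-in for the qgroup code, meant only to show the difference between the old and the fixed error path.

```c
#include <stdint.h>

#define SKETCH_MiB (1024ULL * 1024ULL)

struct sketch_qgroup {
	uint64_t limit;    /* quota limit in bytes */
	uint64_t reserved; /* bytes currently reserved */
};

/* Reserve len bytes; returns 0 on success, -1 if the limit would be exceeded. */
static int sketch_reserve(struct sketch_qgroup *qg, uint64_t len)
{
	if (qg->reserved + len > qg->limit)
		return -1;
	qg->reserved += len;
	return 0;
}

/*
 * Make `calls` reservations of `len` bytes against one changeset.
 * When a call fails, the tracking is dropped in both variants; only the
 * fixed variant also gives back the bytes reserved by the earlier calls.
 * Returns the number of bytes still reserved afterwards, i.e. the leak.
 */
uint64_t sketch_leak_after_failure(int calls, uint64_t len, uint64_t limit, int fixed)
{
	struct sketch_qgroup qg = { .limit = limit, .reserved = 0 };
	uint64_t tracked = 0; /* bytes recorded in the changeset */

	for (int i = 0; i < calls; i++) {
		if (sketch_reserve(&qg, len) < 0) {
			if (fixed)
				qg.reserved -= tracked;
			tracked = 0; /* ranges are cleared in both variants */
			break;
		}
		tracked += len;
	}

	/* The caller releases whatever is still tracked. */
	qg.reserved -= tracked;
	return qg.reserved;
}
```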

btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space · bab32fc0

Committed by Qu Wenruo on Sep 16, 2019

[BUG]
Under the following case with qgroup enabled, if some error happened
after we have reserved delalloc space, then in error handling path, we
could cause qgroup data space leakage:

From btrfs_truncate_block() in inode.c:

	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
					   block_start, blocksize);
	if (ret)
		goto out;

 again:
	page = find_or_create_page(mapping, index, mask);
	if (!page) {
		btrfs_delalloc_release_space(inode, data_reserved,
					     block_start, blocksize, true);
		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
		ret = -ENOMEM;
		goto out;
	}

[CAUSE]
In the above case, btrfs_delalloc_reserve_space() will call
btrfs_qgroup_reserve_data() and mark the io_tree range with
EXTENT_QGROUP_RESERVED flag.

In the error handling path, we have the following call stack:
btrfs_delalloc_release_space()
|- btrfs_free_reserved_data_space()
   |- btrfs_qgroup_free_data()
      |- __btrfs_qgroup_release_data(reserved=@reserved, free=1)
         |- qgroup_free_reserved_data(reserved=@reserved)
            |- clear_record_extent_bits();
            |- freed += changeset.bytes_changed;

However, due to a completion bug, qgroup_free_reserved_data() will clear
the EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, rather
than in the correct BTRFS_I(inode)->io_tree.
Since io_failure_tree is never marked with that flag,
btrfs_qgroup_free_data() will not free any data reserved space at all,
causing a leakage.

This type of error handling can only be triggered by errors outside of
qgroup code. So EDQUOT error from qgroup can't trigger it.

[FIX]
Fix the wrong target io_tree.
Reported-by: Josef Bacik <josef@toxicpanda.com>
Fixes: bc42bda2 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

bab32fc0

CIFS: Fix oplock handling for SMB 2.1+ protocols · a016e279

Committed by Pavel Shilovsky on Sep 26, 2019

There may be situations when a server negotiates SMB 2.1
protocol version or higher but responds to a CREATE request
with an oplock rather than a lease.

Currently the client doesn't handle such a case correctly:
when another CREATE comes in the server sends an oplock
break to the initial CREATE and the client doesn't send
an ack back due to a wrong caching level being set (READ
instead of RWH). Missing an oplock break ack makes the
server wait until the break times out which dramatically
increases the latency of the second CREATE.

Fix this by properly detecting oplocks when using SMB 2.1
protocol version and higher.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>

a016e279

smb3: missing ACL related flags · ff3ee62a

Committed by Steve French on Sep 26, 2019

Various SMB3 ACL related flags (for security descriptor and
ACEs for example) were missing and some fields are different
in SMB3 and CIFS. Update cifsacl.h definitions based on
current MS-DTYP specification.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>

ff3ee62a
