- 23 10月, 2019 27 次提交
-
-
由 Arnd Bergmann 提交于
Out of the four ioctl commands supported on gfs2, only FITRIM works in compat mode. Add a proper handler based on the ext4 implementation. Fixes: 6ddc5c3d ("gfs2: getlabel support") Reviewed-by: NBob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
The last users are all gone, so let's remove the macro as well. Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
Commit aa98aa31 ("md: move compat_ioctl handling into md.c") already removed the COMPATIBLE_IOCTL() table entries and added a complete implementation, but a few lines got left behind and should also be removed here. Cc: linux-raid@vger.kernel.org Cc: Song Liu <song@kernel.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
The /dev/rawX implementation already handles these just fine, so the entries in the table are not needed any more. Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
The /proc/pci/ implementation already handles these just fine, so the entries in the table are not needed any more. Cc: linux-pci@vger.kernel.org Cc: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
The joystick driver already handles these just fine, so the entries in the table are not needed any more. Cc: linux-input@vger.kernel.org Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
These are all handled by the random driver, so instead of listing each ioctl, we can use the generic compat_ptr_ioctl() helper. Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
Since commit 07d106d0 ("vfs: fix up ENOIOCTLCMD error handling"), we don't warn about unhandled compat-ioctl command code any more, but just return the same error that a native file descriptor returns when there is no handler. This means the IGNORE_IOCTL() annotations are completely useless and can all be removed. TIOCSTART/TIOCSTOP and KDGHWCLK/KDSHWCLK fall into the same category, but for some reason were listed as COMPATIBLE_IOCTL(). Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
The SNDCTL_* and SOUND_* commands are the old OSS user interface. I checked all the sound ioctl commands listed in fs/compat_ioctl.c to see if we still need the translation handlers. Here is what I found: - sound/oss/ is (almost) gone from the kernel, this is what actually needed all the translations - The ALSA emulation for OSS correctly handles all compat_ioctl commands already. - sound/oss/dmasound/ is the last holdout of the original OSS code, this is only used on arch/m68k, which has no 64-bit mode and hence needs no compat handlers - arch/um/drivers/hostaudio_kern.c may run in 64-bit mode with 32-bit x86 user space underneath it. This rare corner case is the only one that still needs the compat handlers. By adding a simple redirect of .compat_ioctl to .unlocked_ioctl in the UML driver, we can remove all the COMPATIBLE_IOCTL() annotations without a change in functionality. For completeness, I'm adding the same thing to the dmasound file, knowing that it makes no difference. The compat_ioctl list contains one comment about SNDCTL_DSP_MAPINBUF and SNDCTL_DSP_MAPOUTBUF, which actually would need a translation handler if implemented. However, the native implementation just returns -EINVAL, so we don't care. Reviewed-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
The two drivers implementing these both gained proper compat_ioctl() handlers a long time ago with commits bb6c8d8f ("HID: hiddev: Add 32bit ioctl compatibilty") and ae5e49c7 ("HID: hidraw: add compatibility ioctl() for 32-bit applications."), so the lists in fs/compat_ioctl.c are no longer used. It appears that the lists were also incomplete, so the translation didn't actually work correctly when it was still in use. Remove them as cleanup. Cc: linux-bluetooth@vger.kernel.org Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
As of commit f0193d3e ("change semantics of ldisc ->compat_ioctl()"), all hciuart ioctl commands are handled correctly in the driver, and we never need to go through the table here. Cc: linux-bluetooth@vger.kernel.org Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
All these ioctl commands are compatible, so we can handle them with a trivial wrapper in hci_sock.c and remove the listing in fs/compat_ioctl.c. A few of the commands pass integer arguments instead of pointers, so for correctness skip the compat_ptr() conversion here. Acked-by: NMarcel Holtmann <marcel@holtmann.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
All these ioctl commands are compatible, so we can handle them with a trivial wrapper in rfcomm/sock.c and remove the listing in fs/compat_ioctl.c. Acked-by: NMarcel Holtmann <marcel@holtmann.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
Neither the old isdn4linux interface nor the newer mISDN stack ever had working 32-bit compat mode as far as I can tell. However, the CAPI stack has some ioctl commands that are correctly listed in fs/compat_ioctl.c. We can trivially move all of those into the corresponding file that implement the native handlers by adding a compat_ioctl redirect to that. I did notice that treating CAPI_MANUFACTURER_CMD() as compatible is broken, so I'm also adding a handler for that, realizing that in all likelyhood, nobody is ever going to call it. Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Cc: isdn4linux@listserv.isdn4linux.de Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
These are two obscure ioctl commands, in a driver that only has compatible commands, so just let the driver handle this itself. Acked-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
MTIOCPOS and MTIOCGET are incompatible between 32-bit and 64-bit user space, and traditionally have been translated in fs/compat_ioctl.c. To get rid of that translation handler, move a corresponding implementation into each of the four drivers implementing those commands. The interesting part of that is now in a new linux/mtio.h header that wraps the existing uapi/linux/mtio.h header and provides an abstraction to let drivers handle both cases easily. Using an in_compat_syscall() check, the caller does not have to keep track of whether this was called through .unlocked_ioctl() or .compat_ioctl(). Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Cc: "Kai Mäkisara" <Kai.Makisara@kolumbus.fi> Cc: linux-scsi@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
The .ioctl and .compat_ioctl file operations have the same prototype so they can both point to the same function, which works great almost all the time when all the commands are compatible. One exception is the s390 architecture, where a compat pointer is only 31 bit wide, and converting it into a 64-bit pointer requires calling compat_ptr(). Most drivers here will never run in s390, but since we now have a generic helper for it, it's easy enough to use it consistently. I double-checked all these drivers to ensure that all ioctl arguments are used as pointers or are ignored, but are not interpreted as integer values. Acked-by: NJason Gunthorpe <jgg@mellanox.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NDavid Sterba <dsterba@suse.com> Acked-by: NDarren Hart (VMware) <dvhart@infradead.org> Acked-by: NJonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: NBjorn Andersson <bjorn.andersson@linaro.org> Acked-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
Each of these drivers has a copy of the same trivial helper function to convert the pointer argument and then call the native ioctl handler. We now have a generic implementation of that, so use it. Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJiri Kosina <jkosina@suse.cz> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
We no longer need the rtc compat handling to be in common code, now that all drivers are either moved to the rtc-class framework, or (rarely) exist in drivers/char for architectures without compat mode (m68k, alpha and ia64, respectively). I checked the list of ioctl commands in drivers, and the ones that are not already handled are all compatible, again with the one exception of m68k driver, which implements RTC_PLL_GET and RTC_PLL_SET, but has no compat mode. Unlike earlier versions of this patch, I'm now adding a separate compat_ioctl handler that takes care of RTC_IRQP_READ32/RTC_IRQP_SET32 and treats all other commands as compatible, leaving the native behavior unchanged. The old conversion handler also deals with RTC_EPOCH_READ and RTC_EPOCH_SET, which are not handled in rtc-dev.c but only in a single device driver (rtc-vr41xx), so I'm adding the compat version in the same place. I don't expect other drivers to need those commands in the future. Acked-by: NAlexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: NBen Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: NArnd Bergmann <arnd@arndb.de> --- v4: handle RTC_EPOCH_SET32 in rtc_dev_compat_ioctl v3: handle RTC_IRQP_READ32/RTC_IRQP_SET32 in rtc_dev_compat_ioctl v2: merge compat handler into ioctl function to avoid the compat_alloc_user_space() roundtrip, based on feedback from Al Viro.
-
由 Arnd Bergmann 提交于
The ceph_ioctl function is used both for files and directories, but only the files support doing that in 32-bit compat mode. On the s390 architecture, there is also a problem with invalid 31-bit pointers that need to be passed through compat_ptr(). Use the new compat_ptr_ioctl() to address both issues. Note: When backporting this patch to stable kernels, "compat_ioctl: add compat_ptr_ioctl()" is needed as well. Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Al Viro 提交于
Handle ioctls that might be handled without reaching ->ioctl() in native case on the top level there. The counterpart of vfs_ioctl() (i.e. calling ->unlock_ioctl(), etc.) left as-is; eventually that would turn simply into the call of ->compat_ioctl(), but that'll take more work. Once that is done, we can move the remains of compat_sys_ioctl() into fs/ioctl.c and finally bury fs/compat_ioctl.c. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Al Viro 提交于
... and lose the ridiculous games with compat_alloc_user_space() there. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Al Viro 提交于
casting to pointer to int, only to pass that to function that takes pointer to void and uses it as pointer to structure is really asking for trouble. "Some pointer, I'm not sure what to" is spelled "void *", not "int *"; use that. And declare the functions we are passing that pointer to as taking the pointer to what they really want to access. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Al Viro 提交于
... and hadn't for a long time. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Al Viro 提交于
it takes a pointer argument, regular file or no regular file Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Al Viro 提交于
Unlike FICLONE, all of those take a pointer argument; they do need compat_ptr() applied to arg. Fixes: d79bdd52 ("vfs: wire up compat ioctl for CLONE/CLONE_RANGE") Fixes: 54dbc151 ("vfs: hoist the btrfs deduplication ioctl to the vfs") Fixes: ceac204e ("fs: make fiemap work from compat_ioctl") Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
由 Arnd Bergmann 提交于
Many drivers have ioctl() handlers that are completely compatible between 32-bit and 64-bit architectures, except for the argument that is passed down from user space and may have to be passed through compat_ptr() in order to become a valid 64-bit pointer. Using ".compat_ptr = compat_ptr_ioctl" in file operations should let us simplify a lot of those drivers to avoid #ifdef checks, and convert additional drivers that don't have proper compat handling yet. On most architectures, the compat_ptr_ioctl() just passes all arguments to the corresponding ->ioctl handler. The exception is arch/s390, where compat_ptr() clears the top bit of a 32-bit pointer value, so user space pointers to the second 2GB alias the first 2GB, as is the case for native 32-bit s390 user space. The compat_ptr_ioctl() function must therefore be used only with ioctl functions that either ignore the argument or pass a pointer to a compatible data type. If any ioctl command handled by fops->unlocked_ioctl passes a plain integer instead of a pointer, or any of the passed data types is incompatible between 32-bit and 64-bit architectures, a proper handler is required instead of compat_ptr_ioctl. Signed-off-by: NArnd Bergmann <arnd@arndb.de> --- v3: add a better description v2: use compat_ptr_ioctl instead of generic_compat_ioctl_ptrarg, as suggested by Al Viro
-
- 07 10月, 2019 1 次提交
-
-
由 Linus Torvalds 提交于
In commit 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") we changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the executable mappings. Then, people reported that it broke some binaries that had overlapping segments from the same file, and commit ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some overlaying elf segment cases. But only some - despite the summary line of that commit, it only did it when it also does a temporary brk vma for one obvious overlapping case. Now Russell King reports another overlapping case with old 32-bit x86 binaries, which doesn't trigger that limited case. End result: we had better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED. Yes, it's a sign of old binaries generated with old tool-chains, but we do pride ourselves on not breaking existing setups. This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp() and the old load_elf_library() use-cases, because nobody has reported breakage for those. Yet. Note that in all the cases seen so far, the overlapping elf sections seem to be just re-mapping of the same executable with different section attributes. We could possibly introduce a new MAP_FIXED_NOFILECHANGE flag or similar, which acts like NOREPLACE, but allows just remapping the same executable file using different protection flags. It's not clear that would make a huge difference to anything, but if people really hate that "elf remaps over previous maps" behavior, maybe at least a more limited form of remapping would alleviate some concerns. Alternatively, we should take a look at our elf_map() logic to see if we end up not mapping things properly the first time. In the meantime, this is the minimal "don't do that then" patch while people hopefully think about it more. Reported-by: NRussell King <linux@armlinux.org.uk> Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") Fixes: ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments") Cc: Michal Hocko <mhocko@suse.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 10月, 2019 2 次提交
-
-
由 Linus Torvalds 提交于
This has been discussed several times, and now filesystem people are talking about doing it individually at the filesystem layer, so head that off at the pass and just do it in getdents{64}(). This is partially based on a patch by Jann Horn, but checks for NUL bytes as well, and somewhat simplified. There's also commentary about how it might be better if invalid names due to filesystem corruption don't cause an immediate failure, but only an error at the end of the readdir(), so that people can still see the filenames that are ok. There's also been discussion about just how much POSIX strictly speaking requires this since it's about filesystem corruption. It's really more "protect user space from bad behavior" as pointed out by Jann. But since Eric Biederman looked up the POSIX wording, here it is for context: "From readdir: The readdir() function shall return a pointer to a structure representing the directory entry at the current position in the directory stream specified by the argument dirp, and position the directory stream at the next entry. It shall return a null pointer upon reaching the end of the directory stream. The structure dirent defined in the <dirent.h> header describes a directory entry. From definitions: 3.129 Directory Entry (or Link) An object that associates a filename with a file. Several directory entries can associate names with the same file. ... 3.169 Filename A name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the slash character and the null byte. The filenames dot and dot-dot have special meaning. A filename is sometimes referred to as a 'pathname component'." Note that I didn't bother adding the checks to any legacy interfaces that nobody uses. Also note that if this ends up being noticeable as a performance regression, we can fix that to do a much more optimized model that checks for both NUL and '/' at the same time one word at a time. We haven't really tended to optimize 'memchr()', and it only checks for one pattern at a time anyway, and we really _should_ check for NUL too (but see the comment about "soft errors" in the code about why it currently only checks for '/') See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name lookup code looks for pathname terminating characters in parallel. Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/ Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jann Horn <jannh@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
We really should avoid the "__{get,put}_user()" functions entirely, because they can easily be mis-used and the original intent of being used for simple direct user accesses no longer holds in a post-SMAP/PAN world. Manually optimizing away the user access range check makes no sense any more, when the range check is generally much cheaper than the "enable user accesses" code that the __{get,put}_user() functions still need. So instead of __put_user(), use the unsafe_put_user() interface with user_access_{begin,end}() that really does generate better code these days, and which is generally a nicer interface. Under some loads, the multiple user writes that filldir() does are actually quite noticeable. This also makes the dirent name copy use unsafe_put_user() with a couple of macros. We do not want to make function calls with SMAP/PAN disabled, and the code this generates is quite good when the architecture uses "asm goto" for unsafe_put_user() like x86 does. Note that this doesn't bother with the legacy cases. Nobody should use them anyway, so performance doesn't really matter there. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 10月, 2019 1 次提交
-
-
由 Eric Sandeen 提交于
Today, put_compat_statfs64() disallows nearly any field value over 2^32 if f_bsize is only 32 bits, but that makes no sense. compat_statfs64 is there for the explicit purpose of providing 64-bit fields for f_files, f_ffree, etc. And f_bsize is always only 32 bits. As a result, 32-bit userspace gets -EOVERFLOW for i.e. large file counts even with -D_FILE_OFFSET_BITS=64 set. In reality, only f_bsize and f_frsize can legitimately overflow (fields like f_type and f_namelen should never be large), so test only those fields. This bug was discussed at length some time ago, and this is the proposal Al suggested at https://lkml.org/lkml/2018/8/6/640. It seemed to get dropped amid the discussion of other related changes, but this part seems obviously correct on its own, so I've picked it up and sent it, for expediency. Fixes: 64d2ab32 ("vfs: fix put_compat_statfs64() does not handle errors") Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 10月, 2019 4 次提交
-
-
由 Arnd Bergmann 提交于
All system calls use struct __kernel_timespec instead of the old struct timespec, but this one was just added with the old-style ABI. Change it now to enforce the use of __kernel_timespec, avoiding ABI confusion and the need for compat handlers on 32-bit architectures. Any user space caller will have to use __kernel_timespec now, but this is unambiguous and works for any C library regardless of the time_t definition. A nicer way to specify the timeout would have been a less ambiguous 64-bit nanosecond value, but I suppose it's too late now to change that as this would impact both 32-bit and 64-bit users. Fixes: 5262f567 ("io_uring: IORING_OP_TIMEOUT support") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Gao Xiang 提交于
Fix a recent cleanup patch. noio (bypass) chain is handled asynchronously against submit chain, therefore inplace I/O or pagevec cannot be applied to such pages. Add detailed comment for this as well. Fixes: 97e86a85 ("staging: erofs: tidy up decompression frontend") Reviewed-by: NChao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20190922100434.229340-1-gaoxiang25@huawei.comSigned-off-by: NGao Xiang <gaoxiang25@huawei.com>
-
由 Gao Xiang 提交于
After doing more drop_caches stress test on our products, I found the mistake introduced by a very recent cleanup [1]. The current rule is that "erofs_get_meta_page" should be returned with page locked (although it's mostly unnecessary for read-only fs after pages are PG_uptodate), but a fix should be done for this. [1] https://lore.kernel.org/r/20190904020912.63925-26-gaoxiang25@huawei.com Fixes: 618f40ea ("erofs: use read_cache_page_gfp for erofs_get_meta_page") Reviewed-by: NChao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20190921184355.149928-1-gaoxiang25@huawei.comSigned-off-by: NGao Xiang <gaoxiang25@huawei.com>
-
由 Wei Yongjun 提交于
In case of error, the function read_mapping_page() returns ERR_PTR() not NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: fe7c2423 ("erofs: use read_mapping_page instead of sb_bread") Reviewed-by: NGao Xiang <gaoxiang25@huawei.com> Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20190918083033.47780-1-weiyongjun1@huawei.comSigned-off-by: NGao Xiang <gaoxiang25@huawei.com>
-
- 30 9月, 2019 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit 72dbcf72. Instead of waiting forever for entropy that may just not happen, we now try to actively generate entropy when required, and are thus hopefully avoiding the problem that caused the nice ext4 IO pattern fix to be reverted. So revert the revert. Cc: Ahmed S. Darwish <darwish.07@gmail.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Willy Tarreau <w@1wt.eu> Cc: Alexander E. Patrakov <patrakov@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 9月, 2019 4 次提交
-
-
由 Qu Wenruo 提交于
[BUG] The following script can cause btrfs qgroup data space leak: mkfs.btrfs -f $dev mount $dev -o nospace_cache $mnt btrfs subv create $mnt/subv btrfs quota en $mnt btrfs quota rescan -w $mnt btrfs qgroup limit 128m $mnt/subv for (( i = 0; i < 3; i++)); do # Create 3 64M holes for latter fallocate to fail truncate -s 192m $mnt/subv/file xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null sync # it's supposed to fail, and each failure will leak at least 64M # data space xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null rm $mnt/subv/file sync done # Shouldn't fail after we removed the file xfs_io -f -c "falloc 0 64m" $mnt/subv/file [CAUSE] Btrfs qgroup data reserve code allow multiple reservations to happen on a single extent_changeset: E.g: btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_1M); btrfs_qgroup_reserve_data(inode, &data_reserved, SZ_1M, SZ_2M); btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_4M); Btrfs qgroup code has its internal tracking to make sure we don't double-reserve in above example. The only pattern utilizing this feature is in the main while loop of btrfs_fallocate() function. However btrfs_qgroup_reserve_data()'s error handling has a bug in that on error it clears all ranges in the io_tree with EXTENT_QGROUP_RESERVED flag but doesn't free previously reserved bytes. This bug has a two fold effect: - Clearing EXTENT_QGROUP_RESERVED ranges This is the correct behavior, but it prevents btrfs_qgroup_check_reserved_leak() to catch the leakage as the detector is purely EXTENT_QGROUP_RESERVED flag based. - Leak the previously reserved data bytes. The bug manifests when N calls to btrfs_qgroup_reserve_data are made and the last one fails, leaking space reserved in the previous ones. [FIX] Also free previously reserved data bytes when btrfs_qgroup_reserve_data fails. Fixes: 52472553 ("btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: NQu Wenruo <wqu@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Qu Wenruo 提交于
[BUG] Under the following case with qgroup enabled, if some error happened after we have reserved delalloc space, then in error handling path, we could cause qgroup data space leakage: From btrfs_truncate_block() in inode.c: ret = btrfs_delalloc_reserve_space(inode, &data_reserved, block_start, blocksize); if (ret) goto out; again: page = find_or_create_page(mapping, index, mask); if (!page) { btrfs_delalloc_release_space(inode, data_reserved, block_start, blocksize, true); btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true); ret = -ENOMEM; goto out; } [CAUSE] In the above case, btrfs_delalloc_reserve_space() will call btrfs_qgroup_reserve_data() and mark the io_tree range with EXTENT_QGROUP_RESERVED flag. In the error handling path, we have the following call stack: btrfs_delalloc_release_space() |- btrfs_free_reserved_data_space() |- btrsf_qgroup_free_data() |- __btrfs_qgroup_release_data(reserved=@reserved, free=1) |- qgroup_free_reserved_data(reserved=@reserved) |- clear_record_extent_bits(); |- freed += changeset.bytes_changed; However due to a completion bug, qgroup_free_reserved_data() will clear EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other than the correct BTRFS_I(inode)->io_tree. Since io_failure_tree is never marked with that flag, btrfs_qgroup_free_data() will not free any data reserved space at all, causing a leakage. This type of error handling can only be triggered by errors outside of qgroup code. So EDQUOT error from qgroup can't trigger it. [FIX] Fix the wrong target io_tree. Reported-by: NJosef Bacik <josef@toxicpanda.com> Fixes: bc42bda2 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NQu Wenruo <wqu@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Pavel Shilovsky 提交于
There may be situations when a server negotiates SMB 2.1 protocol version or higher but responds to a CREATE request with an oplock rather than a lease. Currently the client doesn't handle such a case correctly: when another CREATE comes in the server sends an oplock break to the initial CREATE and the client doesn't send an ack back due to a wrong caching level being set (READ instead of RWH). Missing an oplock break ack makes the server wait until the break times out which dramatically increases the latency of the second CREATE. Fix this by properly detecting oplocks when using SMB 2.1 protocol version and higher. Cc: <stable@vger.kernel.org> Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com> Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
-
由 Steve French 提交于
Various SMB3 ACL related flags (for security descriptor and ACEs for example) were missing and some fields are different in SMB3 and CIFS. Update cifsacl.h definitions based on current MS-DTYP specification. Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com> Reviewed-by: NAurelien Aptel <aaptel@suse.com>
-