- 07 6月, 2021 1 次提交
-
-
由 Jan Kara 提交于
Some users have pointed out that path-based syscalls are problematic in some environments and at least directory fd argument and possibly also resolve flags are desirable for such syscalls. Rather than reimplementing all details of pathname lookup and following where it may eventually evolve, let's go for full file descriptor based syscall similar to how ioctl(2) works since the beginning. Managing of quotas isn't performance sensitive so the extra overhead of open does not matter and we are able to consume O_PATH descriptors as well which makes open cheap anyway. Also for frequent operations (such as retrieving usage information for all users) we can reuse single fd and in fact get even better performance as well as avoiding races with possible remounts etc. Tested-by: NSascha Hauer <s.hauer@pengutronix.de> Acked-by: NChristian Brauner <christian.brauner@ubuntu.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 09 5月, 2021 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit cd2c7545. Alex reports that the commit causes corruption with LUKS on ext4. Revert it for now so that this can be investigated properly. Link: https://lore.kernel.org/linux-block/1620493841.bxdq8r5haw.none@localhost/Reported-by: NAlex Xu (Hello71) <alex_y_xu@yahoo.ca> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 5月, 2021 1 次提交
-
-
由 Masahiro Yamada 提交于
<linux/kconfig.h> is included from all the kernel-space source files, including C, assembly, linker scripts. It is intended to contain a minimal set of macros to evaluate CONFIG options. IF_ENABLED() is an intruder here because (x ? y : z) is C code, which should not be included from assembly files or linker scripts. Also, <linux/kconfig.h> is no longer self-contained because NULL is defined in <linux/stddef.h>. Move IF_ENABLED() out to <linux/kernel.h> as PTR_IF(). PTF_IF() takes the general boolean expression instead of a CONFIG option so that it fits better in <linux/kernel.h>. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Reviewed-by: NKees Cook <keescook@chromium.org>
-
- 07 5月, 2021 19 次提交
-
-
由 Ingo Molnar 提交于
Fix ~94 single-word typos in locking code comments, plus a few very obvious grammar mistakes. Link: https://lkml.kernel.org/r/20210322212624.GA1963421@gmail.com Link: https://lore.kernel.org/r/20210322205203.GB1959563@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Masahiro Yamada 提交于
The section "19) Editor modelines and other cruft" in Documentation/process/coding-style.rst clearly says, "Do not include any of these in source files." I recently receive a patch to explicitly add a new one. Let's do treewide cleanups, otherwise some people follow the existing code and attempt to upstream their favoriate editor setups. It is even nicer if scripts/checkpatch.pl can check it. If we like to impose coding style in an editor-independent manner, I think editorconfig (patch [1]) is a saner solution. [1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/ Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.orgSigned-off-by: NMasahiro Yamada <masahiroy@kernel.org> Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> [auxdisplay] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bhaskar Chowdhury 提交于
Few spelling fixes throughout the file. Link: https://lkml.kernel.org/r/20210318201404.6380-1-unixbhaskar@gmail.comSigned-off-by: NBhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
The last user (/dev/kmem) is gone. Let's drop it. Link: https://lkml.kernel.org/r/20210324102351.6932-4-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Minchan Kim <minchan@kernel.org> Cc: huang ying <huang.ying.caritas@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Patch series "drivers/char: remove /dev/kmem for good". Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and memory ballooning, I started questioning the existence of /dev/kmem. Comparing it with the /proc/kcore implementation, it does not seem to be able to deal with things like a) Pages unmapped from the direct mapping (e.g., to be used by secretmem) -> kern_addr_valid(). virt_addr_valid() is not sufficient. b) Special cases like gart aperture memory that is not to be touched -> mem_pfn_is_ram() Unless I am missing something, it's at least broken in some cases and might fault/crash the machine. Looks like its existence has been questioned before in 2005 and 2010 [1], after ~11 additional years, it might make sense to revive the discussion. CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by mistake?). All distributions disable it: in Ubuntu it has been disabled for more than 10 years, in Debian since 2.6.31, in Fedora at least starting with FC3, in RHEL starting with RHEL4, in SUSE starting from 15sp2, and OpenSUSE has it disabled as well. 1) /dev/kmem was popular for rootkits [2] before it got disabled basically everywhere. Ubuntu documents [3] "There is no modern user of /dev/kmem any more beyond attackers using it to load kernel rootkits.". RHEL documents in a BZ [5] "it served no practical purpose other than to serve as a potential security problem or to enable binary module drivers to access structures/functions they shouldn't be touching" 2) /proc/kcore is a decent interface to have a controlled way to read kernel memory for debugging puposes. (will need some extensions to deal with memory offlining/unplug, memory ballooning, and poisoned pages, though) 3) It might be useful for corner case debugging [1]. KDB/KGDB might be a better fit, especially, to write random memory; harder to shoot yourself into the foot. 4) "Kernel Memory Editor" [4] hasn't seen any updates since 2000 and seems to be incompatible with 64bit [1]. For educational purposes, /proc/kcore might be used to monitor value updates -- or older kernels can be used. 5) It's broken on arm64, and therefore, completely disabled there. Looks like it's essentially unused and has been replaced by better suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's just remove it. [1] https://lwn.net/Articles/147901/ [2] https://www.linuxjournal.com/article/10505 [3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled [4] https://sourceforge.net/projects/kme/ [5] https://bugzilla.redhat.com/show_bug.cgi?id=154796 Link: https://lkml.kernel.org/r/20210324102351.6932-1-david@redhat.com Link: https://lkml.kernel.org/r/20210324102351.6932-2-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NKees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Alexander A. Klimov" <grandmaster@al2klimov.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Corentin Labbe <clabbe@baylibre.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Gregory Clement <gregory.clement@bootlin.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hillf Danton <hdanton@sina.com> Cc: huang ying <huang.ying.caritas@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: James Troup <james.troup@canonical.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kairui Song <kasong@redhat.com> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Liviu Dudau <liviu.dudau@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@kernel.org> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: openrisc@lists.librecores.org Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: "Pavel Machek (CIP)" <pavel@denx.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Pierre Morel <pmorel@linux.ibm.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Robert Richter <rric@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Cc: sparclinux@vger.kernel.org Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Theodore Dubois <tblodt@icloud.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: William Cohen <wcohen@redhat.com> Cc: Xiaoming Ni <nixiaoming@huawei.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shijie Luo 提交于
fix some typos and code style problems in mm. gfp.h: s/MAXNODES/MAX_NUMNODES mmzone.h: s/then/than rmap.c: s/__vma_split()/__vma_adjust() swap.c: s/__mod_zone_page_stat/__mod_zone_page_state, s/is is/is swap_state.c: s/whoes/whose z3fold.c: code style problem fix in z3fold_unregister_migration zsmalloc.c: s/of/or, s/give/given Link: https://lkml.kernel.org/r/20210419083057.64820-1-luoshijie1@huawei.comSigned-off-by: NShijie Luo <luoshijie1@huawei.com> Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rasmus Villemoes 提交于
Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3. These two patches are independent, but better-together. The second is a rather trivial patch that simply allows the developer to change "/sbin/modprobe" to something else - e.g. the empty string, so that all request_module() during early boot return -ENOENT early, without even spawning a usermode helper, needlessly synchronizing with the initramfs unpacking. The first patch delegates decompressing the initramfs to a worker thread, allowing do_initcalls() in main.c to proceed to the device_ and late_ initcalls without waiting for that decompression (and populating of rootfs) to finish. Obviously, some of those later calls may rely on the initramfs being available, so I've added synchronization points in the firmware loader and usermodehelper paths - there might be other places that would need this, but so far no one has been able to think of any places I have missed. There's not much to win if most of the functionality needed during boot is only available as modules. But systems with a custom-made .config and initramfs can boot faster, partly due to utilizing more than one cpu earlier, partly by avoiding known-futile modprobe calls (which would still trigger synchronization with the initramfs unpacking, thus eliminating most of the first benefit). This patch (of 2): Most of the boot process doesn't actually need anything from the initramfs, until of course PID1 is to be executed. So instead of doing the decompressing and populating of the initramfs synchronously in populate_rootfs() itself, push that off to a worker thread. This is primarily motivated by an embedded ppc target, where unpacking even the rather modest sized initramfs takes 0.6 seconds, which is long enough that the external watchdog becomes unhappy that it doesn't get attention soon enough. By doing the initramfs decompression in a worker thread, we get to do the device_initcalls and hence start petting the watchdog much sooner. Normal desktops might benefit as well. On my mostly stock Ubuntu kernel, my initramfs is a 26M xz-compressed blob, decompressing to around 126M. That takes almost two seconds: [ 0.201454] Trying to unpack rootfs image as initramfs... [ 1.976633] Freeing initrd memory: 29416K Before this patch, these lines occur consecutively in dmesg. With this patch, the timestamps on these two lines is roughly the same as above, but with 172 lines inbetween - so more than one cpu has been kept busy doing work that would otherwise only happen after the populate_rootfs() finished. Should one of the initcalls done after rootfs_initcall time (i.e., device_ and late_ initcalls) need something from the initramfs (say, a kernel module or a firmware blob), it will simply wait for the initramfs unpacking to be done before proceeding, which should in theory make this completely safe. But if some driver pokes around in the filesystem directly and not via one of the official kernel interfaces (i.e. request_firmware*(), call_usermodehelper*) that theory may not hold - also, I certainly might have missed a spot when sprinkling wait_for_initramfs(). So there is an escape hatch in the form of an initramfs_async= command line parameter. Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dkSigned-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: NLuis Chamberlain <mcgrof@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rasmus Villemoes 提交于
No callers in the tree. Link: https://lkml.kernel.org/r/20210309151723.1907838-2-linux@rasmusvillemoes.dkSigned-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yafang Shao 提交于
When I was implementing a latency analyzer tool by using task->delays and other things, I found an issue in delayacct. The issue is it should clear the target's flag instead of current's in delayacct_blkio_end(). When I git blame delayacct, I found there're some similar issues we have fixed in delayacct_blkio_end(). - Commit c96f5471 ("delayacct: Account blkio completion on the correct task") fixed the issue that it should account blkio completion on the target task instead of current. - Commit b512719f ("delayacct: fix crash in delayacct_blkio_end() after delayacct init failure") fixed the issue that it should check target task's delays instead of current task'. It seems that delayacct_blkio_{begin, end} are error prone. So I introduce a new paratmeter - the target task 'p' - to these helpers. After that change, the callsite will specifilly set the right task, which should make it less error prone. Link: https://lkml.kernel.org/r/20210414083720.24083-1-laoar.shao@gmail.comSigned-off-by: NYafang Shao <laoar.shao@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Josh Snyder <joshs@netflix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 He Ying 提交于
We found these warnings in kernel/panic.c by using sparse tool: warning: symbol 'panic_smp_self_stop' was not declared. warning: symbol 'nmi_panic_self_stop' was not declared. warning: symbol 'crash_smp_send_stop' was not declared. To avoid them, add declarations for these three functions in include/linux/smp.h. Link: https://lkml.kernel.org/r/20210316084150.75201-1-heying24@huawei.comSigned-off-by: NHe Ying <heying24@huawei.com> Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Masahiro Yamada 提交于
compat_sys##name is declared twice, just one line below. With this removal SYSCALL_DEFINEx() (defined in <linux/syscalls.h>) and COMPAT_SYSCALL_DEFINEx() look symmetrical. Link: https://lkml.kernel.org/r/20210223114924.854794-1-masahiroy@kernel.orgSigned-off-by: NMasahiro Yamada <masahiroy@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Richard Fitzgerald 提交于
crc8() does not change the data passed to it, so the pointer argument should be declared const. This avoids callers that receive const data having to cast it to a non-const pointer to call crc8(). Link: https://lkml.kernel.org/r/20210329122409.3291-1-rf@opensource.cirrus.comSigned-off-by: NRichard Fitzgerald <rf@opensource.cirrus.com> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yury Norov 提交于
Similarly to bitmap functions, users would benefit if we'll handle a case of small-size bitmaps that fit into a single word. While here, move the find_last_bit() declaration to bitops/find.h where other find_*_bit() functions sit. Link: https://lkml.kernel.org/r/20210401003153.97325-11-yury.norov@gmail.comSigned-off-by: NYury Norov <yury.norov@gmail.com> Acked-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yury Norov 提交于
find_bit would also benefit from small_const_nbits() optimizations. The detailed comment is provided by Rasmus Villemoes. Link: https://lkml.kernel.org/r/20210401003153.97325-6-yury.norov@gmail.comSuggested-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NYury Norov <yury.norov@gmail.com> Acked-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rasmus Villemoes 提交于
init_groups is declared in both cred.h and init_task.h, but it is not actually referenced anywhere outside of cred.c where it is defined. So make it static and remove the declarations. Link: https://lkml.kernel.org/r/20210310220102.2484201-1-linux@rasmusvillemoes.dkSigned-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wan Jiabing 提交于
Declaring struct pt_regs is unnecessary. On the one hand, there is no function using it; on the other hand, struct pt_regs has been declared in linux/kernel.h. Remove them. Link: https://lkml.kernel.org/r/20210401104834.1009157-1-wanjiabing@vivo.comSigned-off-by: NWan Jiabing <wanjiabing@vivo.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andy Shevchenko 提交于
The bitmap.h header is used in a lot of code around the kernel. Besides that it includes kernel.h which sometimes makes a loop. The problem here is many unneeded loops that make header hell dependencies. For example, how may you move bitmap_zalloc() from C-file to the header? Currently it's impossible. And bitmap.h here is only the tip of an iceberg. kerne.h is a dump of everything that even has nothing in common at all. We may still have it, but in my new code I prefer to include only the headers that I want to use, without the bulk of unneeded kernel code. Break the loop by introducing align.h, including it in kernel.h and bitmap.h followed by replacing kernel.h with limits.h. Link: https://lkml.kernel.org/r/20210326170347.37441-1-andriy.shevchenko@linux.intel.comSigned-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: NYury Norov <yury.norov@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
My UEK-derived config has 1030 files depending on pagemap.h before this change. Afterwards, just 326 files need to be rebuilt when I touch pagemap.h. I think blkdev.h is probably included too widely, but untangling that dependency is harder and this solves my problem. x86 allmodconfig builds, but there may be implicit include problems on other architectures. Link: https://lkml.kernel.org/r/20210309195747.283796-1-willy@infradead.orgSigned-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Dan Williams <dan.j.williams@intel.com> [nvdimm] Acked-by: Jens Axboe <axboe@kernel.dk> [block] Reviewed-by: NChristoph Hellwig <hch@lst.de> Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: Martin K. Petersen <martin.petersen@oracle.com> [scsi] Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
Now that proc_ops are separate from file_operations and other operations it easy to check all instances to have ->proc_lseek hook and remove check in main code. Note: nonseekable_open() files naturally don't require ->proc_lseek. Garbage collect pde_lseek() function. [adobriyan@gmail.com: smoke test lseek()] Link: https://lkml.kernel.org/r/YG4OIhChOrVTPgdN@localhost.localdomain Link: https://lkml.kernel.org/r/YFYX0Bzwxlc7aBa/@localhost.localdomainSigned-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 5月, 2021 18 次提交
-
-
由 Arnd Bergmann 提交于
As of commit 966a9671 ("smp: Avoid using two cache lines for struct call_single_data"), the smp code prefers 32-byte aligned call_single_data objects for performance reasons, but the block layer includes an instance of this structure in the main 'struct request' that is more senstive to size than to performance here, see 4ccafe03 ("block: unalign call_single_data in struct request"). The result is a violation of the calling conventions that clang correctly points out: block/blk-mq.c:630:39: warning: passing 8-byte aligned argument to 32-byte aligned parameter 2 of 'smp_call_function_single_async' may result in an unaligned pointer access [-Walign-mismatch] smp_call_function_single_async(cpu, &rq->csd); It does seem that the usage of the call_single_data without cache line alignment should still be allowed by the smp code, so just change the function prototype so it accepts both, but leave the default alignment unchanged for the other users. This seems better to me than adding a local hack to shut up an otherwise correct warning in the caller. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJens Axboe <axboe@kernel.dk> Link: https://lkml.kernel.org/r/20210505211300.3174456-1-arnd@kernel.org
-
由 Sean Christopherson 提交于
Move the guest enter/exit wrappers to kvm_host.h so that KVM can manage its context tracking vs. vtime accounting without bleeding too many KVM details into the context tracking code. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210505002735.1684165-8-seanjc@google.com
-
由 Sean Christopherson 提交于
Consolidate the guest enter/exit wrappers, providing and tweaking stubs as needed. This will allow moving the wrappers under KVM without having to bleed #ifdefs into the soon-to-be KVM code. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210505002735.1684165-7-seanjc@google.com
-
由 Sean Christopherson 提交于
Provide separate helpers for guest enter vtime accounting (in addition to the existing guest exit helpers), and move all vtime accounting helpers to vtime.h where the existing #ifdef infrastructure can be leveraged to better delineate the different types of accounting. This will also allow future cleanups via deduplication of context tracking code. Opportunstically delete the vtime_account_kernel() stub now that all callers are wrapped with CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210505002735.1684165-6-seanjc@google.com
-
由 Sean Christopherson 提交于
Move the blob of external declarations (and their stubs) above the set of inline definitions (and their stubs) for vtime accounting. This will allow a future patch to bring in more inline definitions without also having to shuffle large chunks of code. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20210505002735.1684165-5-seanjc@google.com
-
由 Wanpeng Li 提交于
Provide separate vtime accounting functions for guest exit instead of open coding the logic within the context tracking code. This will allow KVM x86 to handle vtime accounting slightly differently when using tick-based accounting. Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Co-developed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20210505002735.1684165-3-seanjc@google.com
-
由 Wanpeng Li 提交于
Provide separate context tracking helpers for guest exit, the standalone helpers will be called separately by KVM x86 in later patches to fix tick-based accounting. Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Co-developed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210505002735.1684165-2-seanjc@google.com
-
由 Ira Weiny 提交于
Patch series "btrfs: Convert kmap/memset/kunmap to memzero_user()". Lifting memzero_user(), convert it to kmap_local_page() and then use it in btrfs. This patch (of 3): memzero_page() can replace the kmap/memset/kunmap pattern in other places in the code. While zero_user() has the same interface it is not the same call and its use should be limited and some of those calls may be better converted from zero_user() to memzero_page().[1] But that is not addressed in this series. Lift memzero_page() to highmem. [1] https://lore.kernel.org/lkml/CAHk-=wijdojzo56FzYqE5TOYw2Vws7ik3LEMGj9SPQaJJ+Z73Q@mail.gmail.com/ Link: https://lkml.kernel.org/r/20210309212137.2610186-1-ira.weiny@intel.com Link: https://lkml.kernel.org/r/20210309212137.2610186-2-ira.weiny@intel.comSigned-off-by: NIra Weiny <ira.weiny@intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oscar Salvador 提交于
Physical memory hotadd has to allocate a memmap (struct page array) for the newly added memory section. Currently, alloc_pages_node() is used for those allocations. This has some disadvantages: a) an existing memory is consumed for that purpose (eg: ~2MB per 128MB memory section on x86_64) This can even lead to extreme cases where system goes OOM because the physically hotplugged memory depletes the available memory before it is onlined. b) if the whole node is movable then we have off-node struct pages which has performance drawbacks. c) It might be there are no PMD_ALIGNED chunks so memmap array gets populated with base pages. This can be improved when CONFIG_SPARSEMEM_VMEMMAP is enabled. Vmemap page tables can map arbitrary memory. That means that we can reserve a part of the physically hotadded memory to back vmemmap page tables. This implementation uses the beginning of the hotplugged memory for that purpose. There are some non-obviously things to consider though. Vmemmap pages are allocated/freed during the memory hotplug events (add_memory_resource(), try_remove_memory()) when the memory is added/removed. This means that the reserved physical range is not online although it is used. The most obvious side effect is that pfn_to_online_page() returns NULL for those pfns. The current design expects that this should be OK as the hotplugged memory is considered a garbage until it is onlined. For example hibernation wouldn't save the content of those vmmemmaps into the image so it wouldn't be restored on resume but this should be OK as there no real content to recover anyway while metadata is reachable from other data structures (e.g. vmemmap page tables). The reserved space is therefore (de)initialized during the {on,off}line events (mhp_{de}init_memmap_on_memory). That is done by extracting page allocator independent initialization from the regular onlining path. The primary reason to handle the reserved space outside of {on,off}line_pages is to make each initialization specific to the purpose rather than special case them in a single function. As per above, the functions that are introduced are: - mhp_init_memmap_on_memory: Initializes vmemmap pages by calling move_pfn_range_to_zone(), calls kasan_add_zero_shadow(), and onlines as many sections as vmemmap pages fully span. - mhp_deinit_memmap_on_memory: Offlines as many sections as vmemmap pages fully span, removes the range from zhe zone by remove_pfn_range_from_zone(), and calls kasan_remove_zero_shadow() for the range. The new function memory_block_online() calls mhp_init_memmap_on_memory() before doing the actual online_pages(). Should online_pages() fail, we clean up by calling mhp_deinit_memmap_on_memory(). Adjusting of present_pages is done at the end once we know that online_pages() succedeed. On offline, memory_block_offline() needs to unaccount vmemmap pages from present_pages() before calling offline_pages(). This is necessary because offline_pages() tears down some structures based on the fact whether the node or the zone become empty. If offline_pages() fails, we account back vmemmap pages. If it succeeds, we call mhp_deinit_memmap_on_memory(). Hot-remove: We need to be careful when removing memory, as adding and removing memory needs to be done with the same granularity. To check that this assumption is not violated, we check the memory range we want to remove and if a) any memory block has vmemmap pages and b) the range spans more than a single memory block, we scream out loud and refuse to proceed. If all is good and the range was using memmap on memory (aka vmemmap pages), we construct an altmap structure so free_hugepage_table does the right thing and calls vmem_altmap_free instead of free_pagetable. Link: https://lkml.kernel.org/r/20210421102701.25051-5-osalvador@suse.deSigned-off-by: NOscar Salvador <osalvador@suse.de> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Tatashin 提交于
We should not pin pages in ZONE_MOVABLE. Currently, we do not pin only movable CMA pages. Generalize the function that migrates CMA pages to migrate all movable pages. Use is_pinnable_page() to check which pages need to be migrated Link: https://lkml.kernel.org/r/20210215161349.246722-10-pasha.tatashin@soleen.comSigned-off-by: NPavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Tatashin 提交于
On some platforms ZERO_PAGE(0) might end-up in a movable zone. Do not migrate zero page in gup during longterm pinning as migration of zero page is not allowed. For example, in x86 QEMU with 16G of memory and kernelcore=5G parameter, I see the following: Boot#1: zero_pfn 0x48a8d zero_pfn zone: ZONE_DMA32 Boot#2: zero_pfn 0x20168d zero_pfn zone: ZONE_MOVABLE On x86, empty_zero_page is declared in .bss and depending on the loader may end up in different physical locations during boots. Also, move is_zero_pfn() my_zero_pfn() functions under CONFIG_MMU, because zero_pfn that they are using is declared in memory.c which is compiled with CONFIG_MMU. Link: https://lkml.kernel.org/r/20210215161349.246722-9-pasha.tatashin@soleen.comSigned-off-by: NPavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Tatashin 提交于
PF_MEMALLOC_PIN is only honored for CMA pages, extend this flag to work for any allocations from ZONE_MOVABLE by removing __GFP_MOVABLE from gfp_mask when this flag is passed in the current context. Add is_pinnable_page() to return true if page is in a pinnable page. A pinnable page is not in ZONE_MOVABLE and not of MIGRATE_CMA type. Link: https://lkml.kernel.org/r/20210215161349.246722-8-pasha.tatashin@soleen.comSigned-off-by: NPavel Tatashin <pasha.tatashin@soleen.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Tatashin 提交于
PF_MEMALLOC_NOCMA is used ot guarantee that the allocator will not return pages that might belong to CMA region. This is currently used for long term gup to make sure that such pins are not going to be done on any CMA pages. When PF_MEMALLOC_NOCMA has been introduced we haven't realized that it is focusing on CMA pages too much and that there is larger class of pages that need the same treatment. MOVABLE zone cannot contain any long term pins as well so it makes sense to reuse and redefine this flag for that usecase as well. Rename the flag to PF_MEMALLOC_PIN which defines an allocation context which can only get pages suitable for long-term pins. Also rename: memalloc_nocma_save()/memalloc_nocma_restore to memalloc_pin_save()/memalloc_pin_restore() and make the new functions common. [rppt@linux.ibm.com: fix renaming of PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN] Link: https://lkml.kernel.org/r/20210331163816.11517-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20210215161349.246722-6-pasha.tatashin@soleen.comSigned-off-by: NPavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Saravanan D 提交于
To help with debugging the sluggishness caused by TLB miss/reload, we introduce monotonic hugepage [direct mapped] split event counts since system state: SYSTEM_RUNNING to be displayed as part of /proc/vmstat in x86 servers The lifetime split event information will be displayed at the bottom of /proc/vmstat .... swap_ra 0 swap_ra_hit 0 direct_map_level2_splits 94 direct_map_level3_splits 4 nr_unstable 0 .... One of the many lasting sources of direct hugepage splits is kernel tracing (kprobes, tracepoints). Note that the kernel's code segment [512 MB] points to the same physical addresses that have been already mapped in the kernel's direct mapping range. Source : Documentation/x86/x86_64/mm.rst When we enable kernel tracing, the kernel has to modify attributes/permissions of the text segment hugepages that are direct mapped causing them to split. Kernel's direct mapped hugepages do not coalesce back after split and remain in place for the remainder of the lifetime. An instance of direct page splits when we turn on dynamic kernel tracing .... cat /proc/vmstat | grep -i direct_map_level direct_map_level2_splits 784 direct_map_level3_splits 12 bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @ [pid, comm] = count(); }' cat /proc/vmstat | grep -i direct_map_level direct_map_level2_splits 789 direct_map_level3_splits 12 .... Link: https://lkml.kernel.org/r/20210218235744.1040634-1-saravanand@fb.comSigned-off-by: NSaravanan D <saravanand@fb.com> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
size_t in cma_alloc is confusing since it makes people think it's byte count, not pages. Change it to unsigned long[1]. The unsigned int in cma_release is also not right so change it. Since we have unsigned long in cma_release, free_contig_range should also respect it. [1] 67a2e213, mm: cma: fix incorrect type conversion for size during dma allocation Link: https://lore.kernel.org/linux-mm/20210324043434.GP1719932@casper.infradead.org/ Link: https://lkml.kernel.org/r/20210331164018.710560-1-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
Since CMA is used more widely, it's worth to have CMA allocation statistics into vmstat. With it, we could know how agressively system uses cma allocation and how often it fails. Link: https://lkml.kernel.org/r/20210302183346.3707237-1-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Cc: John Dias <joaodias@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miaohe Lin 提交于
Patch series "Cleanup and fixup for mm/migrate.c", v3. This series contains cleanups to remove unnecessary VM_BUG_ON_PAGE and rc != MIGRATEPAGE_SUCCESS check. Also use helper function to remove some duplicated codes. What's more, this fixes potential deadlock in NUMA balancing shared exec THP case and so on. More details can be found in the respective changelogs. This patch (of 5): The putback_movable_page() is just called by putback_movable_pages() and we know the page is locked and both PageMovable() and PageIsolated() is checked right before calling putback_movable_page(). So we make it static and remove all the 3 VM_BUG_ON_PAGE(). Link: https://lkml.kernel.org/r/20210325131524.48181-1-linmiaohe@huawei.com Link: https://lkml.kernel.org/r/20210325131524.48181-2-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NYang Shi <shy828301@gmail.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
Pages containing buffer_heads that are in one of the per-CPU buffer_head LRU caches will be pinned and thus cannot be migrated. This can prevent CMA allocations from succeeding, which are often used on platforms with co-processors (such as a DSP) that can only use physically contiguous memory. It can also prevent memory hot-unplugging from succeeding, which involves migrating at least MIN_MEMORY_BLOCK_SIZE bytes of memory, which ranges from 8 MiB to 1 GiB based on the architecture in use. Correspondingly, invalidate the BH LRU caches before a migration starts and stop any buffer_head from being cached in the LRU caches, until migration has finished. Link: https://lkml.kernel.org/r/20210319175127.886124-3-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org> Reported-by: NChris Goldsworthy <cgoldswo@codeaurora.org> Reported-by: NLaura Abbott <labbott@kernel.org> Tested-by: NOliver Sang <oliver.sang@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: John Dias <joaodias@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-