- 27 9月, 2020 2 次提交
-
-
由 Laurent Dufour 提交于
Patch series "mm: fix memory to node bad links in sysfs", v3. Sometimes, firmware may expose interleaved memory layout like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] In that case, we can see memory blocks assigned to multiple nodes in sysfs: $ ls -l /sys/devices/system/memory/memory21 total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 -rw-r--r-- 1 root root 65536 Aug 24 05:27 online -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index drwxr-xr-x 2 root root 0 Aug 24 05:27 power -r--r--r-- 1 root root 65536 Aug 24 05:27 removable -rw-r--r-- 1 root root 65536 Aug 24 05:27 state lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones The same applies in the node's directory with a memory21 link in both the node1 and node2's directory. This is wrong but doesn't prevent the system to run. However when later, one of these memory blocks is hot-unplugged and then hot-plugged, the system is detecting an inconsistency in the sysfs layout and a BUG_ON() is raised: kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This has been seen on PowerPC LPAR. The root cause of this issue is that when node's memory is registered, the range used can overlap another node's range, thus the memory block is registered to multiple nodes in sysfs. There are two issues here: (a) The sysfs memory and node's layouts are broken due to these multiple links (b) The link errors in link_mem_sections() should not lead to a system panic. To address (a) register_mem_sect_under_node should not rely on the system state to detect whether the link operation is triggered by a hot plug operation or not. This is addressed by the patches 1 and 2 of this series. Issue (b) will be addressed separately. This patch (of 2): The memmap_context enum is used to detect whether a memory operation is due to a hot-add operation or happening at boot time. Make it general to the hotplug operation and rename it as meminit_context. There is no functional change introduced by this patch Suggested-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NLaurent Dufour <ldufour@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J . Wysocki" <rafael@kernel.org> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasily Gorbik 提交于
Currently to make sure that every page table entry is read just once gup_fast walks perform READ_ONCE and pass pXd value down to the next gup_pXd_range function by value e.g.: static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) ... pudp = pud_offset(&p4d, addr); This function passes a reference on that local value copy to pXd_offset, and might get the very same pointer in return. This happens when the level is folded (on most arches), and that pointer should not be iterated. On s390 due to the fact that each task might have different 5,4 or 3-level address translation and hence different levels folded the logic is more complex and non-iteratable pointer to a local copy leads to severe problems. Here is an example of what happens with gup_fast on s390, for a task with 3-level paging, crossing a 2 GB pud boundary: // addr = 0x1007ffff000, end = 0x10080001000 static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) { unsigned long next; pud_t *pudp; // pud_offset returns &p4d itself (a pointer to a value on stack) pudp = pud_offset(&p4d, addr); do { // on second iteratation reading "random" stack value pud_t pud = READ_ONCE(*pudp); // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390 next = pud_addr_end(addr, end); ... } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack return 1; } This happens since s390 moved to common gup code with commit d1874a0c ("s390/mm: make the pxd_offset functions more robust") and commit 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code"). s390 tried to mimic static level folding by changing pXd_offset primitives to always calculate top level page table offset in pgd_offset and just return the value passed when pXd_offset has to act as folded. What is crucial for gup_fast and what has been overlooked is that PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly. And the latter is not possible with dynamic folding. To fix the issue in addition to pXd values pass original pXdp pointers down to gup_pXd_range functions. And introduce pXd_offset_lockless helpers, which take an additional pXd entry value parameter. This has already been discussed in https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1 Fixes: 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code") Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NGerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: NAlexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: NJason Gunthorpe <jgg@nvidia.com> Reviewed-by: NMike Rapoport <rppt@linux.ibm.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.2+] Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hoursSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 9月, 2020 2 次提交
-
-
由 Jan Kara 提交于
dax_supported() is defined whenever CONFIG_DAX is enabled. So dummy implementation should be defined only in !CONFIG_DAX case, not in !CONFIG_FS_DAX case. Fixes: e2ec5128 ("dm: Call proper helper to determine dax support") Cc: <stable@vger.kernel.org> Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org> Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Henry Ptasinski 提交于
When calculating ancestor_size with IPv6 enabled, simply using sizeof(struct ipv6_pinfo) doesn't account for extra bytes needed for alignment in the struct sctp6_sock. On x86, there aren't any extra bytes, but on ARM the ipv6_pinfo structure is aligned on an 8-byte boundary so there were 4 pad bytes that were omitted from the ancestor_size calculation. This would lead to corruption of the pd_lobby pointers, causing an oops when trying to free the sctp structure on socket close. Fixes: 636d25d5 ("sctp: not copy sctp_sock pd_lobby in sctp_copy_descendant") Signed-off-by: NHenry Ptasinski <hptasinski@google.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 9月, 2020 3 次提交
-
-
由 Jan Kara 提交于
DM was calling generic_fsdax_supported() to determine whether a device referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that they don't have to duplicate common generic checks. High level code should call dax_supported() helper which that calls into appropriate helper for the particular device. This problem manifested itself as kernel messages: dm-3: error: dax access failed (-95) when lvm2-testsuite run in cases where a DM device was stacked on top of another DM device. Fixes: 7bf7eac8 ("dax: Arrange for dax_supported check to span multiple devices") Cc: <stable@vger.kernel.org> Tested-by: NAdrian Huang <ahuang12@lenovo.com> Signed-off-by: NJan Kara <jack@suse.cz> Acked-by: NMike Snitzer <snitzer@redhat.com> Reported-by: Nkernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/160061715195.13131.5503173247632041975.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Tobias Klauser 提交于
Commit 32927393 ("sysctl: pass kernel pointers to ->proc_handler") changed ctl_table.proc_handler to take a kernel pointer. Adjust the signature of stack_erasing_sysctl to match ctl_table.proc_handler which fixes the following sparse warning: kernel/stackleak.c:31:50: warning: incorrect type in argument 3 (different address spaces) kernel/stackleak.c:31:50: expected void * kernel/stackleak.c:31:50: got void [noderef] __user *buffer Fixes: 32927393 ("sysctl: pass kernel pointers to ->proc_handler") Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20200907093253.13656-1-tklauser@distanz.chSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tobias Klauser 提交于
Commit 32927393 ("sysctl: pass kernel pointers to ->proc_handler") changed ctl_table.proc_handler to take a kernel pointer. Adjust the signature of ftrace_enable_sysctl to match ctl_table.proc_handler which fixes the following sparse warning: kernel/trace/ftrace.c:7544:43: warning: incorrect type in argument 3 (different address spaces) kernel/trace/ftrace.c:7544:43: expected void * kernel/trace/ftrace.c:7544:43: got void [noderef] __user *buffer Fixes: 32927393 ("sysctl: pass kernel pointers to ->proc_handler") Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20200907093207.13540-1-tklauser@distanz.chSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 9月, 2020 4 次提交
-
-
由 Vladimir Oltean 提交于
Currently mscc_ocelot_init_ports() will skip initializing a port when it doesn't have a phy-handle, so the ocelot->ports[port] pointer will be NULL. Take this into consideration when tearing down the driver, and add a new function ocelot_deinit_port() to the switch library, mirror of ocelot_init_port(), which needs to be called by the driver for all ports it has initialized. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Tested-by: NAlexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: NAlexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
The ocelot_port->ts_id is used to: (a) populate skb->cb[0] for matching the TX timestamp in the PTP IRQ with an skb. (b) populate the REW_OP from the injection header of the ongoing skb. Only then is ocelot_port->ts_id incremented. This is a problem because, at least theoretically, another timestampable skb might use the same ocelot_port->ts_id before that is incremented. Normally all transmit calls are serialized by the netdev transmit spinlock, but in this case, ocelot_port_add_txtstamp_skb() is also called by DSA, which has started declaring the NETIF_F_LLTX feature since commit 2b86cb82 ("net: dsa: declare lockless TX feature for slave ports"). So the logic of using and incrementing the timestamp id should be atomic per port. The solution is to use the global ocelot_port->ts_id only while protected by the associated ocelot_port->ts_id_lock. That's where we populate skb->cb[0]. Note that for ocelot, ocelot_port_add_txtstamp_skb is called for the actual skb, but for felix, it is called for the skb's clone. That is something which will also be changed in the future. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NHoratiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Tested-by: NAlexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: NAlexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Masami Hiramatsu 提交于
Since kprobe_event= cmdline option allows user to put kprobes on the functions in initmem, kprobe has to make such probes gone after boot. Currently the probes on the init functions in modules will be handled by module callback, but the kernel init text isn't handled. Without this, kprobes may access non-exist text area to disable or remove it. Link: https://lkml.kernel.org/r/159972810544.428528.1839307531600646955.stgit@devnote2 Fixes: 970988e1 ("tracing/kprobe: Add kprobe_event= boot parameter") Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Tobias Klauser 提交于
Commit 32927393 ("sysctl: pass kernel pointers to ->proc_handler") changed ctl_table.proc_handler to take a kernel pointer. Adjust the signature of ftrace_enable_sysctl to match ctl_table.proc_handler which fixes the following sparse warning: kernel/trace/ftrace.c:7544:43: warning: incorrect type in argument 3 (different address spaces) kernel/trace/ftrace.c:7544:43: expected void * kernel/trace/ftrace.c:7544:43: got void [noderef] __user *buffer Link: https://lkml.kernel.org/r/20200907093207.13540-1-tklauser@distanz.ch Fixes: 32927393 ("sysctl: pass kernel pointers to ->proc_handler") Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 18 9月, 2020 3 次提交
-
-
由 Michal Kubecek 提交于
Tunnel offload info code uses ETHTOOL_MSG_TUNNEL_INFO_GET message type (cmd field in genetlink header) for replies to tunnel info netlink request, i.e. the same value as the request have. This is a problem because we are using two separate enums for userspace to kernel and kernel to userspace message types so that this ETHTOOL_MSG_TUNNEL_INFO_GET (28) collides with ETHTOOL_MSG_CABLE_TEST_TDR_NTF which is what message type 28 means for kernel to userspace messages. As the tunnel info request reached mainline in 5.9 merge window, we should still be able to fix the reply message type without breaking backward compatibility. Fixes: c7d759eb ("ethtool: add tunnel info interface") Signed-off-by: NMichal Kubecek <mkubecek@suse.cz> Reviewed-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Torvalds 提交于
Commit 2a9127fc ("mm: rewrite wait_on_page_bit_common() logic") made the page locking entirely fair, in that if a waiter came in while the lock was held, the lock would be transferred to the lockers strictly in order. That was intended to finally get rid of the long-reported watchdog failures that involved the page lock under extreme load, where a process could end up waiting essentially forever, as other page lockers stole the lock from under it. It also improved some benchmarks, but it ended up causing huge performance regressions on others, simply because fair lock behavior doesn't end up giving out the lock as aggressively, causing better worst-case latency, but potentially much worse average latencies and throughput. Instead of reverting that change entirely, this introduces a controlled amount of unfairness, with a sysctl knob to tune it if somebody needs to. But the default value should hopefully be good for any normal load, allowing a few rounds of lock stealing, but enforcing the strict ordering before the lock has been stolen too many times. There is also a hint from Matthieu Baerts that the fair page coloring may end up exposing an ABBA deadlock that is hidden by the usual optimistic lock stealing, and while the unfairness doesn't fix the fundamental issue (and I'm still looking at that), it avoids it in practice. The amount of unfairness can be modified by writing a new value to the 'sysctl_page_lock_unfairness' variable (default value of 5, exposed through /proc/sys/vm/page_lock_unfairness), but that is hopefully something we'd use mainly for debugging rather than being necessary for any deep system tuning. This whole issue has exposed just how critical the page lock can be, and how contended it gets under certain locks. And the main contention doesn't really seem to be anything related to IO (which was the origin of this lock), but for things like just verifying that the page file mapping is stable while faulting in the page into a page table. Link: https://lore.kernel.org/linux-fsdevel/ed8442fd-6f54-dd84-cd4a-941e8b7ee603@MichaelLarabel.com/ Link: https://www.phoronix.com/scan.php?page=article&item=linux-50-59&num=1 Link: https://lore.kernel.org/linux-fsdevel/c560a38d-8313-51fb-b1ec-e904bd8836bc@tessares.net/Reported-and-tested-by: NMichael Larabel <Michael@michaellarabel.com> Tested-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Cc: Dave Chinner <david@fromorbit.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Jan Kara <jack@suse.cz> Cc: Amir Goldstein <amir73il@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Jones 提交于
Steal time initialization requires mapping a memory region which invokes a memory allocation. Doing this at CPU starting time results in the following trace when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at mm/slab.h:498 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #1 Call trace: dump_backtrace+0x0/0x208 show_stack+0x1c/0x28 dump_stack+0xc4/0x11c ___might_sleep+0xf8/0x130 __might_sleep+0x58/0x90 slab_pre_alloc_hook.constprop.101+0xd0/0x118 kmem_cache_alloc_node_trace+0x84/0x270 __get_vm_area_node+0x88/0x210 get_vm_area_caller+0x38/0x40 __ioremap_caller+0x70/0xf8 ioremap_cache+0x78/0xb0 memremap+0x9c/0x1a8 init_stolen_time_cpu+0x54/0xf0 cpuhp_invoke_callback+0xa8/0x720 notify_cpu_starting+0xc8/0xd8 secondary_start_kernel+0x114/0x180 CPU1: Booted secondary processor 0x0000000001 [0x431f0a11] However we don't need to initialize steal time at CPU starting time. We can simply wait until CPU online time, just sacrificing a bit of accuracy by returning zero for steal time until we know better. While at it, add __init to the functions that are only called by pv_time_init() which is __init. Signed-off-by: NAndrew Jones <drjones@redhat.com> Fixes: e0685fa2 ("arm64: Retrieve stolen time as paravirtualized guest") Cc: stable@vger.kernel.org Reviewed-by: NSteven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200916154530.40809-1-drjones@redhat.comSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 17 9月, 2020 2 次提交
-
-
由 Alexey Dobriyan 提交于
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Peter Zijlstra 提交于
Some drivers have to do significant work, some of which relies on RCU still being active. Instead of using RCU_NONIDLE in the drivers and flipping RCU back on, allow drivers to take over RCU-idle duty. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org> Tested-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 16 9月, 2020 2 次提交
-
-
由 Hou Tao 提交于
The __this_cpu*() accessors are (in general) IRQ-unsafe which, given that percpu-rwsem is a blocking primitive, should be just fine. However, file_end_write() is used from IRQ context and will cause load-store issues on architectures where the per-cpu accessors are not natively irq-safe. Fix it by using the IRQ-safe this_cpu_*() for operations on read_count. This will generate more expensive code on a number of platforms, which might cause a performance regression for some of the other percpu-rwsem users. If any such is reported, we can consider alternative solutions. Fixes: 70fe2f48 ("aio: fix freeze protection of aio writes") Signed-off-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NWill Deacon <will@kernel.org> Acked-by: NOleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20200915140750.137881-1-houtao1@huawei.com
-
由 Johan Hovold 提交于
Fix the port-lock initialisation regression introduced by commit a3cb39d2 ("serial: core: Allow detach and attach serial device for console") by making sure that the lock is again initialised during console setup. The console may be registered before the serial controller has been probed in which case the port lock needs to be initialised during console setup by a call to uart_set_options(). The console-detach changes introduced a regression in several drivers by effectively removing that initialisation by not initialising the lock when the port is used as a console (which is always the case during console setup). Add back the early lock initialisation and instead use a new console-reinit flag to handle the case where a console is being re-attached through sysfs. The question whether the console-detach interface should have been added in the first place is left for another discussion. Note that the console-enabled check in uart_set_options() is not redundant because of kgdboc, which can end up reinitialising an already enabled console (see commit 42b6a1ba ("serial_core: Don't re-initialize a previously initialized spinlock.")). Fixes: a3cb39d2 ("serial: core: Allow detach and attach serial device for console") Cc: stable <stable@vger.kernel.org> # 5.7 Signed-off-by: NJohan Hovold <johan@kernel.org> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200909143101.15389-3-johan@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 15 9月, 2020 2 次提交
-
-
由 Xin Long 提交于
As we can see from vxlan_build/parse_gbp_hdr(), when processing metadata on vxlan rx/tx path, only dont_learn/policy_applied/policy_id fields can be set to or parse from the packet for vxlan gbp option. So we'd better do the mask when set it in act_tunnel_key and cls_flower. Otherwise, when users don't know these bits, they may configure with a value which can never be matched. Reported-by: NShuang Li <shuali@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
flowi4_multipath_hash was added by the commit referenced below for tunnels. Unfortunately, the patch did not initialize the new field for several fast path lookups that do not initialize the entire flow struct to 0. Fix those locations. Currently, flowi4_multipath_hash is random garbage and affects the hash value computed by fib_multipath_hash for multipath selection. Fixes: 24ba1440 ("route: Add multipath_hash in flowi_common to make user-define hash") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Cc: wenxu <wenxu@ucloud.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 9月, 2020 1 次提交
-
-
由 Sergey Senozhatsky 提交于
The patch partially reverts some of the UAPI bits of the buffer cache management hints. Namely, the queue consistency (memory coherency) user-space hint because, as it turned out, the kernel implementation of this feature was misusing DMA_ATTR_NON_CONSISTENT. The patch reverts both kernel and user space parts: removes the DMA consistency attr functions, rolls back changes to v4l2_requestbuffers, v4l2_create_buffers structures and corresponding UAPI functions (plus compat32 layer) and cleans up the documentation. [hverkuil: fixed a few typos in the commit log] [hverkuil: fixed vb2_core_reqbufs call in drivers/media/dvb-core/dvb_vb2.c] [mchehab: fixed a typo in the commit log: revers->reverts] Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: NHans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org>
-
- 12 9月, 2020 1 次提交
-
-
由 Huacai Chen 提交于
MIPS defines two kvm types: #define KVM_VM_MIPS_TE 0 #define KVM_VM_MIPS_VZ 1 In Documentation/virt/kvm/api.rst it is said that "You probably want to use 0 as machine type", which implies that type 0 be the "automatic" or "default" type. And, in user-space libvirt use the null-machine (with type 0) to detect the kvm capability, which returns "KVM not supported" on a VZ platform. I try to fix it in QEMU but it is ugly: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html And Thomas Huth suggests me to change the definition of kvm type: https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html So I define like this: #define KVM_VM_MIPS_AUTO 0 #define KVM_VM_MIPS_VZ 1 #define KVM_VM_MIPS_TE 2 Since VZ and TE cannot co-exists, using type 0 on a TE platform will still return success (so old user-space tools have no problems on new kernels); the advantage is that using type 0 on a VZ platform will not return failure. So, the only problem is "new user-space tools use type 2 on old kernels", but if we treat this as a kernel bug, we can backport this patch to old stable kernels. Signed-off-by: NHuacai Chen <chenhc@lemote.com> Message-Id: <1599734031-28746-1-git-send-email-chenhc@lemote.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 11 9月, 2020 5 次提交
-
-
由 Nicolas Dichtel 提交于
There is no @validate argument. CC: Johannes Berg <johannes.berg@intel.com> Fixes: 3de64403 ("netlink: re-add parse/validate functions in strict mode") Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Miaohe Lin 提交于
Remove the weird space inside the NETIF_F_CSUM_MASK. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Amit Kucheria 提交于
Fix up the documentation of the struct powercap_control_type members to match the code. Also fixup stray whitespace. Signed-off-by: NAmit Kucheria <amitk@kernel.org> [ rjw: Changelog edits ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Randy Dunlap 提交于
Fix kernel-doc warning in <linux/device.h>: ../include/linux/device.h:613: warning: Function parameter or member 'em_pd' not described in 'device' Fixes: 1bc138c6 ("PM / EM: add support for other devices than CPUs in Energy Model") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Reviewed-by: NLukasz Luba <lukasz.luba@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Kees Cook 提交于
On non-EFI systems, it wasn't possible to test the platform firmware loader because it will have never set "checked_fw" during __init. Instead, allow the test code to override this check. Additionally split the declarations into a private symbol namespace so there is greater enforcement of the symbol visibility. Fixes: 548193cb ("test_firmware: add support for firmware_request_platform") Cc: stable@vger.kernel.org Signed-off-by: NKees Cook <keescook@chromium.org> Acked-by: NArd Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200909225354.3118328-1-keescook@chromium.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 9月, 2020 2 次提交
-
-
由 Dmitry Bogdanov 提交于
In CMT and NPAR the PF is unknown when the GFS block processes the packet. Therefore cannot use searcher as it has a per PF database, and thus ARFS must be disabled. Fixes: d51e4af5 ("qed: aRFS infrastructure support") Signed-off-by: NManish Chopra <manishc@marvell.com> Signed-off-by: NIgor Russkikh <irusskikh@marvell.com> Signed-off-by: NMichal Kalderon <michal.kalderon@marvell.com> Signed-off-by: NDmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
skb_put_padto() and __skb_put_padto() callers must check return values or risk use-after-free. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 9月, 2020 1 次提交
-
-
由 Evan Nimmo 提交于
If something goes wrong (such as the SCL being stuck low) then we need to reset the PCA chip. The issue with this is that on reset we lose all config settings and the chip ends up in a disabled state which results in a lock up/high CPU usage. We need to re-apply any configuration that had previously been set and re-enable the chip. Signed-off-by: NEvan Nimmo <evan.nimmo@alliedtelesis.co.nz> Reviewed-by: NChris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NWolfram Sang <wsa@kernel.org>
-
- 08 9月, 2020 3 次提交
-
-
由 Pablo Neira Ayuso 提交于
On x86_64, each notification results in one skbuff allocation which consumes at least 768 bytes due to the skbuff overhead. This patch coalesces several notifications into one single skbuff, so each notification consumes at least ~211 bytes, that ~3.5 times less memory consumption. As a result, this is reducing the chances to exhaust the netlink socket receive buffer. Rule of thumb is that each notification batch only contains netlink messages whose report flag is the same, nfnetlink_send() requires this to do appropriate delivery to userspace, either via unicast (echo mode) or multicast (monitor mode). The skbuff control buffer is used to annotate the report flag for later handling at the new coalescing routine. The batch skbuff notification size is NLMSG_GOODSIZE, using a larger skbuff would allow for more socket receiver buffer savings (to amortize the cost of the skbuff even more), however, going over that size might break userspace applications, so let's be conservative and stick to NLMSG_GOODSIZE. Reported-by: NPhil Sutter <phil@nwl.cc> Acked-by: NPhil Sutter <phil@nwl.cc> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Randy Dunlap 提交于
Fix kernel-doc warning in <linux/netdevice.h>: ../include/linux/netdevice.h:2158: warning: Function parameter or member 'xdp_state' not described in 'net_device' Fixes: 7f0a8382 ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Andrii Nakryiko <andriin@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Randy Dunlap 提交于
Fix kernel-doc warning in <linux/netdevice.h>: ../include/linux/netdevice.h:2158: warning: Function parameter or member 'proto_down_reason' not described in 'net_device' Fixes: 829eb208 ("rtnetlink: add support for protodown reason") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 07 9月, 2020 2 次提交
-
-
由 Randy Dunlap 提交于
Fix kernel-doc warning in <linux/device.h>: ../include/linux/device.h:613: warning: Function parameter or member 'em_pd' not described in 'device' Fixes: 1bc138c6 ("PM / EM: add support for other devices than CPUs in Energy Model") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NLukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/d97f40ad-3033-703a-c3cb-2843ce0f6371@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Cezary Rojewski 提交于
Introduce for_each_rtd_dais_rollback macro which behaves exactly like for_each_codec_dais_rollback and its cpu_dais equivalent but for all dais instead. Use newly added macro to fix soc_pcm_open error path and prevent uninitialized dais from being cleaned-up. Signed-off-by: NCezary Rojewski <cezary.rojewski@intel.com> Fixes: 5d9fa03e ("ASoC: soc-pcm: tidyup soc_pcm_open() order") Acked-by: NLiam Girdwood <liam.r.girdwood@linux.intel.com> Acked-by: NKuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/20200907111939.16169-1-cezary.rojewski@intel.comSigned-off-by: NMark Brown <broonie@kernel.org>
-
- 06 9月, 2020 1 次提交
-
-
由 Jason Gunthorpe 提交于
Otherwise gcc generates warnings if the expression is complicated. Fixes: 312a0c17 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 9月, 2020 2 次提交
-
-
由 Peter Xu 提交于
This accounts for wp_page_reuse() case, where we reused a page for COW. Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Peter Xu 提交于
Remove the function as the last reference has gone away with the do_wp_page() changes. Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 9月, 2020 2 次提交
-
-
由 Jim Cromie 提交于
commit 4c0d7782 ("dyndbg: export ddebug_exec_queries") had a few problems: - broken non DYNAMIC_DEBUG_CORE configs, sparse warning - the exported function modifies query string, breaks on RO strings. - func name follows internal convention, shouldn't be exposed as is. 1st is fixed in header with ifdefd function prototype or stub defn. Also remove an obsolete HAVE-symbol ifdef-comment, and add others. Fix others by wrapping existing internal function with a new one, named in accordance with module-prefix naming convention, before export hits v5.9.0. In new function, copy query string to a local buffer, so users can pass hard-coded/RO queries, and internal function can be used unchanged. Fixes: 4c0d7782 ("dyndbg: export ddebug_exec_queries") Signed-off-by: NJim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20200831182210.850852-3-jim.cromie@gmail.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Thomas Gleixner 提交于
Andy reported that the syscall treacing for 32bit fast syscall fails: # ./tools/testing/selftests/x86/ptrace_syscall_32 ... [RUN] SYSEMU [FAIL] Initial args are wrong (nr=224, args=10 11 12 13 14 4289172732) ... [RUN] SYSCALL [FAIL] Initial args are wrong (nr=29, args=0 0 0 0 0 4289172732) The eason is that the conversion to generic entry code moved the retrieval of the sixth argument (EBP) after the point where the syscall entry work runs, i.e. ptrace, seccomp, audit... Unbreak it by providing a split up version of syscall_enter_from_user_mode(). - syscall_enter_from_user_mode_prepare() establishes state and enables interrupts - syscall_enter_from_user_mode_work() runs the entry work Replace the call to syscall_enter_from_user_mode() in the 32bit fast syscall C-entry with the split functions and stick the EBP retrieval between them. Fixes: 27d6b4d1 ("x86/entry: Use generic syscall entry function") Reported-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/87k0xdjbtt.fsf@nanos.tec.linutronix.de
-