- 10 7月, 2019 40 次提交
-
-
由 Wanpeng Li 提交于
commit bb34e690e9340bc155ebed5a3d75fc63ff69e082 upstream. Thomas reported that: | Background: | | In preparation of supporting IPI shorthands I changed the CPU offline | code to software disable the local APIC instead of just masking it. | That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV | register. | | Failure: | | When the CPU comes back online the startup code triggers occasionally | the warning in apic_pending_intr_clear(). That complains that the IRRs | are not empty. | | The offending vector is the local APIC timer vector who's IRR bit is set | and stays set. | | It took me quite some time to reproduce the issue locally, but now I can | see what happens. | | It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1 | (and hardware support) it behaves correctly. | | Here is the series of events: | | Guest CPU | | goes down | | native_cpu_disable() | | apic_soft_disable(); | | play_dead() | | .... | | startup() | | if (apic_enabled()) | apic_pending_intr_clear() <- Not taken | | enable APIC | | apic_pending_intr_clear() <- Triggers warning because IRR is stale | | When this happens then the deadline timer or the regular APIC timer - | happens with both, has fired shortly before the APIC is disabled, but the | interrupt was not serviced because the guest CPU was in an interrupt | disabled region at that point. | | The state of the timer vector ISR/IRR bits: | | ISR IRR | before apic_soft_disable() 0 1 | after apic_soft_disable() 0 1 | | On startup 0 1 | | Now one would assume that the IRR is cleared after the INIT reset, but this | happens only on CPU0. | | Why? | | Because our CPU0 hotplug is just for testing to make sure nothing breaks | and goes through an NMI wakeup vehicle because INIT would send it through | the boots-trap code which is not really working if that CPU was not | physically unplugged. | | Now looking at a real world APIC the situation in that case is: | | ISR IRR | before apic_soft_disable() 0 1 | after apic_soft_disable() 0 1 | | On startup 0 0 | | Why? | | Once the dying CPU reenables interrupts the pending interrupt gets | delivered as a spurious interupt and then the state is clear. | | While that CPU0 hotplug test case is surely an esoteric issue, the APIC | emulation is still wrong, Even if the play_dead() code would not enable | interrupts then the pending IRR bit would turn into an ISR .. interrupt | when the APIC is reenabled on startup. From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled * Pending interrupts in the IRR and ISR registers are held and require masking or handling by the CPU. In Thomas's testing, hardware cpu will not respect soft disable LAPIC when IRR has already been set or APICv posted-interrupt is in flight, so we can skip soft disable APIC checking when clearing IRR and set ISR, continue to respect soft disable APIC when attempting to set IRR. Reported-by: NRong Chen <rong.a.chen@intel.com> Reported-by: NFeng Tang <feng.tang@intel.com> Reported-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NThomas Gleixner <tglx@linutronix.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rong Chen <rong.a.chen@intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Paolo Bonzini 提交于
commit 3f16a5c318392cbb5a0c7a3d19dff8c8ef3c38ee upstream. This warning can be triggered easily by userspace, so it should certainly not cause a panic if panic_on_warn is set. Reported-by: syzbot+c03f30b4f4c46bdf8575@syzkaller.appspotmail.com Suggested-by: NAlexander Potapenko <glider@google.com> Acked-by: NAlexander Potapenko <glider@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Guillaume Nault 提交于
[ Upstream commit 8a3dca632538c550930ce8bafa8c906b130d35cf ] When fixing the skb leak introduced by the conversion to rbtree, I forgot about the special case of duplicate fragments. The condition under the 'insert_error' label isn't effective anymore as nf_ct_frg6_gather() doesn't override the returned value anymore. So duplicate fragments now get NF_DROP verdict. To accept duplicate fragments again, handle them specially as soon as inet_frag_queue_insert() reports them. Return -EINPROGRESS which will translate to NF_STOLEN verdict, like any accepted fragment. However, such packets don't carry any new information and aren't queued, so we just drop them immediately. Fixes: a0d56cb911ca ("netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments") Signed-off-by: NGuillaume Nault <gnault@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Daniel Borkmann 提交于
[ Upstream commit fdadd04931c2d7cd294dc5b2b342863f94be53a3 ] Michael and Sandipan report: Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF JIT allocations. At compile time it defaults to PAGE_SIZE * 40000, and is adjusted again at init time if MODULES_VADDR is defined. For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with the compile-time default at boot-time, which is 0x9c400000 when using 64K page size. This overflows the signed 32-bit bpf_jit_limit value: root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit -1673527296 and can cause various unexpected failures throughout the network stack. In one case `strace dhclient eth0` reported: setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8}, 16) = -1 ENOTSUPP (Unknown error 524) and similar failures can be seen with tools like tcpdump. This doesn't always reproduce however, and I'm not sure why. The more consistent failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9 host would time out on systemd/netplan configuring a virtio-net NIC with no noticeable errors in the logs. Given this and also given that in near future some architectures like arm64 will have a custom area for BPF JIT image allocations we should get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For 4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec() so therefore add another overridable bpf_jit_alloc_exec_limit() helper function which returns the possible size of the memory area for deriving the default heuristic in bpf_jit_charge_init(). Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default JIT memory provider, and therefore in case archs implement their custom module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}. Additionally, for archs supporting large page sizes, we should change the sysctl to be handled as long to not run into sysctl restrictions in future. Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations") Reported-by: NSandipan Das <sandipan@linux.ibm.com> Reported-by: NMichael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Tested-by: NMichael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Colin Ian King 提交于
[ Upstream commit ea401685a20b5d631957f024bda86e1f6118eb20 ] Currently mskid is unsigned and hence comparisons with negative error return values are always false. Fix this by making mskid an int. Fixes: f058e46855dc ("net: hns: fix ICMP6 neighbor solicitation messages discard problem") Addresses-Coverity: ("Operands don't affect result") Signed-off-by: NColin Ian King <colin.king@canonical.com> Reviewed-by: NMukesh Ojha <mojha@codeaurora.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Guoqing Jiang 提交于
[ Upstream commit e00164a0f000de893944981f41a568c981aca658 ] err_spi is used when SERIAL_SC16IS7XX_SPI is enabled, so make the label only available under SERIAL_SC16IS7XX_SPI option. Otherwise, the below warning appears. drivers/tty/serial/sc16is7xx.c:1523:1: warning: label ‘err_spi’ defined but not used [-Wunused-label] err_spi: ^~~~~~~ Signed-off-by: NGuoqing Jiang <gqjiang@suse.com> Fixes: ac0cdb3d9901 ("sc16is7xx: missing unregister/delete driver on error in sc16is7xx_init()") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Guillaume Nault 提交于
[ Upstream commit a0d56cb911ca301de81735f1d73c2aab424654ba ] With commit 997dd9647164 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c"), nf_ct_frag6_reasm() is now called from nf_ct_frag6_queue(). With this change, nf_ct_frag6_queue() can fail after the skb has been added to the fragment queue and nf_ct_frag6_gather() was adapted to handle this case. But nf_ct_frag6_queue() can still fail before the fragment has been queued. nf_ct_frag6_gather() can't handle this case anymore, because it has no way to know if nf_ct_frag6_queue() queued the fragment before failing. If it didn't, the skb is lost as the error code is overwritten with -EINPROGRESS. Fix this by setting -EINPROGRESS directly in nf_ct_frag6_queue(), so that nf_ct_frag6_gather() can propagate the error as is. Fixes: 997dd9647164 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c") Signed-off-by: NGuillaume Nault <gnault@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Eric Dumazet 提交于
[ Upstream commit 47d3d7fdb10a21c223036b58bd70ffdc24a472c4 ] Since ip6frag_expire_frag_queue() now pulls the head skb from frag queue, we should no longer use skb_get(), since this leads to an skb leak. Stefan Bader initially reported a problem in 4.4.stable [1] caused by the skb_get(), so this patch should also fix this issue. 296583.091021] kernel BUG at /build/linux-6VmqmP/linux-4.4.0/net/core/skbuff.c:1207! [296583.091734] Call Trace: [296583.091749] [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 [296583.091764] [<ffffffff8183939a>] _decode_session6+0x26a/0x400 [296583.091779] [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 [296583.091795] [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 [296583.091809] [<ffffffff81824421>] icmp6_send+0x5e1/0x940 [296583.091823] [<ffffffff81753238>] ? __netif_receive_skb+0x18/0x60 [296583.091838] [<ffffffff817532b2>] ? netif_receive_skb_internal+0x32/0xa0 [296583.091858] [<ffffffffc0199f74>] ? ixgbe_clean_rx_irq+0x594/0xac0 [ixgbe] [296583.091876] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091893] [<ffffffff8183d431>] icmpv6_send+0x21/0x30 [296583.091906] [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 [296583.091921] [<ffffffffc04eb27f>] nf_ct_frag6_expire+0x1f/0x30 [nf_defrag_ipv6] [296583.091938] [<ffffffff810f3b57>] call_timer_fn+0x37/0x140 [296583.091951] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091968] [<ffffffff810f5464>] run_timer_softirq+0x234/0x330 [296583.091982] [<ffffffff8108a339>] __do_softirq+0x109/0x2b0 Fixes: d4289fcc9b16 ("net: IP6 defrag: use rbtrees for IPv6 defrag") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NStefan Bader <stefan.bader@canonical.com> Cc: Peter Oskolkov <posk@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David S. Miller 提交于
[ Upstream commit d84e7bc0595a7e146ad0ddb80b240cea77825245 ] >> net/rds/send.c:1109:42: warning: Using plain integer as NULL pointer Fixes: ea010070d0a7 ("net/rds: fix warn in rds_message_alloc_sgs") Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Takashi Iwai 提交于
[ Upstream commit 183ab39eb0ea9879bb68422a83e65f750f3192f0 ] The recent commit 98081ca62cba ("ALSA: hda - Record the current power state before suspend/resume calls") made the HD-audio driver to store the PM state in power_state field. This forgot, however, the initialization at power up. Although the codec drivers usually don't need to refer to this field in the normal operation, let's initialize it properly for consistency. Fixes: 98081ca62cba ("ALSA: hda - Record the current power state before suspend/resume calls") Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Salil Mehta 提交于
[ Upstream commit 4d96e13ee9cd1f7f801e8c7f4b12f09d1da4a5d8 ] This patch fixes the missing device reference release-after-use in the positive leg of the roce reset API of the HNS DSAF. Fixes: c969c6e7ab8c ("net: hns: Fix object reference leaks in hns_dsaf_roce_reset()") Reported-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NSalil Mehta <salil.mehta@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Kirill A. Shutemov 提交于
[ Upstream commit 45b13b424faafb81c8c44541f093a682fdabdefc ] RDMSR in the trampoline code overwrites EDX but that register is used to indicate whether 5-level paging has to be enabled and if clobbered, leads to failure to boot on a 5-level paging machine. Preserve EDX on the stack while we are dealing with EFER. Fixes: b677dfae5aa1 ("x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode") Reported-by: NKyle D Pelton <kyle.d.pelton@intel.com> Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: dave.hansen@linux.intel.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190206115253.1907-1-kirill.shutemov@linux.intel.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-
由 David Ahern 提交于
[ Upstream commit 15d55bae4e3c43cd9f87fd93c73a263e172d34e1 ] A recent commit returns an error if icmp is used as the ip-proto for IPv6 fib rules. Update fib_rule_tests to send ipv6-icmp instead of icmp. Fixes: 5e1a99eae8499 ("ipv4: Add ICMPv6 support when parse route ipproto") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Xiubo Li 提交于
[ Upstream commit 40d883b091758472c79b81fa1c0e0347e24a9cff ] Fixes: a94a2572b977 ("scsi: tcmu: avoid cmd/qfull timers updated whenever a new cmd comes") Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: NXiubo Li <xiubli@redhat.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Wei Yongjun 提交于
[ Upstream commit f2ffff085d287eec499f1fccd682796ad8010303 ] spin_lock_bh() is used in table_path_del() but rcu_read_unlock() is used for unlocking. Fix it by using spin_unlock_bh() instead of rcu_read_unlock() in the error handling case. Fixes: b4c3fbe63601 ("mac80211: Use linked list instead of rhashtable walk for mesh tables") Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Jaegeuk Kim 提交于
[ Upstream commit 7c77bf7de1574ac7a31a2b76f4927404307d13e7 ] This fixes wrong access of address spaces of node and meta inodes after iput. Fixes: 60aa4d5536ab ("f2fs: fix use-after-free issue when accessing sbi->stat_info") Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Noralf Trønnes 提交于
[ Upstream commit 6ab20a05f4c7ed45632e24d5397d6284e192567d ] It's now safe to let fbcon unbind automatically on fbdev unregister. The crash problem was fixed in commit 2122b40580dd ("fbdev: fbcon: Fix unregister crash when more than one framebuffer") Signed-off-by: NNoralf Trønnes <noralf@tronnes.org> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20190210131039.52664-13-noralf@tronnes.orgSigned-off-by: NSasha Levin <sashal@kernel.org>
-
由 Marek Szyprowski 提交于
[ Upstream commit 1e0d0a5fd38192f23304ea2fc2b531fea7c74247 ] Virtual MFC codec's child devices must not be assigned to platform bus, because they are allocated as raw 'struct device' and don't have the corresponding 'platform' part. This fixes NULL pointer access revealed recently by commit a66d972465d1 ("devres: Align data[] to ARCH_KMALLOC_MINALIGN"). Fixes: c79667dd ("media: s5p-mfc: replace custom reserved memory handling code with generic one") Reported-by: NPaweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Signed-off-by: NMarek Szyprowski <m.szyprowski@samsung.com> Tested-by: NPaweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Ursula Braun 提交于
[ Upstream commit f61bca58f6c36e666c2b807697f25e5e98708162 ] Commit <26d92e951fe0> ("net/smc: move unhash as early as possible in smc_release()") fixes one occurrence in the smc code, but the same pattern exists in other places. This patch covers the remaining occurrences and makes sure, the unhash operation is done before the smc->clcsock is released. This avoids a potential use-after-free in smc_diag_dump(). Reviewed-by: NKarsten Graul <kgraul@linux.ibm.com> Signed-off-by: NUrsula Braun <ubraun@linux.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Ido Schimmel 提交于
[ Upstream commit e149113a74c35f0a28d1bfe17d2505a03563c1d5 ] In commit 993107fea5ee ("mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl") I fixed a bug caused by the fact that the driver views differently the deletion of a VLAN device when it is deleted via an ioctl and netlink. Instead of relying on a specific order of events (device being unregistered vs. VLAN filter being updated), simply make sure that the driver performs the necessary cleanup when the VLAN device is unlinked, which always happens before the other two events. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Reviewed-by: NPetr Machata <petrm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Linus Torvalds 提交于
[ Upstream commit 423ea3255424b954947d167681b71ded1b8fca53 ] Make the forward declaration actually match the real function definition, something that previous versions of gcc had just ignored. This is another patch to fix new warnings from gcc-9 before I start the merge window pulls. I don't want to miss legitimate new warnings just because my system update brought a new compiler with new warnings. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Nikolay Borisov 提交于
commit debd1c065d2037919a7da67baf55cc683fee09f0 upstream. Recent FITRIM work, namely bbbf7243d62d ("btrfs: combine device update operations during transaction commit") combined the way certain operations are recoded in a transaction. As a result an ASSERT was added in dev_replace_finish to ensure the new code works correctly. Unfortunately I got reports that it's possible to trigger the assert, meaning that during a device replace it's possible to have an unfinished chunk allocation on the source device. This is supposed to be prevented by the fact that a transaction is committed before finishing the replace oepration and alter acquiring the chunk mutex. This is not sufficient since by the time the transaction is committed and the chunk mutex acquired it's possible to allocate a chunk depending on the workload being executed on the replaced device. This bug has been present ever since device replace was introduced but there was never code which checks for it. The correct way to fix is to ensure that there is no pending device modification operation when the chunk mutex is acquire and if there is repeat transaction commit. Unfortunately it's not possible to just exclude the source device from btrfs_fs_devices::dev_alloc_list since this causes ENOSPC to be hit in transaction commit. Fixing that in another way would need to add special cases to handle the last writes and forbid new ones. The looped transaction fix is more obvious, and can be easily backported. The runtime of dev-replace is long so there's no noticeable delay caused by that. Reported-by: NDavid Sterba <dsterba@suse.com> Fixes: 391cd9df ("Btrfs: fix unprotected alloc list insertion during the finishing procedure of replace") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Shakeel Butt 提交于
commit dffcac2cb88e4ec5906235d64a83d802580b119e upstream. In production we have noticed hard lockups on large machines running large jobs due to kswaps hoarding lru lock within isolate_lru_pages when sc->reclaim_idx is 0 which is a small zone. The lru was couple hundred GiBs and the condition (page_zonenum(page) > sc->reclaim_idx) in isolate_lru_pages() was basically skipping GiBs of pages while holding the LRU spinlock with interrupt disabled. On further inspection, it seems like there are two issues: (1) If kswapd on the return from balance_pgdat() could not sleep (i.e. node is still unbalanced), the classzone_idx is unintentionally set to 0 and the whole reclaim cycle of kswapd will try to reclaim only the lowest and smallest zone while traversing the whole memory. (2) Fundamentally isolate_lru_pages() is really bad when the allocation has woken kswapd for a smaller zone on a very large machine running very large jobs. It can hoard the LRU spinlock while skipping over 100s of GiBs of pages. This patch only fixes (1). (2) needs a more fundamental solution. To fix (1), in the kswapd context, if pgdat->kswapd_classzone_idx is invalid use the classzone_idx of the previous kswapd loop otherwise use the one the waker has requested. Link: http://lkml.kernel.org/r/20190701201847.251028-1-shakeelb@google.com Fixes: e716f2eb ("mm, vmscan: prevent kswapd sleeping prematurely due to mismatched classzone_idx") Signed-off-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Acked-by: NMel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hillf Danton <hdanton@sina.com> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Petr Mladek 提交于
commit d5b844a2cf507fc7642c9ae80a9d585db3065c28 upstream. The commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") causes a possible deadlock between register_kprobe() and ftrace_run_update_code() when ftrace is using stop_machine(). The existing dependency chain (in reverse order) is: -> #1 (text_mutex){+.+.}: validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 __mutex_lock+0x88/0x908 mutex_lock_nested+0x32/0x40 register_kprobe+0x254/0x658 init_kprobes+0x11a/0x168 do_one_initcall+0x70/0x318 kernel_init_freeable+0x456/0x508 kernel_init+0x22/0x150 ret_from_fork+0x30/0x34 kernel_thread_starter+0x0/0xc -> #0 (cpu_hotplug_lock.rw_sem){++++}: check_prev_add+0x90c/0xde0 validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 cpus_read_lock+0x62/0xd0 stop_machine+0x2e/0x60 arch_ftrace_update_code+0x2e/0x40 ftrace_run_update_code+0x40/0xa0 ftrace_startup+0xb2/0x168 register_ftrace_function+0x64/0x88 klp_patch_object+0x1a2/0x290 klp_enable_patch+0x554/0x980 do_one_initcall+0x70/0x318 do_init_module+0x6e/0x250 load_module+0x1782/0x1990 __s390x_sys_finit_module+0xaa/0xf0 system_call+0xd8/0x2d0 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); It is similar problem that has been solved by the commit 2d1e38f5 ("kprobes: Cure hotplug lock ordering issues"). Many locks are involved. To be on the safe side, text_mutex must become a low level lock taken after cpu_hotplug_lock.rw_sem. This can't be achieved easily with the current ftrace design. For example, arm calls set_all_modules_text_rw() already in ftrace_arch_code_modify_prepare(), see arch/arm/kernel/ftrace.c. This functions is called: + outside stop_machine() from ftrace_run_update_code() + without stop_machine() from ftrace_module_enable() Fortunately, the problematic fix is needed only on x86_64. It is the only architecture that calls set_all_modules_text_rw() in ftrace path and supports livepatching at the same time. Therefore it is enough to move text_mutex handling from the generic kernel/trace/ftrace.c into arch/x86/kernel/ftrace.c: ftrace_arch_code_modify_prepare() ftrace_arch_code_modify_post_process() This patch basically reverts the ftrace part of the problematic commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race"). And provides x86_64 specific-fix. Some refactoring of the ftrace code will be needed when livepatching is implemented for arm or nds32. These architectures call set_all_modules_text_rw() and use stop_machine() at the same time. Link: http://lkml.kernel.org/r/20190627081334.12793-1-pmladek@suse.com Fixes: 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") Acked-by: NThomas Gleixner <tglx@linutronix.de> Reported-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NPetr Mladek <pmladek@suse.com> [ As reviewed by Miroslav Benes <mbenes@suse.cz>, removed return value of ftrace_run_update_code() as it is a void function. ] Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Robert Beckett 提交于
commit 5aeab2bfc9ffa72d3ca73416635cb3785dfc076f upstream. The event will be sent as part of the vblank enable during the modeset if the crtc is not being kept disabled. Fixes: 5f2f9115 ("drm/imx: atomic phase 3 step 1: Use atomic configuration") Signed-off-by: NRobert Beckett <bob.beckett@collabora.com> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NPhilipp Zabel <p.zabel@pengutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Robert Beckett 提交于
commit 78c68e8f5cd24bd32ba4ca1cdfb0c30cf0642685 upstream. Notify drm core before sending pending events during crtc disable. This fixes the first event after disable having an old stale timestamp by having drm_crtc_vblank_off update the timestamp to now. This was seen while debugging weston log message: Warning: computed repaint delay is insane: -8212 msec This occurred due to: 1. driver starts up 2. fbcon comes along and restores fbdev, enabling vblank 3. vblank_disable_fn fires via timer disabling vblank, keeping vblank seq number and time set at current value (some time later) 4. weston starts and does a modeset 5. atomic commit disables crtc while it does the modeset 6. ipu_crtc_atomic_disable sends vblank with old seq number and time Fixes: a4744786 ("drm/imx: fix crtc vblank state regression") Signed-off-by: NRobert Beckett <bob.beckett@collabora.com> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NPhilipp Zabel <p.zabel@pengutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Lucas Stach 提交于
commit be132e1375c1fffe48801296279079f8a59a9ed3 upstream. When something goes wrong in the GPU init after the cmdbuf suballocator has been constructed, we fail to destroy it properly. This causes havok later when the GPU is unbound due to a module unload or similar. Fixes: e66774dd (drm/etnaviv: add cmdbuf suballocator) Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Tested-by: NRussell King <rmk+kernel@armlinux.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Alex Deucher 提交于
commit 25f09f858835b0e9a06213811031190a17d8ab78 upstream. Recommended by the hw team. Reviewed-and-Tested-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Evan Quan 提交于
commit f78c581e22d4b33359ac3462e8d0504735df01f4 upstream. Otherwise, you may get divided-by-zero error or corrput the SMU fan control feature. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Tested-by: NSlava Abramov <slava.abramov@amd.com> Acked-by: NSlava Abramov <slava.abramov@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Ard Biesheuvel 提交于
commit 6f496a555d93db7a11d4860b9220d904822f586a upstream. When KASLR and KASAN are both enabled, we keep the modules where they are, and randomize the placement of the kernel so it is within 2 GB of the module region. The reason for this is that putting modules in the vmalloc region (like we normally do when KASLR is enabled) is not possible in this case, given that the entire vmalloc region is already backed by KASAN zero shadow pages, and so allocating dedicated KASAN shadow space as required by loaded modules is not possible. The default module allocation window is set to [_etext - 128MB, _etext] in kaslr.c, which is appropriate for KASLR kernels booted without a seed or with 'nokaslr' on the command line. However, as it turns out, it is not quite correct for the KASAN case, since it still intersects the vmalloc region at the top, where attempts to allocate shadow pages will collide with the KASAN zero shadow pages, causing a WARN() and all kinds of other trouble. So cap the top end to MODULES_END explicitly when running with KASAN. Cc: <stable@vger.kernel.org> # 4.9+ Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Tested-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Joshua Scott 提交于
commit 80031361747aec92163464f2ee08870fec33bcb0 upstream. Switch to the "marvell,armada-38x-uart" driver variant to empty the UART buffer before writing to the UART_LCR register. Signed-off-by: NJoshua Scott <joshua.scott@alliedtelesis.co.nz> Tested-by: NAndrew Lunn <andrew@lunn.ch> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>. Cc: stable@vger.kernel.org Fixes: 43e28ba8 ("ARM: dts: Use armada-370-xp as a base for armada-xp-98dx3236") Signed-off-by: NGregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Eiichi Tsukata 提交于
commit 46cc0b44428d0f0e81f11ea98217fc0edfbeab07 upstream. Current snapshot implementation swaps two ring_buffers even though their sizes are different from each other, that can cause an inconsistency between the contents of buffer_size_kb file and the current buffer size. For example: # cat buffer_size_kb 7 (expanded: 1408) # echo 1 > events/enable # grep bytes per_cpu/cpu0/stats bytes: 1441020 # echo 1 > snapshot // current:1408, spare:1408 # echo 123 > buffer_size_kb // current:123, spare:1408 # echo 1 > snapshot // current:1408, spare:123 # grep bytes per_cpu/cpu0/stats bytes: 1443700 # cat buffer_size_kb 123 // != current:1408 And also, a similar per-cpu case hits the following WARNING: Reproducer: # echo 1 > per_cpu/cpu0/snapshot # echo 123 > buffer_size_kb # echo 1 > per_cpu/cpu0/snapshot WARNING: WARNING: CPU: 0 PID: 1946 at kernel/trace/trace.c:1607 update_max_tr_single.part.0+0x2b8/0x380 Modules linked in: CPU: 0 PID: 1946 Comm: bash Not tainted 5.2.0-rc6 #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 RIP: 0010:update_max_tr_single.part.0+0x2b8/0x380 Code: ff e8 dc da f9 ff 0f 0b e9 88 fe ff ff e8 d0 da f9 ff 44 89 ee bf f5 ff ff ff e8 33 dc f9 ff 41 83 fd f5 74 96 e8 b8 da f9 ff <0f> 0b eb 8d e8 af da f9 ff 0f 0b e9 bf fd ff ff e8 a3 da f9 ff 48 RSP: 0018:ffff888063e4fca0 EFLAGS: 00010093 RAX: ffff888066214380 RBX: ffffffff99850fe0 RCX: ffffffff964298a8 RDX: 0000000000000000 RSI: 00000000fffffff5 RDI: 0000000000000005 RBP: 1ffff1100c7c9f96 R08: ffff888066214380 R09: ffffed100c7c9f9b R10: ffffed100c7c9f9a R11: 0000000000000003 R12: 0000000000000000 R13: 00000000ffffffea R14: ffff888066214380 R15: ffffffff99851060 FS: 00007f9f8173c700(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000714dc0 CR3: 0000000066fa6000 CR4: 00000000000006f0 Call Trace: ? trace_array_printk_buf+0x140/0x140 ? __mutex_lock_slowpath+0x10/0x10 tracing_snapshot_write+0x4c8/0x7f0 ? trace_printk_init_buffers+0x60/0x60 ? selinux_file_permission+0x3b/0x540 ? tracer_preempt_off+0x38/0x506 ? trace_printk_init_buffers+0x60/0x60 __vfs_write+0x81/0x100 vfs_write+0x1e1/0x560 ksys_write+0x126/0x250 ? __ia32_sys_read+0xb0/0xb0 ? do_syscall_64+0x1f/0x390 do_syscall_64+0xc1/0x390 entry_SYSCALL_64_after_hwframe+0x49/0xbe This patch adds resize_buffer_duplicate_size() to check if there is a difference between current/spare buffer sizes and resize a spare buffer if necessary. Link: http://lkml.kernel.org/r/20190625012910.13109-1-devel@etsukata.com Cc: stable@vger.kernel.org Fixes: ad909e21 ("tracing: Add internal tracing_snapshot() functions") Signed-off-by: NEiichi Tsukata <devel@etsukata.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Eric Biggers 提交于
commit cbcfa130a911c613a1d9d921af2eea171c414172 upstream. When IOCB_CMD_POLL is used on a userfaultfd, aio_poll() disables IRQs and takes kioctx::ctx_lock, then userfaultfd_ctx::fd_wqh.lock. This may have to wait for userfaultfd_ctx::fd_wqh.lock to be released by userfaultfd_ctx_read(), which in turn can be waiting for userfaultfd_ctx::fault_pending_wqh.lock or userfaultfd_ctx::event_wqh.lock. But elsewhere the fault_pending_wqh and event_wqh locks are taken with IRQs enabled. Since the IRQ handler may take kioctx::ctx_lock, lockdep reports that a deadlock is possible. Fix it by always disabling IRQs when taking the fault_pending_wqh and event_wqh locks. Commit ae62c16e105a ("userfaultfd: disable irqs when taking the waitqueue lock") didn't fix this because it only accounted for the fd_wqh lock, not the other locks nested inside it. Link: http://lkml.kernel.org/r/20190627075004.21259-1-ebiggers@kernel.org Fixes: bfe4037e ("aio: implement IOCB_CMD_POLL") Signed-off-by: NEric Biggers <ebiggers@google.com> Reported-by: syzbot+fab6de82892b6b9c6191@syzkaller.appspotmail.com Reported-by: syzbot+53c0b767f7ca0dc0c451@syzkaller.appspotmail.com Reported-by: syzbot+a3accb352f9c22041cfa@syzkaller.appspotmail.com Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.19+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Herbert Xu 提交于
commit c8ea9fce2baf7b643384f36f29e4194fa40d33a6 upstream. Sometimes mpi_powm will leak karactx because a memory allocation failure causes a bail-out that skips the freeing of karactx. This patch moves the freeing of karactx to the end of the function like everything else so that it can't be skipped. Reported-by: syzbot+f7baccc38dcc1e094e77@syzkaller.appspotmail.com Fixes: cdec9cb5 ("crypto: GnuPG based MPI lib - source files...") Cc: <stable@vger.kernel.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Reviewed-by: NEric Biggers <ebiggers@kernel.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Dennis Wassenberg 提交于
commit bef33e19203dde434bcdf21c449e3fb4f06c2618 upstream. On M710q Lenovo ThinkCentre machine, there are two front mics, we change the location for one of them to avoid conflicts. Signed-off-by: NDennis Wassenberg <dennis.wassenberg@secunet.com> Cc: <stable@vger.kernel.org> Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Richard Sailer 提交于
commit 503d90b30602a3295978e46d844ccc8167400fe6 upstream. This adds 4 SND_PCI_QUIRK(...) lines for several barebone models of the ODM Clevo. The model names are written in regex syntax to describe/match all clevo models that are similar enough and use the same PCI SSID that this fixup works for them. Additionally the lines regarding SSID 0x96e1 and 0x97e1 didn't fix audio for the all our Clevo notebooks using these SSIDs (models Clevo P960* and P970*) since ALC1220_FIXP_CLEVO_PB51ED_PINS swapped pins that are not necesarry to be swapped. This patch initiates ALC1220_FIXUP_CLEVO_P950 instead for these model and fixes the audio. Fixes: 80690a276f44 ("ALSA: hda/realtek - Add quirk for Tuxedo XC 1509") Signed-off-by: NRichard Sailer <rs@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Colin Ian King 提交于
commit 2acf5a3e6e9371e63c9e4ff54d84d08f630467a0 upstream. There are a couple of left shifts of unsigned 8 bit values that first get promoted to signed ints and hence get sign extended on the shift if the top bit of the 8 bit values are set. Fix this by casting the 8 bit values to unsigned ints to stop the unintentional sign extension. Addresses-Coverity: ("Unintended sign extension") Signed-off-by: NColin Ian King <colin.king@canonical.com> Cc: <stable@vger.kernel.org> Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Takashi Iwai 提交于
commit 3450121997ce872eb7f1248417225827ea249710 upstream. LINE6 drivers allocate the buffers based on the value returned from usb_maxpacket() calls. The manipulated device may return zero for this, and this results in the kmalloc() with zero size (and it may succeed) while the other part of the driver code writes the packet data with the fixed size -- which eventually overwrites. This patch adds a simple sanity check for the invalid buffer size for avoiding that problem. Reported-by: syzbot+219f00fb49874dcaea17@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Takashi Sakamoto 提交于
commit 7fbd1753b64eafe21cf842348a40a691d0dee440 upstream. In IEC 61883-6, 8 MIDI data streams are multiplexed into single MIDI conformant data channel. The index of stream is calculated by modulo 8 of the value of data block counter. In fireworks, the value of data block counter in CIP header has a quirk with firmware version v5.0.0, v5.7.3 and v5.8.0. This brings ALSA IEC 61883-1/6 packet streaming engine to miss detection of MIDI messages. This commit fixes the miss detection to modify the value of data block counter for the modulo calculation. For maintainers, this bug exists since a commit 18f5ed36 ("ALSA: fireworks/firewire-lib: add support for recent firmware quirk") in Linux kernel v4.2. There're many changes since the commit. This fix can be backported to Linux kernel v4.4 or later. I tagged a base commit to the backport for your convenience. Besides, my work for Linux kernel v5.3 brings heavy code refactoring and some structure members are renamed in 'sound/firewire/amdtp-stream.h'. The content of this patch brings conflict when merging -rc tree with this patch and the latest tree. I request maintainers to solve the conflict to replace 'tx_first_dbc' with 'ctx_data.tx.first_dbc'. Fixes: df075fee ("ALSA: firewire-lib: complete AM824 data block processing layer") Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: NTakashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Colin Ian King 提交于
commit c3ea60c231446663afd6ea1054da6b7f830855ca upstream. There are two occurrances of a call to snd_seq_oss_fill_addr where the dest_client and dest_port arguments are in the wrong order. Fix this by swapping them around. Addresses-Coverity: ("Arguments in wrong order") Signed-off-by: NColin Ian King <colin.king@canonical.com> Cc: <stable@vger.kernel.org> Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-