- 19 10月, 2021 40 次提交
-
-
由 Eric Biggers 提交于
stable inclusion from stable-5.10.63 commit b8c298cf57dcb5b18855f11437199fd0eb1ea388 bugzilla: 182231 https://gitee.com/openeuler/kernel/issues/I4EFS1 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b8c298cf57dcb5b18855f11437199fd0eb1ea388 -------------------------------- commit d1876056 upstream. Add a helper function fscrypt_symlink_getattr() which will be called from the various filesystems' ->getattr() methods to read and decrypt the target of encrypted symlinks in order to report the correct st_size. Detailed explanation: As required by POSIX and as documented in various man pages, st_size for a symlink is supposed to be the length of the symlink target. Unfortunately, st_size has always been wrong for encrypted symlinks because st_size is populated from i_size from disk, which intentionally contains the length of the encrypted symlink target. That's slightly greater than the length of the decrypted symlink target (which is the symlink target that userspace usually sees), and usually won't match the length of the no-key encoded symlink target either. This hadn't been fixed yet because reporting the correct st_size would require reading the symlink target from disk and decrypting or encoding it, which historically has been considered too heavyweight to do in ->getattr(). Also historically, the wrong st_size had only broken a test (LTP lstat03) and there were no known complaints from real users. (This is probably because the st_size of symlinks isn't used too often, and when it is, typically it's for a hint for what buffer size to pass to readlink() -- which a slightly-too-large size still works for.) However, a couple things have changed now. First, there have recently been complaints about the current behavior from real users: - Breakage in rpmbuild: https://github.com/rpm-software-management/rpm/issues/1682 https://github.com/google/fscrypt/issues/305 - Breakage in toybox cpio: https://www.mail-archive.com/toybox@lists.landley.net/msg07193.html - Breakage in libgit2: https://issuetracker.google.com/issues/189629152 (on Android public issue tracker, requires login) Second, we now cache decrypted symlink targets in ->i_link. Therefore, taking the performance hit of reading and decrypting the symlink target in ->getattr() wouldn't be as big a deal as it used to be, since usually it will just save having to do the same thing later. Also note that eCryptfs ended up having to read and decrypt symlink targets in ->getattr() as well, to fix this same issue; see commit 3a60a168 ("eCryptfs: Decrypt symlink target for stat size"). So, let's just bite the bullet, and read and decrypt the symlink target in ->getattr() in order to report the correct st_size. Add a function fscrypt_symlink_getattr() which the filesystems will call to do this. (Alternatively, we could store the decrypted size of symlinks on-disk. But there isn't a great place to do so, and encryption is meant to hide the original size to some extent; that property would be lost.) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210702065350.209646-2-ebiggers@kernel.orgSigned-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yonghong Song 提交于
mainline inclusion from mainline-5.10.62 commit 0c9a876f2897c64d21a5a414a9007a10cd015a9e bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0c9a876f2897c64d21a5a414a9007a10cd015a9e -------------------------------- commit a2baf4e8 upstream. Commit b910eaaa ("bpf: Fix NULL pointer dereference in bpf_get_local_storage() helper") fixed a bug for bpf_get_local_storage() helper so different tasks won't mess up with each other's percpu local storage. The percpu data contains 8 slots so it can hold up to 8 contexts (same or different tasks), for 8 different program runs, at the same time. This in general is sufficient. But our internal testing showed the following warning multiple times: [...] warning: WARNING: CPU: 13 PID: 41661 at include/linux/bpf-cgroup.h:193 __cgroup_bpf_run_filter_sock_ops+0x13e/0x180 RIP: 0010:__cgroup_bpf_run_filter_sock_ops+0x13e/0x180 <IRQ> tcp_call_bpf.constprop.99+0x93/0xc0 tcp_conn_request+0x41e/0xa50 ? tcp_rcv_state_process+0x203/0xe00 tcp_rcv_state_process+0x203/0xe00 ? sk_filter_trim_cap+0xbc/0x210 ? tcp_v6_inbound_md5_hash.constprop.41+0x44/0x160 tcp_v6_do_rcv+0x181/0x3e0 tcp_v6_rcv+0xc65/0xcb0 ip6_protocol_deliver_rcu+0xbd/0x450 ip6_input_finish+0x11/0x20 ip6_input+0xb5/0xc0 ip6_sublist_rcv_finish+0x37/0x50 ip6_sublist_rcv+0x1dc/0x270 ipv6_list_rcv+0x113/0x140 __netif_receive_skb_list_core+0x1a0/0x210 netif_receive_skb_list_internal+0x186/0x2a0 gro_normal_list.part.170+0x19/0x40 napi_complete_done+0x65/0x150 mlx5e_napi_poll+0x1ae/0x680 __napi_poll+0x25/0x120 net_rx_action+0x11e/0x280 __do_softirq+0xbb/0x271 irq_exit_rcu+0x97/0xa0 common_interrupt+0x7f/0xa0 </IRQ> asm_common_interrupt+0x1e/0x40 RIP: 0010:bpf_prog_1835a9241238291a_tw_egress+0x5/0xbac ? __cgroup_bpf_run_filter_skb+0x378/0x4e0 ? do_softirq+0x34/0x70 ? ip6_finish_output2+0x266/0x590 ? ip6_finish_output+0x66/0xa0 ? ip6_output+0x6c/0x130 ? ip6_xmit+0x279/0x550 ? ip6_dst_check+0x61/0xd0 [...] Using drgn [0] to dump the percpu buffer contents showed that on this CPU slot 0 is still available, but slots 1-7 are occupied and those tasks in slots 1-7 mostly don't exist any more. So we might have issues in bpf_cgroup_storage_unset(). Further debugging confirmed that there is a bug in bpf_cgroup_storage_unset(). Currently, it tries to unset "current" slot with searching from the start. So the following sequence is possible: 1. A task is running and claims slot 0 2. Running BPF program is done, and it checked slot 0 has the "task" and ready to reset it to NULL (not yet). 3. An interrupt happens, another BPF program runs and it claims slot 1 with the *same* task. 4. The unset() in interrupt context releases slot 0 since it matches "task". 5. Interrupt is done, the task in process context reset slot 0. At the end, slot 1 is not reset and the same process can continue to occupy slots 2-7 and finally, when the above step 1-5 is repeated again, step 3 BPF program won't be able to claim an empty slot and a warning will be issued. To fix the issue, for unset() function, we should traverse from the last slot to the first. This way, the above issue can be avoided. The same reverse traversal should also be done in bpf_get_local_storage() helper itself. Otherwise, incorrect local storage may be returned to BPF program. [0] https://github.com/osandov/drgn Fixes: b910eaaa ("bpf: Fix NULL pointer dereference in bpf_get_local_storage() helper") Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210810010413.1976277-1-yhs@fb.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Richard Guy Briggs 提交于
mainline inclusion from mainline-5.10.62 commit 38c1915d3e9f457006c4b2ef5eaab038c34a95a2 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=38c1915d3e9f457006c4b2ef5eaab038c34a95a2 -------------------------------- commit 67d69e9d upstream. AUDIT_TRIM is expected to be idempotent, but multiple executions resulted in a refcount underflow and use-after-free. git bisect fingered commit fb041bb7 ("locking/refcount: Consolidate implementations of refcount_t") but this patch with its more thorough checking that wasn't in the x86 assembly code merely exposed a previously existing tree refcount imbalance in the case of tree trimming code that was refactored with prune_one() to remove a tree introduced in commit 8432c700 ("audit: Simplify locking around untag_chunk()") Move the put_tree() to cover only the prune_one() case. Passes audit-testsuite and 3 passes of "auditctl -t" with at least one directory watch. Cc: Jan Kara <jack@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Seiji Nishikawa <snishika@redhat.com> Cc: stable@vger.kernel.org Fixes: 8432c700 ("audit: Simplify locking around untag_chunk()") Signed-off-by: NRichard Guy Briggs <rgb@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz> [PM: reformatted/cleaned-up the commit description] Signed-off-by: NPaul Moore <paul@paul-moore.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Collingbourne 提交于
mainline inclusion from mainline-5.10.62 commit 1890ee7ff87fc48806c37f3543e025e2252ac2e3 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1890ee7ff87fc48806c37f3543e025e2252ac2e3 -------------------------------- commit d0efb162 upstream. A common implementation of isatty(3) involves calling a ioctl passing a dummy struct argument and checking whether the syscall failed -- bionic and glibc use TCGETS (passing a struct termios), and musl uses TIOCGWINSZ (passing a struct winsize). If the FD is a socket, we will copy sizeof(struct ifreq) bytes of data from the argument and return -EFAULT if that fails. The result is that the isatty implementations may return a non-POSIX-compliant value in errno in the case where part of the dummy struct argument is inaccessible, as both struct termios and struct winsize are smaller than struct ifreq (at least on arm64). Although there is usually enough stack space following the argument on the stack that this did not present a practical problem up to now, with MTE stack instrumentation it's more likely for the copy to fail, as the memory following the struct may have a different tag. Fix the problem by adding an early check for whether the ioctl is a valid socket ioctl, and return -ENOTTY if it isn't. Fixes: 44c02a2c ("dev_ioctl(): move copyin/copyout to callers") Link: https://linux-review.googlesource.com/id/I869da6cf6daabc3e4b7b82ac979683ba05e27d4dSigned-off-by: NPeter Collingbourne <pcc@google.com> Cc: <stable@vger.kernel.org> # 4.19 Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Helge Deller 提交于
mainline inclusion from mainline-5.10.62 commit 0085646e02b2423e20d0a9ea3fe64d8270255b23 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0085646e02b2423e20d0a9ea3fe64d8270255b23 -------------------------------- commit f6a3308d upstream. This reverts commit 83af58f8. It turns out that at least the assembly implementation for strncpy() was buggy. Revert the whole commit and return back to the default coding. Signed-off-by: NHelge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v5.4+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Denis Efremov 提交于
mainline inclusion from mainline-5.10.62 commit 17982c664f8b2c9543241552331b008ad34d2648 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=17982c664f8b2c9543241552331b008ad34d2648 -------------------------------- commit c7e9d002 upstream. The patch breaks userspace implementations (e.g. fdutils) and introduces regressions in behaviour. Previously, it was possible to O_NDELAY open a floppy device with no media inserted or with write protected media without an error. Some userspace tools use this particular behavior for probing. It's not the first time when we revert this patch. Previous revert is in commit f2791e7e (Revert "floppy: refactor open() flags handling"). This reverts commit 8a0c014c. Link: https://lore.kernel.org/linux-block/de10cb47-34d1-5a88-7751-225ca380f735@compro.net/Reported-by: NMark Hounschell <markh@compro.net> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Wim Osterholt <wim@djo.tudelft.nl> Cc: Kurt Garloff <kurt@garloff.de> Cc: <stable@vger.kernel.org> Signed-off-by: NDenis Efremov <efremov@linux.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Petr Vorel 提交于
mainline inclusion from mainline-5.10.62 commit 1604c42a1ca93469dc5c726d7768fdc97105e87f bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1604c42a1ca93469dc5c726d7768fdc97105e87f -------------------------------- commit f890f89d upstream. Reserve GPIO pins 85-88 as these aren't meant to be accessible from the application CPUs (causes reboot). Yet another fix similar to 91345867, 5f8d3ab1, which is needed to allow angler to boot after 3edfb7bd ("gpiolib: Show correct direction from the beginning"). Fixes: feeaf56a ("arm64: dts: msm8994 SoC and Huawei Angler (Nexus 6P) support") Signed-off-by: NPetr Vorel <petr.vorel@gmail.com> Reviewed-by: NKonrad Dybcio <konrad.dybcio@somainline.org> Link: https://lore.kernel.org/r/20210415193913.1836153-1-petr.vorel@gmail.comSigned-off-by: NBjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kees Cook 提交于
mainline inclusion from mainline-5.10.62 commit f760c1101f5284acdf3a8132dff257838b9bf75c bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f760c1101f5284acdf3a8132dff257838b9bf75c -------------------------------- commit f123c42b upstream Where feasible, I prefer to have all tests visible on all architectures, but to have them wired to XFAIL. DOUBLE_FAIL was set up to XFAIL, but wasn't actually being added to the test list. Fixes: cea23efb ("lkdtm/bugs: Make double-fault test always available") Cc: stable@vger.kernel.org Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210623203936.3151093-7-keescook@chromium.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> [sudip: adjust context] Signed-off-by: NSudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 DENG Qingfang 提交于
mainline inclusion from mainline-5.10.62 commit b6c657abb893ef456fad6b229089f37127202df0 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b6c657abb893ef456fad6b229089f37127202df0 -------------------------------- commit 7428022b upstream. When a port leaves a VLAN-aware bridge, the current code does not clear other ports' matrix field bit. If the bridge is later set to VLAN-unaware mode, traffic in the bridge may leak to that port. Remove the VLAN filtering check in mt7530_port_bridge_leave. Fixes: 474a2dda ("net: dsa: mt7530: fix VLAN traffic leaks") Fixes: 83163f7d ("net: dsa: mediatek: add VLAN support for MT7530") Signed-off-by: NDENG Qingfang <dqfext@gmail.com> Reviewed-by: NVladimir Oltean <olteanv@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Bjorn Andersson 提交于
mainline inclusion from mainline-5.10.62 commit f8242f554c82ad7f91b73d3834e81335af9ad600 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f8242f554c82ad7f91b73d3834e81335af9ad600 -------------------------------- commit 8c9b3caa upstream. It's possible that the interrupt handler for the UCSI driver signals a connector changes after the handler clears the PENDING bit, but before it has sent the acknowledge request. The result is that the handler is invoked yet again, to ack the same connector change. At least some versions of the Qualcomm UCSI firmware will not handle the second - "spurious" - acknowledgment gracefully. So make sure to not clear the pending flag until the change is acknowledged. Any connector changes coming in after the acknowledgment, that would have the pending flag incorrectly cleared, would afaict be covered by the subsequent connector status check. Fixes: 217504a0 ("usb: typec: ucsi: Work around PPM losing change information") Cc: stable <stable@vger.kernel.org> Reviewed-by: NHeikki Krogerus <heikki.krogerus@linux.intel.com> Acked-By: NBenjamin Berg <bberg@redhat.com> Signed-off-by: NBjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210516040953.622409-1-bjorn.andersson@linaro.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Benjamin Berg 提交于
mainline inclusion from mainline-5.10.62 commit e15e32d519fa9e0fb157fafeedb603283fef26e8 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e15e32d519fa9e0fb157fafeedb603283fef26e8 -------------------------------- commit 217504a0 upstream. Some/many PPMs are simply clearing the change bitfield when a notification on a port is acknowledge. Unfortunately, doing so means that any changes between the GET_CONNECTOR_STATUS and ACK_CC_CI commands is simply lost. Work around this by re-fetching the connector status afterwards. We can then infer any changes that we see have happened but that may not be respresented in the change bitfield. We end up with the following actions: 1. UCSI_GET_CONNECTOR_STATUS, store result, update unprocessed_changes 2. UCSI_GET_CAM_SUPPORTED, discard result 3. ACK connector change 4. UCSI_GET_CONNECTOR_STATUS, store result 5. Infere lost changes by comparing UCSI_GET_CONNECTOR_STATUS results 6. If PPM reported a new change, then restart in order to ACK 7. Process everything as usual. The worker is also changed to re-schedule itself if a new change notification happened while it was running. Doing this fixes quite commonly occurring issues where e.g. the UCSI power supply would remain online even thought the ThunderBolt cable was unplugged. Cc: Hans de Goede <hdegoede@redhat.com> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: NHeikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: NBenjamin Berg <bberg@redhat.com> Link: https://lore.kernel.org/r/20201009144047.505957-3-benjamin@sipsolutions.netSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Benjamin Berg 提交于
mainline inclusion from mainline-5.10.62 commit 08953884aad457dca2670b94e10077a506f3be3d bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08953884aad457dca2670b94e10077a506f3be3d -------------------------------- commit 47ea2929 upstream. Normal commands may be reporting that a connector has changed. Always call the usci_connector_change handler and let it take care of scheduling the work when needed. Doing this makes the ACPI code path identical to the CCG one. Cc: Hans de Goede <hdegoede@redhat.com> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: NHeikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: NBenjamin Berg <bberg@redhat.com> Link: https://lore.kernel.org/r/20201009144047.505957-2-benjamin@sipsolutions.netSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Mathieu Desnoyers 提交于
mainline inclusion from mainline-5.10.62 commit 9a4f1dc8a17c8606ab0506cce63e554253677a53 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9a4f1dc8a17c8606ab0506cce63e554253677a53 -------------------------------- commit 7b40066c upstream. State transitions from 1->0->1 and N->2->1 callbacks require RCU synchronization. Rather than performing the RCU synchronization every time the state change occurs, which is quite slow when many tracepoints are registered in batch, instead keep a snapshot of the RCU state on the most recent transitions which belong to a chain, and conditionally wait for a grace period on the last transition of the chain if one g.p. has not elapsed since the last snapshot. This applies to both RCU and SRCU. This brings the performance regression caused by commit 231264d6 ("Fix: tracepoint: static call function vs data state mismatch") back to what it was originally. Before this commit: # trace-cmd start -e all # time trace-cmd start -p nop real 0m10.593s user 0m0.017s sys 0m0.259s After this commit: # trace-cmd start -e all # time trace-cmd start -p nop real 0m0.878s user 0m0.000s sys 0m0.103s Link: https://lkml.kernel.org/r/20210805192954.30688-1-mathieu.desnoyers@efficios.com Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/ Cc: stable@vger.kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Stefan Metzmacher <metze@samba.org> Fixes: 231264d6 ("Fix: tracepoint: static call function vs data state mismatch") Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Paul E. McKenney 提交于
mainline inclusion from mainline-5.10.62 commit b6ae3854075e67a2764e30447f8603ef964aecc5 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b6ae3854075e67a2764e30447f8603ef964aecc5 -------------------------------- commit 8b5bd67c upstream. There is a need for a polling interface for SRCU grace periods, so this commit supplies get_state_synchronize_srcu(), start_poll_synchronize_srcu(), and poll_state_synchronize_srcu() for this purpose. The first can be used if future grace periods are inevitable (perhaps due to a later call_srcu() invocation), the second if future grace periods might not otherwise happen, and the third to check if a grace period has elapsed since the corresponding call to either of the first two. As with get_state_synchronize_rcu() and cond_synchronize_rcu(), the return value from either get_state_synchronize_srcu() or start_poll_synchronize_srcu() must be passed in to a later call to poll_state_synchronize_srcu(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/Reported-by: NKent Overstreet <kent.overstreet@gmail.com> [ paulmck: Add EXPORT_SYMBOL_GPL() per kernel test robot feedback. ] [ paulmck: Apply feedback from Neeraj Upadhyay. ] Link: https://lore.kernel.org/lkml/20201117004017.GA7444@paulmck-ThinkPad-P72/Reviewed-by: NNeeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Paul E. McKenney 提交于
mainline inclusion from mainline-5.10.62 commit 450948b06ce8ba3e22df094ef324dbf5fa52f050 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=450948b06ce8ba3e22df094ef324dbf5fa52f050 -------------------------------- commit 74612a07 upstream. There is a need for a polling interface for SRCU grace periods. This polling needs to distinguish between an SRCU instance being idle on the one hand or in the middle of a grace period on the other. This commit therefore converts the Tiny SRCU srcu_struct structure's srcu_idx from a defacto boolean to a free-running counter, using the bottom bit to indicate that a grace period is in progress. The second-from-bottom bit is thus used as the index returned by srcu_read_lock(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/Reported-by: NKent Overstreet <kent.overstreet@gmail.com> [ paulmck: Fix ->srcu_lock_nesting[] indexing per Neeraj Upadhyay. ] Reviewed-by: NNeeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Paul E. McKenney 提交于
mainline inclusion from mainline-5.10.62 commit 641e1d88404a57983c75cc39b9feb076f5cdeac8 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=641e1d88404a57983c75cc39b9feb076f5cdeac8 -------------------------------- commit 1a893c71 upstream. There is a need for a polling interface for SRCU grace periods. This polling needs to initiate an SRCU grace period without having to queue (and manage) a callback. This commit therefore splits the Tiny SRCU call_srcu() function into callback-queuing and start-grace-period portions, with the latter in a new function named srcu_gp_start_if_needed(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/Reported-by: NKent Overstreet <kent.overstreet@gmail.com> Reviewed-by: NNeeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Paul E. McKenney 提交于
mainline inclusion from mainline-5.10.62 commit f789de3be808a0492a09cd7ad44cb017203572c6 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f789de3be808a0492a09cd7ad44cb017203572c6 -------------------------------- commit 5358c9fa upstream. There is a need for a polling interface for SRCU grace periods, so this commit supplies get_state_synchronize_srcu(), start_poll_synchronize_srcu(), and poll_state_synchronize_srcu() for this purpose. The first can be used if future grace periods are inevitable (perhaps due to a later call_srcu() invocation), the second if future grace periods might not otherwise happen, and the third to check if a grace period has elapsed since the corresponding call to either of the first two. As with get_state_synchronize_rcu() and cond_synchronize_rcu(), the return value from either get_state_synchronize_srcu() or start_poll_synchronize_srcu() must be passed in to a later call to poll_state_synchronize_srcu(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/Reported-by: NKent Overstreet <kent.overstreet@gmail.com> [ paulmck: Add EXPORT_SYMBOL_GPL() per kernel test robot feedback. ] [ paulmck: Apply feedback from Neeraj Upadhyay. ] Link: https://lore.kernel.org/lkml/20201117004017.GA7444@paulmck-ThinkPad-P72/Reviewed-by: NNeeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Paul E. McKenney 提交于
mainline inclusion from mainline-5.10.62 commit fdf66e5a7fc87d3213c21eef4472b71d54fdf736 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fdf66e5a7fc87d3213c21eef4472b71d54fdf736 -------------------------------- commit 29d2bb94 upstream. There is a need for a polling interface for SRCU grace periods. This polling needs to initiate an SRCU grace period without having to queue (and manage) a callback. This commit therefore splits the Tree SRCU __call_srcu() function into callback-initialization and queuing/start-grace-period portions, with the latter in a new function named srcu_gp_start_if_needed(). This function may be passed a NULL callback pointer, in which case it will refrain from queuing anything. Why have the new function mess with queuing? Locking considerations, of course! Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/Reported-by: NKent Overstreet <kent.overstreet@gmail.com> Reviewed-by: NNeeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Guo Ren 提交于
mainline inclusion from mainline-5.10.62 commit 133d7f93eecd23b003fc2dc58302ec7de723cf72 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=133d7f93eecd23b003fc2dc58302ec7de723cf72 -------------------------------- commit 5ad84adf upstream. Just like arm64, we can't trace the function in the patch_text path. Here is the bug log: [ 45.234334] Unable to handle kernel paging request at virtual address ffffffd38ae80900 [ 45.242313] Oops [#1] [ 45.244600] Modules linked in: [ 45.247678] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.9.0-00025-g9b7db83-dirty #215 [ 45.255797] epc: ffffffe00021689a ra : ffffffe00021718e sp : ffffffe01afabb58 [ 45.262955] gp : ffffffe00136afa0 tp : ffffffe01af94d00 t0 : 0000000000000002 [ 45.270200] t1 : 0000000000000000 t2 : 0000000000000001 s0 : ffffffe01afabc08 [ 45.277443] s1 : ffffffe0013718a8 a0 : 0000000000000000 a1 : ffffffe01afabba8 [ 45.284686] a2 : 0000000000000000 a3 : 0000000000000000 a4 : c4c16ad38ae80900 [ 45.291929] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000052464e43 [ 45.299173] s2 : 0000000000000001 s3 : ffffffe000206a60 s4 : ffffffe000206a60 [ 45.306415] s5 : 00000000000009ec s6 : ffffffe0013718a8 s7 : c4c16ad38ae80900 [ 45.313658] s8 : 0000000000000004 s9 : 0000000000000001 s10: 0000000000000001 [ 45.320902] s11: 0000000000000003 t3 : 0000000000000001 t4 : ffffffffd192fe79 [ 45.328144] t5 : ffffffffb8f80000 t6 : 0000000000040000 [ 45.333472] status: 0000000200000100 badaddr: ffffffd38ae80900 cause: 000000000000000f [ 45.341514] ---[ end trace d95102172248fdcf ]--- [ 45.346176] note: migration/0[11] exited with preempt_count 1 (gdb) x /2i $pc => 0xffffffe00021689a <__do_proc_dointvec+196>: sd zero,0(s7) 0xffffffe00021689e <__do_proc_dointvec+200>: li s11,0 (gdb) bt 0 __do_proc_dointvec (tbl_data=0x0, table=0xffffffe01afabba8, write=0, buffer=0x0, lenp=0x7bf897061f9a0800, ppos=0x4, conv=0x0, data=0x52464e43) at kernel/sysctl.c:581 1 0xffffffe00021718e in do_proc_dointvec (data=<optimized out>, conv=<optimized out>, ppos=<optimized out>, lenp=<optimized out>, buffer=<optimized out>, write=<optimized out>, table=<optimized out>) at kernel/sysctl.c:964 2 proc_dointvec_minmax (ppos=<optimized out>, lenp=<optimized out>, buffer=<optimized out>, write=<optimized out>, table=<optimized out>) at kernel/sysctl.c:964 3 proc_do_static_key (table=<optimized out>, write=1, buffer=0x0, lenp=0x0, ppos=0x7bf897061f9a0800) at kernel/sysctl.c:1643 4 0xffffffe000206792 in ftrace_make_call (rec=<optimized out>, addr=<optimized out>) at arch/riscv/kernel/ftrace.c:109 5 0xffffffe0002c9c04 in __ftrace_replace_code (rec=0xffffffe01ae40c30, enable=3) at kernel/trace/ftrace.c:2503 6 0xffffffe0002ca0b2 in ftrace_replace_code (mod_flags=<optimized out>) at kernel/trace/ftrace.c:2530 7 0xffffffe0002ca26a in ftrace_modify_all_code (command=5) at kernel/trace/ftrace.c:2677 8 0xffffffe0002ca30e in __ftrace_modify_code (data=<optimized out>) at kernel/trace/ftrace.c:2703 9 0xffffffe0002c13b0 in multi_cpu_stop (data=0x0) at kernel/stop_machine.c:224 10 0xffffffe0002c0fde in cpu_stopper_thread (cpu=<optimized out>) at kernel/stop_machine.c:491 11 0xffffffe0002343de in smpboot_thread_fn (data=0x0) at kernel/smpboot.c:165 12 0xffffffe00022f8b4 in kthread (_create=0xffffffe01af0c040) at kernel/kthread.c:292 13 0xffffffe000201fac in handle_exception () at arch/riscv/kernel/entry.S:236 0xffffffe00020678a <+114>: auipc ra,0xffffe 0xffffffe00020678e <+118>: jalr -118(ra) # 0xffffffe000204714 <patch_text_nosync> 0xffffffe000206792 <+122>: snez a0,a0 (gdb) disassemble patch_text_nosync Dump of assembler code for function patch_text_nosync: 0xffffffe000204714 <+0>: addi sp,sp,-32 0xffffffe000204716 <+2>: sd s0,16(sp) 0xffffffe000204718 <+4>: sd ra,24(sp) 0xffffffe00020471a <+6>: addi s0,sp,32 0xffffffe00020471c <+8>: auipc ra,0x0 0xffffffe000204720 <+12>: jalr -384(ra) # 0xffffffe00020459c <patch_insn_write> 0xffffffe000204724 <+16>: beqz a0,0xffffffe00020472e <patch_text_nosync+26> 0xffffffe000204726 <+18>: ld ra,24(sp) 0xffffffe000204728 <+20>: ld s0,16(sp) 0xffffffe00020472a <+22>: addi sp,sp,32 0xffffffe00020472c <+24>: ret 0xffffffe00020472e <+26>: sd a0,-24(s0) 0xffffffe000204732 <+30>: auipc ra,0x4 0xffffffe000204736 <+34>: jalr -1464(ra) # 0xffffffe00020817a <flush_icache_all> 0xffffffe00020473a <+38>: ld a0,-24(s0) 0xffffffe00020473e <+42>: ld ra,24(sp) 0xffffffe000204740 <+44>: ld s0,16(sp) 0xffffffe000204742 <+46>: addi sp,sp,32 0xffffffe000204744 <+48>: ret (gdb) disassemble flush_icache_all-4 Dump of assembler code for function flush_icache_all: 0xffffffe00020817a <+0>: addi sp,sp,-8 0xffffffe00020817c <+2>: sd ra,0(sp) 0xffffffe00020817e <+4>: auipc ra,0xfffff 0xffffffe000208182 <+8>: jalr -1822(ra) # 0xffffffe000206a60 <ftrace_caller> 0xffffffe000208186 <+12>: ld ra,0(sp) 0xffffffe000208188 <+14>: addi sp,sp,8 0xffffffe00020818a <+0>: addi sp,sp,-16 0xffffffe00020818c <+2>: sd s0,0(sp) 0xffffffe00020818e <+4>: sd ra,8(sp) 0xffffffe000208190 <+6>: addi s0,sp,16 0xffffffe000208192 <+8>: li a0,0 0xffffffe000208194 <+10>: auipc ra,0xfffff 0xffffffe000208198 <+14>: jalr -410(ra) # 0xffffffe000206ffa <sbi_remote_fence_i> 0xffffffe00020819c <+18>: ld s0,0(sp) 0xffffffe00020819e <+20>: ld ra,8(sp) 0xffffffe0002081a0 <+22>: addi sp,sp,16 0xffffffe0002081a2 <+24>: ret (gdb) frame 5 (rec=0xffffffe01ae40c30, enable=3) at kernel/trace/ftrace.c:2503 2503 return ftrace_make_call(rec, ftrace_addr); (gdb) p /x rec->ip $2 = 0xffffffe00020817a -> flush_icache_all ! When we modified flush_icache_all's patchable-entry with ftrace_caller: - Insert ftrace_caller at flush_icache_all prologue. - Call flush_icache_all to sync I/Dcache, but flush_icache_all is just we modified by half. Link: https://lore.kernel.org/linux-riscv/CAJF2gTT=oDWesWe0JVWvTpGi60-gpbNhYLdFWN_5EbyeqoEDdw@mail.gmail.com/T/#tSigned-off-by: NGuo Ren <guoren@linux.alibaba.com> Reviewed-by: NAtish Patra <atish.patra@wdc.com> Signed-off-by: NPalmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: NDimitri John Ledkov <dimitri.ledkov@canonical.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Guo Ren 提交于
mainline inclusion from mainline-5.10.62 commit 7e2087249e87b0fceb53f428cbf872b9d2db44ad bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e2087249e87b0fceb53f428cbf872b9d2db44ad -------------------------------- commit 67d94577 upstream. We must use $(CC_FLAGS_FTRACE) instead of directly using -pg. It will cause -fpatchable-function-entry error. Signed-off-by: NGuo Ren <guoren@linux.alibaba.com> Signed-off-by: NPalmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: NDimitri John Ledkov <dimitri.ledkov@canonical.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Pauli Virtanen 提交于
mainline inclusion from mainline-5.10.62 commit b42fde92cddeb0c7326877f1832d61816c9a2aa9 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b42fde92cddeb0c7326877f1832d61816c9a2aa9 -------------------------------- commit 55981d35 upstream. Some USB BT adapters don't satisfy the MTU requirement mentioned in commit e848dbd3 ("Bluetooth: btusb: Add support USB ALT 3 for WBS") and have ALT 3 setting that produces no/garbled audio. Some adapters with larger MTU were also reported to have problems with ALT 3. Add a flag and check it and MTU before selecting ALT 3, falling back to ALT 1. Enable the flag for Realtek, restoring the previous behavior for non-Realtek devices. Tested with USB adapters (mtu<72, no/garbled sound with ALT3, ALT1 works) BCM20702A1 0b05:17cb, CSR8510A10 0a12:0001, and (mtu>=72, ALT3 works) RTL8761BU 0bda:8771, Intel AX200 8087:0029 (after disabling ALT6). Also got reports for (mtu>=72, ALT 3 reported to produce bad audio) Intel 8087:0a2b. Signed-off-by: NPauli Virtanen <pav@iki.fi> Fixes: e848dbd3 ("Bluetooth: btusb: Add support USB ALT 3 for WBS") Tested-by: NMichał Kępień <kernel@kempniu.pl> Tested-by: NJonathan Lampérth <jon@h4n.dev> Signed-off-by: NLuiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Xin Long 提交于
mainline inclusion from mainline-5.10.62 commit 0a178a01516158caeb3aa26c1b54d50ad12333f6 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0a178a01516158caeb3aa26c1b54d50ad12333f6 -------------------------------- commit 7387a72c upstream. __tipc_sendmsg() is called to send SYN packet by either tipc_sendmsg() or tipc_connect(). The difference is in tipc_connect(), it will call tipc_wait_for_connect() after __tipc_sendmsg() to wait until connecting is done. So there's no need to wait in __tipc_sendmsg() for this case. This patch is to fix it by calling tipc_wait_for_connect() only when dlen is not 0 in __tipc_sendmsg(), which means it's called by tipc_connect(). Note this also fixes the failure in tipcutils/test/ptts/: # ./tipcTS & # ./tipcTC 9 (hang) Fixes: 36239dab6da7 ("tipc: fix implicit-connect for SYN+") Reported-by: NShuang Li <shuali@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NJon Maloy <jmaloy@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Frieder Schrempf 提交于
mainline inclusion from mainline-5.10.62 commit ded6da217ced636f757051baa98a2564d7ec8100 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ded6da217ced636f757051baa98a2564d7ec8100 -------------------------------- The new generic NAND ECC framework stores the configuration and requirements in separate places since commit 93ef92f6 ("mtd: nand: Use the new generic ECC object"). In 5.10.x The SPI NAND layer still uses only the requirements to track the ECC properties. This mismatch leads to values of zero being used for ECC strength and step_size in the SPI NAND layer wherever nanddev_get_ecc_conf() is used and therefore breaks the SPI NAND on-die ECC support in 5.10.x. By using nanddev_get_ecc_requirements() instead of nanddev_get_ecc_conf() for SPI NAND, we make sure that the correct parameters for the detected chip are used. In later versions (5.11.x) this is fixed anyway with the implementation of the SPI NAND on-die ECC engine. Cc: stable@vger.kernel.org # 5.10.x Reported-by: Nvoice INTER connect GmbH <developer@voiceinterconnect.de> Signed-off-by: NFrieder Schrempf <frieder.schrempf@kontron.de> Acked-by: NMiquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Linus Torvalds 提交于
mainline inclusion from mainline-5.10.62 commit 3b2018f9c9c088741d7d33a2baf9aa39e93d58c5 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3b2018f9c9c088741d7d33a2baf9aa39e93d58c5 -------------------------------- commit fe67f4dd upstream. It turns out that the SIGIO/FASYNC situation is almost exactly the same as the EPOLLET case was: user space really wants to be notified after every operation. Now, in a perfect world it should be sufficient to only notify user space on "state transitions" when the IO state changes (ie when a pipe goes from unreadable to readable, or from unwritable to writable). User space should then do as much as possible - fully emptying the buffer or what not - and we'll notify it again the next time the state changes. But as with EPOLLET, we have at least one case (stress-ng) where the kernel sent SIGIO due to the pipe being marked for asynchronous notification, but the user space signal handler then didn't actually necessarily read it all before returning (it read more than what was written, but since there could be multiple writes, it could leave data pending). The user space code then expected to get another SIGIO for subsequent writes - even though the pipe had been readable the whole time - and would only then read more. This is arguably a user space bug - and Colin King already fixed the stress-ng code in question - but the kernel regression rules are clear: it doesn't matter if kernel people think that user space did something silly and wrong. What matters is that it used to work. So if user space depends on specific historical kernel behavior, it's a regression when that behavior changes. It's on us: we were silly to have that non-optimal historical behavior, and our old kernel behavior was what user space was tested against. Because of how the FASYNC notification was tied to wakeup behavior, this was first broken by commits f467a6a6 and 1b6b26ae ("pipe: fix and clarify pipe read/write wakeup logic"), but at the time it seems nobody noticed. Probably because the stress-ng problem case ends up being timing-dependent too. It was then unwittingly fixed by commit 3a34b13a ("pipe: make pipe writes always wake up readers") only to be broken again when by commit 3b844826 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads"). And at that point the kernel test robot noticed the performance refression in the stress-ng.sigio.ops_per_sec case. So the "Fixes" tag below is somewhat ad hoc, but it matches when the issue was noticed. Fix it for good (knock wood) by simply making the kill_fasync() case separate from the wakeup case. FASYNC is quite rare, and we clearly shouldn't even try to use the "avoid unnecessary wakeups" logic for it. Link: https://lore.kernel.org/lkml/20210824151337.GC27667@xsang-OptiPlex-9020/ Fixes: 3b844826 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads") Reported-by: Nkernel test robot <oliver.sang@intel.com> Tested-by: NOliver Sang <oliver.sang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Colin Ian King <colin.king@canonical.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Linus Torvalds 提交于
mainline inclusion from mainline-5.10.62 commit e91da23c1be16ebcfca0991976ed9377a8233935 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e91da23c1be16ebcfca0991976ed9377a8233935 -------------------------------- commit 3b844826 upstream. I had forgotten just how sensitive hackbench is to extra pipe wakeups, and commit 3a34b13a ("pipe: make pipe writes always wake up readers") ended up causing a quite noticeable regression on larger machines. Now, hackbench isn't necessarily a hugely meaningful benchmark, and it's not clear that this matters in real life all that much, but as Mel points out, it's used often enough when comparing kernels and so the performance regression shows up like a sore thumb. It's easy enough to fix at least for the common cases where pipes are used purely for data transfer, and you never have any exciting poll usage at all. So set a special 'poll_usage' flag when there is polling activity, and make the ugly "EPOLLET has crazy legacy expectations" semantics explicit to only that case. I would love to limit it to just the broken EPOLLET case, but the pipe code can't see the difference between epoll and regular select/poll, so any non-read/write waiting will trigger the extra wakeup behavior. That is sufficient for at least the hackbench case. Apart from making the odd extra wakeup cases more explicitly about EPOLLET, this also makes the extra wakeup be at the _end_ of the pipe write, not at the first write chunk. That is actually much saner semantics (as much as you can call any of the legacy edge-triggered expectations for EPOLLET "sane") since it means that you know the wakeup will happen once the write is done, rather than possibly in the middle of one. [ For stable people: I'm putting a "Fixes" tag on this, but I leave it up to you to decide whether you actually want to backport it or not. It likely has no impact outside of synthetic benchmarks - Linus ] Link: https://lore.kernel.org/lkml/20210802024945.GA8372@xsang-OptiPlex-9020/ Fixes: 3a34b13a ("pipe: make pipe writes always wake up readers") Reported-by: Nkernel test robot <oliver.sang@intel.com> Tested-by: NSandeep Patil <sspatil@android.com> Tested-by: NMel Gorman <mgorman@techsingularity.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Filipe Manana 提交于
mainline inclusion from mainline-5.10.62 commit d845f89d59fc3f17ea4e86321b82d8edf6c1719f bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d845f89d59fc3f17ea4e86321b82d8edf6c1719f -------------------------------- commit bc0939fc upstream. We have a race between marking that an inode needs to be logged, either at btrfs_set_inode_last_trans() or at btrfs_page_mkwrite(), and between btrfs_sync_log(). The following steps describe how the race happens. 1) We are at transaction N; 2) Inode I was previously fsynced in the current transaction so it has: inode->logged_trans set to N; 3) The inode's root currently has: root->log_transid set to 1 root->last_log_commit set to 0 Which means only one log transaction was committed to far, log transaction 0. When a log tree is created we set ->log_transid and ->last_log_commit of its parent root to 0 (at btrfs_add_log_tree()); 4) One more range of pages is dirtied in inode I; 5) Some task A starts an fsync against some other inode J (same root), and so it joins log transaction 1. Before task A calls btrfs_sync_log()... 6) Task B starts an fsync against inode I, which currently has the full sync flag set, so it starts delalloc and waits for the ordered extent to complete before calling btrfs_inode_in_log() at btrfs_sync_file(); 7) During ordered extent completion we have btrfs_update_inode() called against inode I, which in turn calls btrfs_set_inode_last_trans(), which does the following: spin_lock(&inode->lock); inode->last_trans = trans->transaction->transid; inode->last_sub_trans = inode->root->log_transid; inode->last_log_commit = inode->root->last_log_commit; spin_unlock(&inode->lock); So ->last_trans is set to N and ->last_sub_trans set to 1. But before setting ->last_log_commit... 8) Task A is at btrfs_sync_log(): - it increments root->log_transid to 2 - starts writeback for all log tree extent buffers - waits for the writeback to complete - writes the super blocks - updates root->last_log_commit to 1 It's a lot of slow steps between updating root->log_transid and root->last_log_commit; 9) The task doing the ordered extent completion, currently at btrfs_set_inode_last_trans(), then finally runs: inode->last_log_commit = inode->root->last_log_commit; spin_unlock(&inode->lock); Which results in inode->last_log_commit being set to 1. The ordered extent completes; 10) Task B is resumed, and it calls btrfs_inode_in_log() which returns true because we have all the following conditions met: inode->logged_trans == N which matches fs_info->generation && inode->last_subtrans (1) <= inode->last_log_commit (1) && inode->last_subtrans (1) <= root->last_log_commit (1) && list inode->extent_tree.modified_extents is empty And as a consequence we return without logging the inode, so the existing logged version of the inode does not point to the extent that was written after the previous fsync. It should be impossible in practice for one task be able to do so much progress in btrfs_sync_log() while another task is at btrfs_set_inode_last_trans() right after it reads root->log_transid and before it reads root->last_log_commit. Even if kernel preemption is enabled we know the task at btrfs_set_inode_last_trans() can not be preempted because it is holding the inode's spinlock. However there is another place where we do the same without holding the spinlock, which is in the memory mapped write path at: vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf) { (...) BTRFS_I(inode)->last_trans = fs_info->generation; BTRFS_I(inode)->last_sub_trans = BTRFS_I(inode)->root->log_transid; BTRFS_I(inode)->last_log_commit = BTRFS_I(inode)->root->last_log_commit; (...) So with preemption happening after setting ->last_sub_trans and before setting ->last_log_commit, it is less of a stretch to have another task do enough progress at btrfs_sync_log() such that the task doing the memory mapped write ends up with ->last_sub_trans and ->last_log_commit set to the same value. It is still a big stretch to get there, as the task doing btrfs_sync_log() has to start writeback, wait for its completion and write the super blocks. So fix this in two different ways: 1) For btrfs_set_inode_last_trans(), simply set ->last_log_commit to the value of ->last_sub_trans minus 1; 2) For btrfs_page_mkwrite() only set the inode's ->last_sub_trans, just like we do for buffered and direct writes at btrfs_file_write_iter(), which is all we need to make sure multiple writes and fsyncs to an inode in the same transaction never result in an fsync missing that the inode changed and needs to be logged. Turn this into a helper function and use it both at btrfs_page_mkwrite() and at btrfs_file_write_iter() - this also fixes the problem that at btrfs_page_mkwrite() we were setting those fields without the protection of the inode's spinlock. This is an extremely unlikely race to happen in practice. Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NAnand Jain <anand.jain@oracle.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Gerd Rausch 提交于
mainline inclusion from mainline-5.10.62 commit 6f38d95f33be52993033a2a893fc415ff9133196 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6f38d95f33be52993033a2a893fc415ff9133196 -------------------------------- [ Upstream commit fb4b1373 ] Function "dma_map_sg" is entitled to merge adjacent entries and return a value smaller than what was passed as "nents". Subsequently "ib_map_mr_sg" needs to work with this value ("sg_dma_len") rather than the original "nents" parameter ("sg_len"). This old RDS bug was exposed and reliably causes kernel panics (using RDMA operations "rds-stress -D") on x86_64 starting with: commit c588072b ("iommu/vt-d: Convert intel iommu driver to the iommu ops") Simply put: Linux 5.11 and later. Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Link: https://lore.kernel.org/r/60efc69f-1f35-529d-a7ef-da0549cad143@oracle.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ben Skeggs 提交于
mainline inclusion from mainline-5.10.62 commit b882dda2bf7a7b4eef9568c00f03365f77e2e17f bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b882dda2bf7a7b4eef9568c00f03365f77e2e17f -------------------------------- [ Upstream commit e78b1b54 ] Should fix some initial modeset failures on (at least) Ampere boards. Signed-off-by: NBen Skeggs <bskeggs@redhat.com> Reviewed-by: NLyude Paul <lyude@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ben Skeggs 提交于
mainline inclusion from mainline-5.10.62 commit 7f422cda03a62b7e9355ea8dded431c066dca3b1 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f422cda03a62b7e9355ea8dded431c066dca3b1 -------------------------------- [ Upstream commit 6eaa1f3c ] When booted with multiple displays attached, the EFI GOP driver on (at least) Ampere, can leave DP links powered up that aren't being used to display anything. This confuses our tracking of SOR routing, with the likely result being a failed modeset and display engine hang. Fix this by (ab?)using the DisableLT IED script to power-down the link, restoring HW to a state the driver expects. Signed-off-by: NBen Skeggs <bskeggs@redhat.com> Reviewed-by: NLyude Paul <lyude@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Mark Yacoub 提交于
mainline inclusion from mainline-5.10.62 commit 6fd6e20520ccd05a1ac3321404dd498cc28576cb bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6fd6e20520ccd05a1ac3321404dd498cc28576cb -------------------------------- [ Upstream commit fa0b1ef5 ] [Why] Userspace should get back a copy of drm_wait_vblank that's been modified even when drm_wait_vblank_ioctl returns a failure. Rationale: drm_wait_vblank_ioctl modifies the request and expects the user to read it back. When the type is RELATIVE, it modifies it to ABSOLUTE and updates the sequence to become current_vblank_count + sequence (which was RELATIVE), but now it became ABSOLUTE. drmWaitVBlank (in libdrm) expects this to be the case as it modifies the request to be Absolute so it expects the sequence to would have been updated. The change is in compat_drm_wait_vblank, which is called by drm_compat_ioctl. This change of copying the data back regardless of the return number makes it en par with drm_ioctl, which always copies the data before returning. [How] Return from the function after everything has been copied to user. Fixes IGT:kms_flip::modeset-vs-vblank-race-interruptible Tested on ChromeOS Trogdor(msm) Reviewed-by: NMichel Dänzer <mdaenzer@redhat.com> Signed-off-by: NMark Yacoub <markyacoub@chromium.org> Signed-off-by: NSean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210812194917.1703356-1-markyacoub@chromium.orgSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ming Lei 提交于
mainline inclusion from mainline-5.10.62 commit 26ee94ba343c63d9d23112521c68fa72c82a8805 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=26ee94ba343c63d9d23112521c68fa72c82a8805 -------------------------------- [ Upstream commit c797b40c ] Inside blk_mq_queue_tag_busy_iter() we already grabbed request's refcount before calling ->fn(), so needn't to grab it one more time in blk_mq_check_expired(). Meantime remove extra request expire check in blk_mq_check_expired(). Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20210811155202.629575-1-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kenneth Feng 提交于
mainline inclusion from mainline-5.10.62 commit b00ca567579a4c2f9a4cd6f9a63946f793e8b506 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b00ca567579a4c2f9a4cd6f9a63946f793e8b506 -------------------------------- [ Upstream commit 93c5701b ] change the workload type for some cards as it is needed. Signed-off-by: NKenneth Feng <kenneth.feng@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kenneth Feng 提交于
mainline inclusion from mainline-5.10.62 commit 3c37ec4350220a548ffc6753646913899e86b1c7 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c37ec4350220a548ffc6753646913899e86b1c7 -------------------------------- [ Upstream commit 2fd31689 ] This reverts commit 0979d432. Revert this because it does not apply to all the cards. Signed-off-by: NKenneth Feng <kenneth.feng@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shai Malin 提交于
mainline inclusion from mainline-5.10.62 commit cc126b400b25f7fa0d86d02b9cdbaa45759566aa bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc126b400b25f7fa0d86d02b9cdbaa45759566aa -------------------------------- [ Upstream commit d33d19d3 ] Fix a possible null-pointer dereference in qed_rdma_create_qp(). Changes from V2: - Revert checkpatch fixes. Reported-by: NTOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: NAriel Elior <aelior@marvell.com> Signed-off-by: NShai Malin <smalin@marvell.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shai Malin 提交于
mainline inclusion from mainline-5.10.62 commit 18a65ba06903862a5c64ff66f464a57f488a7298 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18a65ba06903862a5c64ff66f464a57f488a7298 -------------------------------- [ Upstream commit 37110237 ] Avoiding qed ll2 race condition and NULL pointer dereference as part of the remove and recovery flows. Changes form V1: - Change (!p_rx->set_prod_addr). - qed_ll2.c checkpatch fixes. Change from V2: - Revert "qed_ll2.c checkpatch fixes". Signed-off-by: NAriel Elior <aelior@marvell.com> Signed-off-by: NShai Malin <smalin@marvell.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Michael S. Tsirkin 提交于
mainline inclusion from mainline-5.10.62 commit 4ac9c81e8a541dd3fb53127cb9184a0d79341e38 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4ac9c81e8a541dd3fb53127cb9184a0d79341e38 -------------------------------- [ Upstream commit a24ce06c ] We use a spinlock now so add a stub. Ignore bogus uninitialized variable warnings. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Neeraj Upadhyay 提交于
mainline inclusion from mainline-5.10.62 commit c7ee4d22614e43ae1d633864b70ac075469ad27f bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c7ee4d22614e43ae1d633864b70ac075469ad27f -------------------------------- [ Upstream commit e74cfa91 ] As __vringh_iov() traverses a descriptor chain, it populates each descriptor entry into either read or write vring iov and increments that iov's ->used member. So, as we iterate over a descriptor chain, at any point, (riov/wriov)->used value gives the number of descriptor enteries available, which are to be read or written by the device. As all read iovs must precede the write iovs, wiov->used should be zero when we are traversing a read descriptor. Current code checks for wiov->i, to figure out whether any previous entry in the current descriptor chain was a write descriptor. However, iov->i is only incremented, when these vring iovs are consumed, at a later point, and remain 0 in __vringh_iov(). So, correct the check for read and write descriptor order, to use wiov->used. Acked-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com> Signed-off-by: NNeeraj Upadhyay <neeraju@codeaurora.org> Link: https://lore.kernel.org/r/1624591502-4827-1-git-send-email-neeraju@codeaurora.orgSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Vincent Whitchurch 提交于
mainline inclusion from mainline-5.10.62 commit 6c074eaaf7855dfee8faa8a093940fff5e779ec3 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6c074eaaf7855dfee8faa8a093940fff5e779ec3 -------------------------------- [ Upstream commit cb5d2c1f ] Do not call vDPA drivers' callbacks with vq indicies larger than what the drivers indicate that they support. vDPA drivers do not bounds check the indices. Signed-off-by: NVincent Whitchurch <vincent.whitchurch@axis.com> Link: https://lore.kernel.org/r/20210701114652.21956-1-vincent.whitchurch@axis.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Parav Pandit 提交于
mainline inclusion from mainline-5.10.62 commit 0698278e8eefb22660b6fa27b002b4232131c146 bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0698278e8eefb22660b6fa27b002b4232131c146 -------------------------------- [ Upstream commit 43bb40c5 ] When a virtio pci device undergo surprise removal (aka async removal in PCIe spec), mark the device as broken so that any upper layer drivers can abort any outstanding operation. When a virtio net pci device undergo surprise removal which is used by a NetworkManager, a below call trace was observed. kernel:watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [kworker/1:1:27059] watchdog: BUG: soft lockup - CPU#1 stuck for 52s! [kworker/1:1:27059] CPU: 1 PID: 27059 Comm: kworker/1:1 Tainted: G S W I L 5.13.0-hotplug+ #8 Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.9.4 11/06/2020 Workqueue: events linkwatch_event RIP: 0010:virtnet_send_command+0xfc/0x150 [virtio_net] Call Trace: virtnet_set_rx_mode+0xcf/0x2a7 [virtio_net] ? __hw_addr_create_ex+0x85/0xc0 __dev_mc_add+0x72/0x80 igmp6_group_added+0xa7/0xd0 ipv6_mc_up+0x3c/0x60 ipv6_find_idev+0x36/0x80 addrconf_add_dev+0x1e/0xa0 addrconf_dev_config+0x71/0x130 addrconf_notify+0x1f5/0xb40 ? rtnl_is_locked+0x11/0x20 ? __switch_to_asm+0x42/0x70 ? finish_task_switch+0xaf/0x2c0 ? raw_notifier_call_chain+0x3e/0x50 raw_notifier_call_chain+0x3e/0x50 netdev_state_change+0x67/0x90 linkwatch_do_dev+0x3c/0x50 __linkwatch_run_queue+0xd2/0x220 linkwatch_event+0x21/0x30 process_one_work+0x1c8/0x370 worker_thread+0x30/0x380 ? process_one_work+0x370/0x370 kthread+0x118/0x140 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 Hence, add the ability to abort the command on surprise removal which prevents infinite loop and system lockup. Signed-off-by: NParav Pandit <parav@nvidia.com> Link: https://lore.kernel.org/r/20210721142648.1525924-5-parav@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Parav Pandit 提交于
mainline inclusion from mainline-5.10.62 commit 065a13c299b493ba63d526bbba4e44b2dbc2962e bugzilla: 182217 https://gitee.com/openeuler/kernel/issues/I4EFOS Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=065a13c299b493ba63d526bbba4e44b2dbc2962e -------------------------------- [ Upstream commit 60f07798 ] Currently vq->broken field is read by virtqueue_is_broken() in busy loop in one context by virtnet_send_command(). vq->broken is set to true in other process context by virtio_break_device(). Reader and writer are accessing it without any synchronization. This may lead to a compiler optimization which may result to optimize reading vq->broken only once. Hence, force reading vq->broken on each invocation of virtqueue_is_broken() and also force writing it so that such update is visible to the readers. It is a theoretical fix that isn't yet encountered in the field. Signed-off-by: NParav Pandit <parav@nvidia.com> Link: https://lore.kernel.org/r/20210721142648.1525924-2-parav@nvidia.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-