- 09 Apr, 2021 40 commits
-
-
Committed by Joakim Zhang
stable inclusion
from stable-5.10.24
commit 98b7f969116df96c57e9a8572620d71e92fcb725
bugzilla: 51348

--------------------------------

commit 449052cf upstream.

Asserting the HALT bit to enter freeze mode presupposes that the FRZ bit is asserted. This patch asserts the FRZ bit in flexcan_chip_freeze(), although its reset value is 1'b1.

This is a preparatory patch; a later patch will invoke flexcan_chip_freeze() to enter freeze mode, which polls for the freeze mode acknowledge.

Fixes: b1aa1c7a ("can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze")
Link: https://lore.kernel.org/r/20210218110037.16591-2-qiangqing.zhang@nxp.com
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
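The dependency between the FRZ and HALT bits can be illustrated with a small user-space model. This is only a sketch, not the driver code: `struct fake_flexcan`, `fake_hw_update()` and `chip_freeze()` are hypothetical stand-ins, and the bit positions are assumed to match the driver's MCR register definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed MCR bit positions (FRZ, HALT, freeze acknowledge). */
#define FLEXCAN_MCR_FRZ     (UINT32_C(1) << 30)
#define FLEXCAN_MCR_HALT    (UINT32_C(1) << 28)
#define FLEXCAN_MCR_FRZ_ACK (UINT32_C(1) << 24)

/* Hypothetical model of the controller: only the MCR register. */
struct fake_flexcan {
	uint32_t mcr;
};

/* Model of the hardware: freeze is acknowledged only if FRZ and HALT are both set. */
static void fake_hw_update(struct fake_flexcan *dev)
{
	const uint32_t need = FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT;

	if ((dev->mcr & need) == need)
		dev->mcr |= FLEXCAN_MCR_FRZ_ACK;
	else
		dev->mcr &= ~FLEXCAN_MCR_FRZ_ACK;
}

/*
 * Request freeze mode and poll for the acknowledge.
 * Returns 0 on success, -1 if the acknowledge never arrives.
 * assert_frz selects whether FRZ is set together with HALT.
 */
static int chip_freeze(struct fake_flexcan *dev, int assert_frz)
{
	unsigned int timeout = 100;

	assert(dev != NULL);

	if (assert_frz)
		dev->mcr |= FLEXCAN_MCR_FRZ;
	dev->mcr |= FLEXCAN_MCR_HALT;

	while (timeout--) {
		fake_hw_update(dev);
		if (dev->mcr & FLEXCAN_MCR_FRZ_ACK)
			return 0;
	}
	return -1;
}
```

In this model, calling `chip_freeze()` on a device whose FRZ bit is clear times out unless `assert_frz` is non-zero, which is the premise the commit message describes.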
-
Committed by Oleksij Rempel
stable inclusion
from stable-5.10.24
commit 4224890edff1b4679dc8ddeaa69b43efce5366ba
bugzilla: 51348

--------------------------------

commit e940e089 upstream.

There are two ref count variables controlling the free()ing of a socket:
- struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
- struct sock::sk_wmem_alloc - which accounts the memory allocated by the skbs in the send path.

In case there are still TX skbs on the fly and the socket is closed, the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack clones an "echo" skb, calls sock_hold() on the original socket and references it. This produces the following back trace:

| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134
| refcount_t: addition on 0; use-after-free.
| Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E)
| CPU: 0 PID: 280 Comm: test_can.sh Tainted: G E 5.11.0-04577-gf8ff6603c617 #203
| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
| Backtrace:
| [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24)
| r7:00000000 r6:600f0113 r5:00000000 r4:81441220
| [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8)
| [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114)
| r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90
| [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8)
| r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2
| [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134)
| r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540
| [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50)
| [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c)
| [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230)
| r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600
| [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70)
| r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00
| [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318)
| r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600
| [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264)
| r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400
| [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)

To fix this problem, only set skb ownership to sockets which still have a ref count > 0.

Fixes: 0ae89beb ("can: add destructor for self generated skbs")
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Andre Naujoks <nautsch2@gmail.com>
Link: https://lore.kernel.org/r/20210226092456.27126-1-o.rempel@pengutronix.de
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
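The rule "only take ownership if the ref count is still above zero" is an increment-if-not-zero operation. The following is a user-space sketch with C11 atomics, not the kernel implementation: `struct fake_sock`, `struct fake_skb`, `ref_inc_not_zero()` and `set_owner_if_alive()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct sk_buff. */
struct fake_sock {
	atomic_uint refcnt;
};

struct fake_skb {
	struct fake_sock *sk;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool ref_inc_not_zero(atomic_uint *ref)
{
	unsigned int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Attach the skb to the socket only while the socket is still alive. */
static void set_owner_if_alive(struct fake_skb *skb, struct fake_sock *sk)
{
	assert(skb != NULL);

	skb->sk = NULL;
	if (sk && ref_inc_not_zero(&sk->refcnt))
		skb->sk = sk;
}
```

A plain increment on a count of zero would resurrect an object that is already being freed, which is what the `refcount_t: addition on 0` warning in the trace reports; the conditional form leaves the skb without an owner instead.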
-
Committed by Matthias Schiffer
stable inclusion
from stable-5.10.24
commit fa5d019c56e78e0b33f585d23149f2553568b998
bugzilla: 51348

--------------------------------

commit 3e59e885 upstream.

Commit 5ee759cd ("l2tp: use standard API for warning log messages") changed a number of warnings about invalid packets in the receive path so that they are always shown, instead of only when a special L2TP debug flag is set. Even with rate limiting these warnings can easily cause significant log spam - potentially triggered by a malicious party sending invalid packets on purpose.

In addition these warnings were noticed by projects like Tunneldigger [1], which uses L2TP for its data path, but implements its own control protocol (which is sufficiently different from L2TP data packets that it would always be passed up to userspace even with future extensions of L2TP).

Some of the warnings were already redundant, as l2tp_stats has a counter for these packets. This commit adds one additional counter for invalid packets that are passed up to userspace. Packets with unknown session are not counted as invalid, as there is nothing wrong with the format of these packets.

With the additional counter, all of these messages are either redundant or benign, so we reduce them to pr_debug_ratelimited().

[1] https://github.com/wlanslovenija/tunneldigger/issues/160

Fixes: 5ee759cd ("l2tp: use standard API for warning log messages")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Balazs Nemeth
stable inclusion
from stable-5.10.24
commit 453fff24f52eeb62ab65582848498097273df269
bugzilla: 51348

--------------------------------

commit d348ede3 upstream.

A packet with skb_inner_network_header(skb) == skb_network_header(skb) and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers from the packet. Subsequently, the call to skb_mac_gso_segment will again call mpls_gso_segment with the same packet, leading to an infinite loop. In addition, ensure that the header length is a multiple of four, which should hold irrespective of the number of stacked labels.

Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
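The two conditions in this message (a non-empty MPLS header, and a length that is a multiple of four) can be written as one predicate. This is a sketch under assumptions: `mpls_hlen_ok()` is a hypothetical helper that takes the two header offsets as plain integers rather than reading them from an skb, and `MPLS_HLEN` is assumed to be the 4-byte size of one label stack entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Size of one MPLS label stack entry in bytes (assumed value). */
#define MPLS_HLEN 4

/*
 * network_off:       offset of the outer network (MPLS) header
 * inner_network_off: offset of the inner network header
 *
 * The MPLS header length is the distance between the two. A zero length
 * means segmentation would pull nothing and re-enter with the same packet.
 */
static bool mpls_hlen_ok(size_t network_off, size_t inner_network_off)
{
	size_t hlen;

	if (inner_network_off < network_off)
		return false;

	hlen = inner_network_off - network_off;
	assert(hlen <= inner_network_off);

	return hlen != 0 && hlen % MPLS_HLEN == 0;
}
```

With equal offsets the predicate rejects the packet, which is the case that previously looped; one or two stacked labels (4 or 8 bytes) pass, and a 3-byte gap is rejected.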
-
Committed by Balazs Nemeth
stable inclusion
from stable-5.10.24
commit faa3baa2828c5e1c4374f3e60041f75c64f5fcb6
bugzilla: 51348

--------------------------------

commit 924a9bc3 upstream.

For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't set) based on the type in the virtio net hdr, but the skb could contain anything since it could come from packet_snd through a raw socket. If there is a mismatch between what virtio_net_hdr_set_proto sets and the actual protocol, then the skb could be handled incorrectly later on.

An example where this poses an issue is with the subsequent call to skb_flow_dissect_flow_keys_basic, which relies on skb->protocol being set correctly. A specially crafted packet could fool skb_flow_dissect_flow_keys_basic, preventing EINVAL from being returned.

Avoid blindly trusting the information provided by the virtio net header by checking that the protocol in the packet actually matches the protocol set by virtio_net_hdr_set_proto. Note that since the protocol is only checked if skb->dev implements header_ops->parse_protocol, packets from devices without the implementation are not checked at this stage.

Fixes: 9274124f ("net: stricter validation of untrusted gso packets")
Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Daniel Borkmann
stable inclusion
from stable-5.10.24
commit 09af4362ba47c805347840c2bb9719c0458925ca
bugzilla: 51348

--------------------------------

commit 89e5c58f upstream.

We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the csum for the UDP header itself is 0. In that case, GRO aggregation does not take place on the phys dev, but instead is deferred to the vxlan/geneve driver (see trace below).

The reason is essentially that GRO aggregation bails out in udp_gro_receive() for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, others) where for non-zero csums 2abb7cdc ("udp: Add support for doing checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE and napi context has csum_valid set. This is however not the case for zero UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false).

At the same time 57c67ff4 ("udp: additional GRO support") added matches on !uh->check ^ !uh2->check as part to determine candidates for aggregation, so it certainly is expected to handle zero csums in udp_gro_receive(). The purpose of the check added via 662880f4 ("net: Allow GRO to use and set levels of checksum unnecessary") seems to catch bad csum and stop aggregation right away.

One way to fix aggregation in the zero case is to only perform the !csum_valid check in udp_gro_receive() if uh->check is in fact non-zero.

Before:

    [...]
    swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1)
    swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500
    swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500
    swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500
    swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500
    swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500
    swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500
    swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500
    swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500
    swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500
    swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2)
    swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500
    swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500
    swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500
    swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500
    swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500
    swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774
    swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1)
    swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2)
    [...]

    # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

     87380  16384  16384    20.01    13129.24

After:

    [...]
    swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1)
    swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1)
    swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2)
    swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3)
    swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2)
    swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3)
    [...]

    # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

     87380  16384  16384    20.01    24576.53

Fixes: 57c67ff4 ("udp: additional GRO support")
Fixes: 662880f4 ("net: Allow GRO to use and set levels of checksum unnecessary")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tom Herbert <tom@herbertland.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20210226212248.8300-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
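The change in the bail-out condition can be compared side by side in a simplified model. This sketch covers only the checksum part of the decision described above; `struct gro_csum_state` and both predicates are hypothetical, and the other reasons udp_gro_receive() may skip aggregation are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the checksum state GRO sees for one skb. */
struct gro_csum_state {
	bool csum_partial;     /* stands in for ip_summed == CHECKSUM_PARTIAL */
	unsigned int csum_cnt; /* remaining "checksum unnecessary" levels */
	bool csum_valid;       /* checksum already verified in NAPI context */
};

/* True when nothing has vouched for the UDP checksum yet. */
static bool csum_unverified(const struct gro_csum_state *s)
{
	assert(s != NULL);
	return !s->csum_partial && s->csum_cnt == 0 && !s->csum_valid;
}

/* Old behaviour: bail out regardless of whether a checksum is present. */
static bool gro_bails_before(uint16_t uh_check, const struct gro_csum_state *s)
{
	(void)uh_check;
	return csum_unverified(s);
}

/* New behaviour: a zero UDP checksum means there is nothing to verify. */
static bool gro_bails_after(uint16_t uh_check, const struct gro_csum_state *s)
{
	return uh_check != 0 && csum_unverified(s);
}
```

For a zero-checksum packet with `csum_cnt == 0` and `csum_valid == false`, the old predicate bails out and the new one does not, while a packet with a non-zero, unverified checksum is rejected by both.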
-
Committed by Felix Fietkau
stable inclusion
from stable-5.10.24
commit d2fb1911a7a8f655440d613fc8946df384d83ee5
bugzilla: 51348

--------------------------------

commit 3b9ea720 upstream.

When transmitting to a receiver in dynamic SMPS mode, all transmissions that use multiple spatial streams need to be sent using CTS-to-self or RTS/CTS to give the receiver's extra chains some time to wake up. This fixes the tx rate getting stuck at <= MCS7 for some clients, especially Intel ones, which make aggressive use of SMPS.

Cc: stable@vger.kernel.org
Reported-by: Martin Kennedy <hurricos@gmail.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210214184911.96702-1-nbd@nbd.name
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Maciej W. Rozycki
stable inclusion
from stable-5.10.24
commit b0454a28f60878539a55439436ea9ad29728d366
bugzilla: 51348

--------------------------------

commit 6c810cf2 upstream.

The MIPS Poly1305 implementation is generic MIPS code written so as to support ISAs down to the original MIPS I and MIPS III for the 32-bit and 64-bit variants respectively. Lift the current limitation, which enables the code for MIPSr1 ISA or newer processors only, and make it available for all MIPS processors.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: a11d055e ("crypto: mips/poly1305 - incorporate OpenSSL/CRYPTOGAMS optimized implementation")
Cc: stable@vger.kernel.org # v5.5+
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Jakub Kicinski
stable inclusion
from stable-5.10.24
commit a0df424a863aa6a2e8bd57ef5e0928da5d5b797f
bugzilla: 51348

--------------------------------

commit a4dcfbc4 upstream.

netif_device_attach() will unpause the queues, so we can't call it before __alx_open(). This went undetected until commit b0999223 ("alx: add ability to allocate and free alx_napi structures"), but now if the stack tries to xmit immediately on resume, before __alx_open(), we'll crash on the NAPI being null:

    BUG: kernel NULL pointer dereference, address: 0000000000000198
    CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G OE 5.10.0-3-amd64 #1 Debian 5.10.13-1
    Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77-D3H, BIOS F15 11/14/2013
    RIP: 0010:alx_start_xmit+0x34/0x650 [alx]
    Code: 41 56 41 55 41 54 55 53 48 83 ec 20 0f b7 57 7c 8b 8e b0 0b 00 00 39 ca 72 06 89 d0 31 d2 f7 f1 89 d2 48 8b 84 df
    RSP: 0018:ffffb09240083d28 EFLAGS: 00010297
    RAX: 0000000000000000 RBX: ffffa04d80ae7800 RCX: 0000000000000004
    RDX: 0000000000000000 RSI: ffffa04d80afa000 RDI: ffffa04e92e92a00
    RBP: 0000000000000042 R08: 0000000000000100 R09: ffffa04ea3146700
    R10: 0000000000000014 R11: 0000000000000000 R12: ffffa04e92e92100
    R13: 0000000000000001 R14: ffffa04e92e92a00 R15: ffffa04e92e92a00
    FS: 0000000000000000(0000) GS:ffffa0508f600000(0000) knlGS:0000000000000000
    i915 0000:00:02.0: vblank wait timed out on crtc 0
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000198 CR3: 000000004460a001 CR4: 00000000001706f0
    Call Trace:
    dev_hard_start_xmit+0xc7/0x1e0
    sch_direct_xmit+0x10f/0x310

Cc: <stable@vger.kernel.org> # 4.9+
Fixes: bc2bebe8 ("alx: remove WoL support")
Reported-by: Zbynek Michl <zbynek.michl@gmail.com>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983595
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Zbynek Michl <zbynek.michl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Greg Kurz
stable inclusion
from stable-5.10.24
commit a9c55f22a0b978d636204509c4edaf511cb20f62
bugzilla: 51348

--------------------------------

commit f9619d5e upstream.

Depending on the number of online CPUs in the original kernel, it is likely for CPU #0 to be offline in a kdump kernel. The associated IRQs in the affinity mappings provided by irq_create_affinity_masks() are thus not started by irq_startup(), as per design with managed IRQs.

This can be a problem with multi-queue block devices driven by blk-mq: such a non-started IRQ is very likely paired with the single queue enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This causes the device to remain silent and likely hangs the guest at some point.

This is a regression caused by commit 9ea69a55 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()"). Note that this only happens with the XIVE interrupt controller because XICS has a workaround to bypass affinity, which is activated during kdump with the "noirqdistrib" kernel parameter.

The issue comes from a combination of factors:
- discrepancy between the number of queues detected by the multi-queue block driver, that was used to create the MSI vectors, and the single queue mode enforced later on by blk-mq because of kdump (i.e. keeping all queues fixes the issue)
- CPU#0 offline (i.e. kdump always succeeds with CPU#0)

Given that I couldn't reproduce on x86, which seems to always have CPU#0 online even during kdump, I'm not sure where this should be fixed. Hence going for another approach: fine-grained affinity is for performance and we don't really care about that during kdump. Simply revert to the previous working behavior of ignoring affinity masks in this case only.

Fixes: 9ea69a55 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Athira Rajeev
stable inclusion
from stable-5.10.24
commit ac022fbee6855dc6304a9e63e481859b2589836d
bugzilla: 51348

--------------------------------

commit 5ae5fbd2 upstream.

Running "perf mem record" on powerpc platforms with selinux enabled resulted in soft lockups. The call trace below was seen in the logs:

    CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
    NIP: c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000
    REGS: c000007fffab7d60 TRAP: 0100 Not tainted (5.11.0-rc7+)
    ...
    NIP _raw_spin_lock_irqsave+0x94/0x120
    LR _raw_spin_lock_irqsave+0x90/0x120
    Call Trace:
    0xc00000000fd47260 (unreliable)
    skb_queue_tail+0x3c/0x90
    audit_log_end+0x6c/0x180
    common_lsm_audit+0xb0/0xe0
    slow_avc_audit+0xa4/0x110
    avc_has_perm+0x1c4/0x260
    selinux_perf_event_open+0x74/0xd0
    security_perf_event_open+0x68/0xc0
    record_and_restart+0x6e8/0x7f0
    perf_event_interrupt+0x22c/0x560
    performance_monitor_exception+0x4c/0x60
    performance_monitor_common_virt+0x1c8/0x1d0
    interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120
    NIP: c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0
    REGS: c00000000fd47860 TRAP: 0f00 Not tainted (5.11.0-rc7+)
    ...
    NIP _raw_spin_lock_irqsave+0x38/0x120
    LR skb_queue_tail+0x3c/0x90
    interrupt: f00
    0x38 (unreliable)
    0xc00000000aae6200
    audit_log_end+0x6c/0x180
    audit_log_exit+0x344/0xf80
    __audit_syscall_exit+0x2c0/0x320
    do_syscall_trace_leave+0x148/0x200
    syscall_exit_prepare+0x324/0x390
    system_call_common+0xfc/0x27c

The above trace shows that while the CPU was handling a performance monitor exception, there was a call to the security_perf_event_open() function. In powerpc core-book3s, this function is called from the perf_allow_kernel() check during recording of the data address in the sample via perf_get_data_addr().

Commit da97e184 ("perf_event: Add support for LSM and SELinux checks") introduced security enhancements to perf. As part of this commit, the new security hook for perf_event_open() was added in all places where the perf paranoid check was previously used. The powerpc core-book3s code originally had paranoid checks in perf_get_data_addr() and power_pmu_bhrb_read(), so perf_paranoid_kernel() checks were replaced with perf_allow_kernel() in these PMU helper functions as well.

The intention of the paranoid checks in core-book3s was to verify privilege access before capturing some of the sample data. Along with the paranoid checks, perf_allow_kernel() also does a security_perf_event_open(). Since these functions are accessed while recording a sample, we end up calling selinux_perf_event_open() in PMI context. Some of the security functions, like sidtab_sid2str_put(), use a spinlock. If a perf interrupt hits under a spin lock and we end up calling selinux hook functions in the PMI handler, this could cause a deadlock.

Since the purpose of this security hook is to control access to perf_event_open(), it is not right to call this in interrupt context.

The paranoid checks in powerpc core-book3s were done at interrupt time, which is also not correct.

Reference commits:
Commit cd1231d7 ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()")
Commit bb19af81 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer")

We only allow creation of events that have already passed the privilege checks in perf_event_open(), so these paranoid checks are not needed at event time. As a fix, the patch uses the 'event->attr.exclude_kernel' check to prevent exposing kernel addresses for userspace-only sampling.

Fixes: cd1231d7 ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()")
Cc: stable@vger.kernel.org # v4.17+
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1614247839-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Dmitry V. Levin
stable inclusion
from stable-5.10.24
commit 7732f57f0f523509b0b405ad2a0271f4016a4b45
bugzilla: 51348

--------------------------------

commit c33cb002 upstream.

Apparently, <linux/netfilter/nfnetlink_cthelper.h> and <linux/netfilter/nfnetlink_acct.h> could not be included into the same compilation unit because of a cut-and-paste typo in the former header.

Fixes: 12f7a505 ("netfilter: add user-space connection tracking helper infrastructure")
Cc: <stable@vger.kernel.org> # v3.6
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zhang Ming
openEuler inclusion
category: bugfix
bugzilla: 48265
CVE: NA
Reference: https://gitee.com/openeuler/kernel/issues/I3BPPX

---------------------------------------------------

The default branch of the switch is never reached at present, but related extensions in the future may reach it, which could lead to a memory leak.

Signed-off-by: Zhang Ming <154842638(a)qq.com>
Reported-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
Suggested-by: Jian Cheng <cj.chengjian(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
[Zheng Zengkai: adjust commit message]
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: 48265
CVE: NA

--------------------------------

Work around cacheinfo's uninitialized info_list error in some special cases, such as when free_cache_attributes() frees info_list but does not set num_leaves to zero because PPTT is not supported. This workaround stays in place until the upstream issue is resolved.

Fixes: 950e5edb ("drivers: base: cacheinfo: Add helper to search cacheinfo by of_node")
Fixes: 709c4362 ("cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Jian Cheng <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

Enable MPAM by default.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: 48265
CVE: NA

--------------------------------

When a CPU comes online, domains inserted into the resctrl_resource structure's domains list may be out of order, so sort them by domain id.

Fixes: 2e2c511ff49d ("arm64/mpam: resctrl: Handle cpuhp and resctrl_dom allocation")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Jian Cheng <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
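Keeping a list ordered by id regardless of arrival order amounts to a sorted insertion. The sketch below uses a plain singly linked list for brevity and is not the patch itself: `struct dom` and `dom_insert_sorted()` are hypothetical, and the kernel's own list type is not used here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a resctrl domain entry. */
struct dom {
	int id;
	struct dom *next;
};

/*
 * Insert d so that the list stays in ascending id order, whatever
 * order the CPUs (and therefore the domains) come online in.
 */
static void dom_insert_sorted(struct dom **head, struct dom *d)
{
	assert(head != NULL && d != NULL);

	/* Walk to the first entry whose id is not smaller than d->id. */
	while (*head && (*head)->id < d->id)
		head = &(*head)->next;

	d->next = *head;
	*head = d;
}
```

Inserting domains 2, 0, 3, 1 in that order yields a list that reads 0, 1, 2, 3, so output that iterates the list (such as per-domain rows in schemata) appears in a stable order.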
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: 48265
CVE: NA

--------------------------------

This fixes two problems:

1) When a CPU goes offline, we should clear its bit from the cpu mask of every associated resctrl group, not only the default group.

2) When a CPU comes online, we should set its bit in the default group's cpu mask and, if cdp is on, update the default group's cpus to the default state. This operation fills the code and data fields of the mpam sysregs with appropriate values.

Fixes: 2e2c511ff49d ("arm64/mpam: resctrl: Handle cpuhp and resctrl_dom allocation")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Jian Cheng <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: 48265
CVE: NA

--------------------------------

Unlike mbw max (Memory Bandwidth Maximum), sometimes we don't want to make use of the mbw min feature (used to restrict the memory bandwidth maximum capacity partition through MPAMCFG_MBW_MIN, shown as the MBMIN row in schemata) and want to set MPAMCFG_MBW_MIN to 0.

e.g.
    > mount -t resctrl resctrl /sys/fs/resctrl/ -o mbMin
    > cd resctrl/ && cat schemata
      L3:0=7fff;1=7fff;2=7fff;3=7fff
      MBMIN:0=0;1=0;2=0;3=0

    # before revision
    > echo 'MBMIN:0=0;1=0;2=0;3=0' > schemata
    > cat schemata
      L3:0=7fff;1=7fff;2=7fff;3=7fff
      MBMIN:0=2;1=2;2=2;3=2

    # after revision
    > echo 'MBMIN:0=0;1=0;2=0;3=0' > schemata
    > cat schemata
      L3:0=7fff;1=7fff;2=7fff;3=7fff
      MBMIN:0=0;1=0;2=0;3=0

Fixes: 5a49c4f1983d ("arm64/mpam: Supplement additional useful ctrl features for mount options")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: 48265
CVE: NA

--------------------------------

When configuring different types of resources for a resource is supported, a stale history value is written to the default group after remounting.

e.g.
    > mount -t resctrl resctrl /sys/fs/resctrl/ -o mbMax,mbMin && cd resctrl/
    > echo 'MBMIN:0=2;1=2;2=2;3=2' > schemata
    > cat schemata
      L3:0=7fff;1=7fff;2=7fff;3=7fff
      MBMAX:0=100;1=100;2=100;3=100
      MBMIN:0=2;1=2;2=2;3=2
    > cd .. && umount /sys/fs/resctrl/
    > mount -t resctrl resctrl /sys/fs/resctrl/ -o mbMax,mbMin && cd resctrl/ && cat schemata
      L3:0=7fff;1=7fff;2=7fff;3=7fff
      MBMAX:0=100;1=100;2=100;3=100
      MBMIN:0=0;1=0;2=0;3=0
    > echo 'MBMAX:0=10;1=10;2=10;3=10' > schemata
    > cat schemata
      L3:0=7fff;1=7fff;2=7fff;3=7fff
      MBMAX:0=10;1=10;2=10;3=10
      MBMIN:0=2;1=2;2=2;3=2 #update error history value

When writing the schemata sysfile, the call path looks like this:

    resctrl_group_schemata_write() -=>
      resctrl_update_groups_config() -=>
        resctrl_group_update_domains() -=>
          resctrl_group_update_domain_ctrls()
          {
            .../*refresh new_ctrl array of supported conf type once for each resource*/
          }

We should refresh the new_ctrl field in struct resctrl_staged_config through resctrl_group_init_alloc() before calling resctrl_group_update_domain_ctrls().

Fixes: 6b2471f089be ("arm64/mpam: resctrl: Support priority and hardlimit(Memory bandwidth) configuration")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: 48265
CVE: NA

--------------------------------

This function is called only when we mount resctrl sysfs. For error handling, we need to destroy the schemata list if any of the steps that follow its creation fail.

Fixes: 7e9b5caeefff ("arm64/mpam: resctrl: Add helpers for init and destroy schemata list")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: 48265
CVE: NA

--------------------------------

Use fs_context to parse mount options. The old parsing path through parse_rdtgroupfs_options() becomes obsolete and will be removed.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

Based on 61fa56e1dd8a ("arm64/mpam: Add resctrl_ctrl_feature structure to manage ctrl features"), we add several ctrl features and supply the corresponding mount options: mbPbm, mbMax, mbMin, mbPrio, caMax, caPrio and caPbm. If the MPAM system supports the relevant features, resctrl can be mounted like this. e.g.

    > mount -t resctrl resctrl /sys/fs/resctrl -o mbMax,mbMin,caPrio
    > cd /sys/fs/resctrl && cat schemata
      L3:0=0x7fff;1=0x7fff;2=0x7fff;3=0x7fff #default select cpbm as basic ctrl feature
      L3PRI:0=3;1=3;2=3;3=3
      MBMAX:0=100;1=100;2=100;3=100
      MBMIN:0=0;1=0;2=0;3=0

    > mount -t resctrl resctrl /sys/fs/resctrl
    > cd /sys/fs/resctrl && cat schemata
      L3:0=0x7fff;1=0x7fff;2=0x7fff;3=0x7fff #default select cpbm as basic ctrl feature
      MB:0=100;1=100;2=100;3=100 #default select mbw max as basic ctrl feature

    > mount -t resctrl resctrl /sys/fs/resctrl -o caMax
    > cd /sys/fs/resctrl && cat schemata
      L3:0=33554432;1=33554432;2=33554432;3=33554432 #use cmax ctrl feature
      MB:0=100;1=100;2=100;3=100 #default select mbw max as basic ctrl feature

For Cache MSCs, the basic ctrl features are cmax (Cache Maximum Capacity) and cpbm (Cache Portion Bitmap) partitioning. If no mount option is specified, cpbm is selected by default.

For Memory MSCs, the basic ctrl features are max (Memory Bandwidth Maximum) and pbm (Memory Bandwidth Portion Bitmap) partitioning. If no mount option is specified, max is selected by default.

The mount options above can also be used together with cdp options. e.g.

    > mount -t resctrl resctrl /sys/fs/resctrl -o caMax,caPrio,cdpl3
    > cd /sys/fs/resctrl && cat schemata
      L3CODE:0=33554432;1=33554432;2=33554432;3=33554432 #code use cmax ctrl feature
      L3DATA:0=33554432;1=33554432;2=33554432;3=33554432 #data use cmax ctrl feature
      L3CODEPRI:0=3;1=3;2=3;3=3 #code use intpriority ctrl feature
      L3DATAPRI:0=3;1=3;2=3;3=3 #data use intpriority ctrl feature
      MB:0=100;1=100;2=100;3=100 #default select mbw max as basic ctrl feature

Combining these mount options allows more flexible use of MPAM.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
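How a comma-separated option string such as `mbMax,mbMin,caPrio` could be turned into a set of enabled ctrl features is sketched below. This is a user-space illustration only: the `FEAT_*` bits, `opt_table` and `parse_ctrl_options()` are hypothetical names, the kernel itself parses mount options through fs_context, and cdp options such as cdpl3 are deliberately left out of the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, one per mount option listed in the commit message. */
enum ctrl_feat_sketch {
	FEAT_MBPBM  = 1 << 0,
	FEAT_MBMAX  = 1 << 1,
	FEAT_MBMIN  = 1 << 2,
	FEAT_MBPRIO = 1 << 3,
	FEAT_CAMAX  = 1 << 4,
	FEAT_CAPRIO = 1 << 5,
	FEAT_CAPBM  = 1 << 6,
};

struct opt_map_sketch {
	const char *name;
	int bit;
};

static const struct opt_map_sketch opt_table[] = {
	{ "mbPbm",  FEAT_MBPBM  },
	{ "mbMax",  FEAT_MBMAX  },
	{ "mbMin",  FEAT_MBMIN  },
	{ "mbPrio", FEAT_MBPRIO },
	{ "caMax",  FEAT_CAMAX  },
	{ "caPrio", FEAT_CAPRIO },
	{ "caPbm",  FEAT_CAPBM  },
};

/* Returns a bitmask of requested features, or -1 on an unknown or empty option. */
static int parse_ctrl_options(const char *opts)
{
	int mask = 0;

	assert(opts != NULL);
	while (*opts) {
		size_t len = strcspn(opts, ",");	/* length of the current token */
		size_t n = sizeof(opt_table) / sizeof(opt_table[0]);
		size_t i;

		for (i = 0; i < n; i++) {
			if (strlen(opt_table[i].name) == len &&
			    strncmp(opts, opt_table[i].name, len) == 0)
				break;
		}
		if (i == n)
			return -1;
		mask |= opt_table[i].bit;
		opts += len;
		if (*opts == ',')
			opts++;				/* skip the separator */
	}
	return mask;
}
```

An empty string yields an empty mask, which corresponds to the case where the defaults (cpbm for cache, max for memory bandwidth) are selected.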
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

Monitoring sometimes shows anomalies like this. e.g.

    > cd /sys/fs/resctrl/ && grep . mon_data/*
      mon_data/mon_L3CODE_00:14336
      mon_data/mon_L3CODE_01:344064
      mon_data/mon_L3CODE_02:2048
      mon_data/mon_L3CODE_03:27648
      mon_data/mon_L3DATA_00:0 #L3DATA's monitoring data is always 0
      mon_data/mon_L3DATA_01:0
      mon_data/mon_L3DATA_02:0
      mon_data/mon_L3DATA_03:0
      mon_data/mon_MB_00:392
      mon_data/mon_MB_01:552
      mon_data/mon_MB_02:160
      mon_data/mon_MB_03:0

With cdp on, tasks in the resctrl default group (closid=0, rmid=0) never get the proper partid_i/pmg_i and partid_d/pmg_d written into the MPAMx_ELx sysregs by mpam_sched_in(), which is called from __switch_to(). This is because the current cpu's default closid and rmid are also 0, so the operation that modifies the configuration is skipped.

As a practical fix, set the per-cpu default closid to a non-zero value and call update_closid_rmid() when mounting resctrl sysfs, so that each cpu's MPAMx_ELx sysregs are updated with the proper partid and pmg.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

MPAM uses partid, pmg and monitor, which we collectively call mpam id. With cdp on, we allocate a new mpamid_new equal to mpamid + 1. In some places the mpamid does not need to be wrapped in struct { u16 val; }, so for simplicity we use a simpler macro, resctrl_cdp_mpamid_map_val(), to perform this cdp mapping.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
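The "mpamid + 1" mapping can be pictured with a small macro working on plain integer values. This is a sketch under assumptions, not the kernel's resctrl_cdp_mpamid_map_val(): the enum values and the macro name `cdp_mpamid_map_val_sketch` are hypothetical, and the choice that the data side receives the "+ 1" id is an assumption.

```c
#include <assert.h>

/* Hypothetical cdp selector: no cdp, code side, or data side. */
enum cdp_type_sketch { CDP_BOTH_SK, CDP_CODE_SK, CDP_DATA_SK };

/*
 * Map a base mpam id to the id used for the given cdp side.
 * Assumption: the code side keeps the base id, the data side uses base + 1.
 */
#define cdp_mpamid_map_val_sketch(id, type) \
	((type) == CDP_DATA_SK ? (unsigned int)(id) + 1u : (unsigned int)(id))

/* Small self-check documenting the expected mapping. */
static void cdp_map_selfcheck(void)
{
	assert(cdp_mpamid_map_val_sketch(0, CDP_CODE_SK) == 0);
	assert(cdp_mpamid_map_val_sketch(0, CDP_DATA_SK) == 1);
	assert(cdp_mpamid_map_val_sketch(4, CDP_BOTH_SK) == 4);
}
```

Working on plain values keeps call sites short when the id is not wrapped in a struct.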
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

The ctrl_features array, introduced by 61fa56e1dd8a ("arm64/mpam: Add resctrl_ctrl_feature structure to manage ctrl features"), lives in the raw_resctrl_resource structure and lists all ctrl feature types supported for this resource. It filters illegal parameters out of the mount options and provides add_schema() with the information needed to register a new control type node in the schema list.

This makes it easier to add new ctrl features later.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

rmid marks each resctrl group for monitoring and also follows the configuration of the corresponding resctrl group. We export an rmid file in resctrl sysfs for use elsewhere, such as SMMU io: a user can read the rmid of a resctrl group and assign it to a target io through the SMMU driver, if SMMU MPAM is implemented. The related io devices can then be monitored, or be given the intended configuration of resource usage.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

So far the declarations shared by resctrlfs.c and the mpam core module files under the kernel/mpam directory are scattered across mpam.h and resctrl.h, organized like this:

    -- asm/
                             +-- resctrl.h
                             +    +-- mpam.h
                             |    +    +-- mpam_resource.h
                             |    |    +
                             |    |    |
    -- fs/                   |    |    +-> mpam/
        +-- resctrlfs.c <----+----+------> +-- mpam_resctrl.c
                                               ...

We move the declarations shared by resctrlfs.c and mpam/ into resctrl.h, split the remaining declarations into mpam_internal.h, and move mpam_resource.h into the mpam/ directory. The result is organized like this:

    -- asm/
                             +-- mpam.h +----> export to other modules(e.g. SMMU master io)
                             +-- resctrl.h
                             +
                             |
    -- mpam/                 |
                             |    +-- mpam_internal.h
                             |    +    +-- mpam_resource.h
                             |    |    +
                             |    |    |
    -- fs/                   |    +----+-> mpam/
        +-- resctrlfs.c <----+-----------> +-- mpam_resctrl.c
                                               ...

This gives a clearer framework for MPAM usage.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

Some resource properties such as closid and rmid are exported as in Intel-RDT, but our resctrl design differs in two main ways. First, MB (Memory Bandwidth) is split into two directories, MB and MB_MON, which show the control and monitor properties respectively, the same as for LxCache. Second, we add a features file under each resource's directory, which describes the properties of the control types of that resource, for instance MB hardlimit. e.g.

    > mount -t resctrl resctrl /sys/fs/resctrl -o mbHdl
    > cd /sys/fs/resctrl/ && cat info/MB/features
      mbHdl@1 #indicate MBHDL setting's upper bound is 1
    > cat schemata
      L3:0=7fff;1=7fff;2=7fff;3=7fff
      MB:0=100;1=100;2=100;3=100
      MBHDL:0=1;1=1;2=1;3=1

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

The resctrl_ctrl_feature structure, held by each resource, is introduced to manage ctrl features and their characteristics, such as the max width accepted from user input and the numeric base used when parsing it.

This makes it more practical to declare a new ctrl feature. For example, the SCHEMA_PRI feature is associated only with the internal priority setting exported by mpam devices. Its information is collected in mpam_resctrl_resource_init(), and the feature is then enabled or disabled by user options.

The resctrl_ctrl_feature structure contains a flags field to avoid duplicated control types. For instance, the SCHEMA_COMM feature selects cpbm (Cache portion bitmap) as the default control type of the Cache resource, so it should no longer be enabled if the user manually selects the cpbm control type through mount options.

The evt field in the resctrl_ctrl_feature structure is a variable of type enum rdt_event_id, which works as described in eee4ad2a36e6 ("arm64/mpam: Add hook-events id for ctrl features").

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

For MPAM, an rmid can do monitoring work only when a monitor resource is allocated to it, so we adopt a mechanism for dynamic allocation and recycling of monitor resources. This differs from Intel-RDT, which creates a kworker thread that dynamically monitors Cache usage and frees an rmid once its usage drops below an adjustable threshold. We have found that this method affects cpu utilization in many cases, sometimes to an unacceptable degree.

Our method is simple. Because the number of monitors varies between resources, we maintain two lists: one for rmids that own an exclusive monitor resource, and one for rmids that share a monitor resource. The shared monitor id is always 0. It works like this: if a new rmid applies for a resource monitor that is already in use, the rmid is put at the tail of the latter list and temporarily given the default monitor id 0 until someone releases an available monitor resource. If the new rmid obtains the monitor resources it needs for all resources, it is put into the exclusive list.

This implements LRU allocation of monitor resources and gives users partial control over allocation and release. If the number of resctrl groups can be bounded, or the user does not need to monitor too many groups at once, this is a more appropriate approach for deployment. It also avoids the risk of inaccurate monitoring when too many groups are monitored at the same time.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

So far we use sd_closid, consisting of {reqpartid, intpartid}, to label each resctrl group (both ctrlgroup and mongroup). This handles the case where the number of reqpartids exceeds the number of intpartids, which always happens when intpartid narrowing is supported; otherwise the two numbers are equal. The excess reqpartids are used to (1) indicate how configurations are synchronized from the configuration indexed by intpartid, and (2) take part in monitoring. But in (2) reqpartid and pmg are still scattered, and so far there has been no clean way to describe how to use the two together.

To make sure these resources can be fully utilized, and following Intel-RDT's design of using rmid for monitoring, an rmid remap matrix is introduced to transform partid and pmg into rmid. The matrix is organized like this:

                     [bitmap entry indexed by partid]

    [col pos is partid]
                     [0]  [1]  [2]  [3]  [4]  [5]
    occ->bitmap[:0]   1    0    0    1    1    1
         bitmap[:1]   1    0    0    1    1    1
         bitmap[:2]   1    1    1    1    1    1
         bitmap[:3]   1    1    1    1    1    1
    [row pos-1 is pmg]

    Calculate rmid = partid + NR_partid * pmg

occ records whether a bitmap has been used by a partid, because a given partid should not be paired with a duplicated pmg for monitoring. This design saves a lot of space, and compared with using a list it also reduces the time complexity of allocating and freeing an rmid from O(NR_partid) * O(NR_pmg) to O(NR_partid) + O(log(NR_pmg)).

In this way we get a contiguous rmid set with upper bound (NR_pmg * NR_partid - 1). Given an rmid, we can tell whether it is valid by checking whether it falls within this range.

rmid implies the reqpartid, so we can use the relevant helpers to get the reqpartid for sd_closid@reqpartid and accomplish the configuration sync. This also makes closid simpler, since it can consist of the intpartid index only, and each resctrl group owns consecutive rmids.

This has further consequences. For instance, MPAM also supports SMMU io using partid and pmg, and the SMMU driver can use a single helper, mpam_rmid_to_partid_pmg(), to perform this remap for an rmid passed in from user space.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
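The formula rmid = partid + NR_partid * pmg and its inverse can be written down directly. The helpers below are a minimal user-space sketch with hypothetical names (they are not the kernel's mpam_rmid_to_partid_pmg()), and the sizes used in the checks (6 partids, 4 pmgs) are chosen only for illustration.

```c
#include <assert.h>

/* rmid = partid + NR_partid * pmg, as given in the commit message. */
static unsigned int rmid_from(unsigned int partid, unsigned int pmg,
			      unsigned int nr_partid)
{
	assert(partid < nr_partid);
	return partid + nr_partid * pmg;
}

/* A valid rmid lies in [0, NR_pmg * NR_partid - 1]. */
static int rmid_is_valid(unsigned int rmid, unsigned int nr_partid,
			 unsigned int nr_pmg)
{
	return rmid < nr_partid * nr_pmg;
}

/* Inverse mapping: the column of the matrix is the partid ... */
static unsigned int rmid_to_partid(unsigned int rmid, unsigned int nr_partid)
{
	assert(nr_partid > 0);
	return rmid % nr_partid;
}

/* ... and the row is the pmg. */
static unsigned int rmid_to_pmg(unsigned int rmid, unsigned int nr_partid)
{
	assert(nr_partid > 0);
	return rmid / nr_partid;
}
```

Because the mapping is a plain row-major index, both directions are O(1) and no lookup table is needed to recover partid and pmg from an rmid.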
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

There are two aspects involved:

- Getting configuration

  We split event QOS_XX_PRI_EVENT_ID into QOS_XX_INTPRI_EVENT_ID and QOS_XX_DSPRI_EVENT_ID. Although we have tried to fill dspri and intpri in the mpam_config structure with the same value, they need to be read separately to keep them independent. Besides, an event such as QOS_CAT_INTPRI_EVENT_ID does not need to be read from the MSC's register; it is set to 0 directly if the corresponding feature is not supported.

- Applying configuration

  When applying the downstream or internal priority configuration, given that the two are independent, we should first check whether the feature mpam_feat_xxpri_part is supported, then check mpam_feat_xxpri_part_0_low, and convert dspri and intpri into a proper value according to its max width.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

The MPAM MSC error interrupt tells us how we misconfigured the MSC. We don't expect to do this. If the interrupt fires, print a summary, and mark MPAM as broken. Eventually we will try and cleanly teardown when we see this.

Now we can use a helper, mpam_register_device_irq(), to register the overflow and error interrupts of an mpam device. When devices come and go we want to make sure the error irq is enabled. We disable the error irq when cpus are taken offline in case the component remains online even when the associated CPUs are offline.

The code of this patch is borrowed from James Morse <james.morse@arm.com>.

[Wang ShaoBo: few version adaptation changes]

Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=6d1ceca3eb5953fc16a524c9aad933519aa3f64c
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=81d178c198165fd557431d6879135d2e03ea92c0
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

The MPAM spec says that when an MPAMCFG register other than MPAMCFG_INTPARTID is read or written while MPAMCFG_PART_SEL.INTERNAL is not 1, MPAMF_ESR is set to indicate an intPARTID_Range error.

So we should set MPAMCFG_PART_SEL.INTERNAL to 1 before reading the MPAMCFG_PRI register.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
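The ordering requirement can be modelled with a fake MSC in user space. Everything below is a hypothetical sketch: `struct fake_msc` and the helper names are invented for illustration, the model only mimics the rule quoted in the commit message, and the INTERNAL bit position (bit 16 of MPAMCFG_PART_SEL) is stated here as an assumption to be checked against the MPAM specification.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of MPAMCFG_PART_SEL.INTERNAL; verify against the spec. */
#define PART_SEL_INTERNAL_SK (UINT32_C(1) << 16)

/* Hypothetical model of the few MSC registers involved. */
struct fake_msc {
	uint32_t part_sel;	/* models MPAMCFG_PART_SEL */
	uint32_t pri;		/* models MPAMCFG_PRI */
	int range_error;	/* models the intPARTID_Range error in MPAMF_ESR */
};

/* Reading PRI without INTERNAL set records a range error in the model. */
static uint32_t fake_read_pri(struct fake_msc *msc)
{
	if (!(msc->part_sel & PART_SEL_INTERNAL_SK))
		msc->range_error = 1;
	return msc->pri;
}

/* Select the partition with INTERNAL set first, then read PRI. */
static uint32_t read_pri_by_intpartid(struct fake_msc *msc, uint16_t intpartid)
{
	msc->part_sel = PART_SEL_INTERNAL_SK | intpartid;
	return fake_read_pri(msc);
}

/* Returns the modelled error flag after one PRI read. */
static int range_error_after_read(int select_internal)
{
	struct fake_msc msc = { .part_sel = 3, .pri = 0x5, .range_error = 0 };
	uint32_t pri = select_internal ? read_pri_by_intpartid(&msc, 3)
				       : fake_read_pri(&msc);

	assert(pri == 0x5);
	return msc.range_error;
}
```

In the model, only the read that goes through `read_pri_by_intpartid()` leaves the error flag clear.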
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

If cdp is enabled, LxCODE and LxDATA are assigned two different partids, each occupying a monitor. Because not all features use cdp mode, for instance MB (Memory Bandwidth), we should make sure both partids/monitors are operated on together for display.

e.g.

        +- code stream (partid = 0, monitor = 0) ----+---> L3CODE
    cpu-+                                            +
        +- data stream (partid = 1, monitor = 1) ----+---> L3DATA
                                                     |
                                                     +---> MB

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

Reading and writing registers directly to get or put configuration is hard to extend and hard to read. Multiple types of schemata ctrls are supported, and each value must be converted to a proper value based on the specific definition and range of the corresponding register in the MPAM spec. Using an event id to indicate which type of configuration we want to get is easier.

Besides, different hook-events have different setting bounds, such as bwa_wd, used for adaptive range conversion when writing configuration. These bounds can be associated with the specific event for conversion.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

According to the MPAM spec, MPAMCFG_INTPARTID.INTERNAL must be set when narrowing reqpartid to intpartid, and this must be done before writing MPAMCFG_PART_SEL if narrowing is implemented.

So we perform the narrowing in one unified place whenever narrowing is supported.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

Use an array to store the max width of each extended ctrl, in order to check each input value from schemata.

Note that a meaningful max width is at least 1: 0 means the ctrl is meaningless, and a value greater than 1 means there are choices to select from.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

Store the default priority in the mpam class structure, derived from the devices' intpri_wd and dspri_wd.

intpri_wd and dspri_wd represent the number of implemented bits in the internal/downstream priority field of MPAMCFG_PRI. When INTPRI_0_IS_LOW/DSPRI_0_IS_LOW is not set, we need to rotate the input priority from user space (higher value means higher priority) into the target priority (higher value means lower priority), and this is restricted by the number of implemented bits.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
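One way to read the priority rotation described above is as a reflection of the user value within the range allowed by the implemented bits. The helper below is a sketch under that assumption; `prio_user_to_hw()` is a hypothetical name, and clamping out-of-range input to the maximum is a choice made here, not something the commit message states.

```c
#include <assert.h>

/*
 * Convert a user-space priority (higher value = higher priority) into the
 * value written to the priority field.
 *   wd          - number of implemented bits (intpri_wd or dspri_wd)
 *   zero_is_low - non-zero if xxPRI_0_IS_LOW is set, i.e. the hardware
 *                 already treats higher values as higher priority
 */
static unsigned int prio_user_to_hw(unsigned int user_pri, unsigned int wd,
				    int zero_is_low)
{
	unsigned int max;

	assert(wd >= 1 && wd < 32);
	max = (1u << wd) - 1;		/* largest value the field can hold */
	if (user_pri > max)
		user_pri = max;		/* assumption: clamp instead of rejecting */

	/* When 0 is the highest hardware priority, flip the scale. */
	return zero_is_low ? user_pri : max - user_pri;
}
```

With a 2-bit field and 0_IS_LOW clear, user priority 3 maps to hardware value 0 and user priority 0 maps to hardware value 3.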
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
feature: ARM MPAM support
bugzilla: 48265
CVE: NA

--------------------------------

The default value of register MPAMCFG_PRI is also used as the software default after probing resources. Two fields, hwdef_intpri and hwdef_dspri, are added to the mpam_device structure to store the default priority setting.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-