- 13 12月, 2022 9 次提交
-
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-v5.10.140 commit 9fcc4f4066208b3383824ed8f0eba6bf47c23e87 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9fcc4f4066208b3383824ed8f0eba6bf47c23e87 -------------------------------- [ Upstream commit a5612ca1 ] While reading sysctl_devconf_inherit_init_net, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 856c395c ("net: introduce a knob to control whether to inherit devconf config") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-v5.10.140 commit 371a3bcf3144584c511f80e87d4c28ac2c75e9a7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=371a3bcf3144584c511f80e87d4c28ac2c75e9a7 -------------------------------- [ Upstream commit af67508e ] While reading sysctl_fb_tunnels_only_for_init_net, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 79134e6c ("net: do not create fallback tunnels for non-default namespaces") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-v5.10.140 commit 2c7dae6c45112ee7ead62155fed375cb2e7d7cf8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2c7dae6c45112ee7ead62155fed375cb2e7d7cf8 -------------------------------- [ Upstream commit c42b7cdd ] While reading sysctl_net_busy_poll, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 06021292 ("net: add low latency socket poll") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-v5.10.140 commit 613fd026209e6f6ee69afaf45d1bec2a11b09fea category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=613fd026209e6f6ee69afaf45d1bec2a11b09fea -------------------------------- [ Upstream commit 02739545 ] While reading these sysctl variables, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - .sysctl_rmem - .sysctl_rwmem - .sysctl_rmem_offset - .sysctl_wmem_offset - sysctl_tcp_rmem[1, 2] - sysctl_tcp_wmem[1, 2] - sysctl_decnet_rmem[1] - sysctl_decnet_wmem[1] - sysctl_tipc_rmem[1] Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Pablo Neira Ayuso 提交于
stable inclusion from stable-v5.10.140 commit 6301a73bd83d94b9b3eab8581adb04e40fb5f079 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6301a73bd83d94b9b3eab8581adb04e40fb5f079 -------------------------------- [ Upstream commit f323ef3a ] Extend struct nft_data_desc to add a flag field that specifies nft_data_init() is being called for set element data. Use it to disallow jump to implicit chain from set element, only jump to chain via immediate expression is allowed. Fixes: d0e2c7de ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Conflicts: net/netfilter/nf_tables_api.c Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Pablo Neira Ayuso 提交于
stable inclusion from stable-v5.10.140 commit 98827687593b579f20feb60c7ce3ac8a6fbb5f2e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=98827687593b579f20feb60c7ce3ac8a6fbb5f2e -------------------------------- [ Upstream commit 341b6941 ] Instead of parsing the data and then validate that type and length are correct, pass a description of the expected data so it can be validated upfront before parsing it to bail out earlier. This patch adds a new .size field to specify the maximum size of the data area. The .len field is optional and it is used as an input/output field, it provides the specific length of the expected data in the input path. If then .len field is not specified, then obtained length from the netlink attribute is stored. This is required by cmp, bitwise, range and immediate, which provide no netlink attribute that describes the data length. The immediate expression uses the destination register type to infer the expected data type. Relying on opencoded validation of the expected data might lead to subtle bugs as described in 7e6bc1f6 ("netfilter: nf_tables: stricter validation of element data"). Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Pablo Neira Ayuso 提交于
stable inclusion from stable-v5.10.140 commit 2267d38520c4aa74c9110c9ef2f6619aebc8446d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2267d38520c4aa74c9110c9ef2f6619aebc8446d -------------------------------- [ Upstream commit 23f68d46 ] Allow up to 16-byte comparisons with a new cmp fast version. Use two 64-bit words and calculate the mask representing the bits to be compared. Make sure the comparison is 64-bit aligned and avoid out-of-bound memory access on registers. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Florian Westphal 提交于
stable inclusion from stable-v5.10.140 commit 624c30521233e110cf50ba01980a591e045036ae category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=624c30521233e110cf50ba01980a591e045036ae -------------------------------- [ Upstream commit 7997eff8 ] Harshit Mogalapalli says: In ebt_do_table() function dereferencing 'private->hook_entry[hook]' can lead to NULL pointer dereference. [..] Kernel panic: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] [..] RIP: 0010:ebt_do_table+0x1dc/0x1ce0 Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 5c 16 00 00 48 b8 00 00 00 00 00 fc ff df 49 8b 6c df 08 48 8d 7d 2c 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 88 [..] Call Trace: nf_hook_slow+0xb1/0x170 __br_forward+0x289/0x730 maybe_deliver+0x24b/0x380 br_flood+0xc6/0x390 br_dev_xmit+0xa2e/0x12c0 For some reason ebtables rejects blobs that provide entry points that are not supported by the table, but what it should instead reject is the opposite: blobs that DO NOT provide an entry point supported by the table. t->valid_hooks is the bitmask of hooks (input, forward ...) that will see packets. Providing an entry point that is not support is harmless (never called/used), but the inverse isn't: it results in a crash because the ebtables traverser doesn't expect a NULL blob for a location its receiving packets for. Instead of fixing all the individual checks, do what iptables is doing and reject all blobs that differ from the expected hooks. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: NHarshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reported-by: Nsyzkaller <syzkaller@googlegroups.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Hui Su 提交于
stable inclusion from stable-v5.10.140 commit c30c0f720533c6bcab2e16b90d302afb50a61d2a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c30c0f720533c6bcab2e16b90d302afb50a61d2a -------------------------------- commit 0e387249 upstream. since commit 2279f540 ("sched/deadline: Fix priority inheritance with multiple scheduling classes"), we should not keep it here. Signed-off-by: NHui Su <suhui_kernel@163.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NDaniel Bristot de Oliveira <bristot@redhat.com> Link: https://lore.kernel.org/r/20220107095254.GA49258@localhost.localdomainSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 12 12月, 2022 2 次提交
-
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I65K8D CVE: NA -------------------------------- request_wrapper is used to fix kabi broken for request, it's only for internal use. This patch make sure out-of-tree drivers won't access request_wrapper if request is not managed by block layer. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I65K8D CVE: NA -------------------------------- Before commit f60df4a0 ("blk-mq: fix kabi broken in struct request"), drivers will got cmd address right after request, however, after this commit, drivers will got cmd address after request_wrapper instead, which is bigger than request and will cause compatibility issues. Fix the problem by placing request_wrapper behind cmd, so that the cmd address for drivers will stay the same. Before commit: |request|cmd| After commit: |request|request_wrapper|cmd| With this patch: |request|cmd|request_wrapper| Performance test: arm64 Kunpeng-920 96 core 1) null_blk setup: modprobe null_blk nr_devices=0 && udevadm settle && cd /sys/kernel/config/nullb && mkdir nullb0 && cd nullb0 && echo 0 > completion_nsec && echo 512 > blocksize && echo 0 > home_node && echo 0 > irqmode && echo 1024 > size && echo 0 > memory_backed && echo 2 > queue_mode && echo 4096 > hw_queue_depth && echo 96 > submit_queues && echo 1 > power 2) fio test script: [global] ioengine=libaio direct=1 numjobs=96 iodepth=32 bs=4k rw=randwrite allow_mounted_write=0 time_based runtime=60 group_reporting=1 ioscheduler=none cpus_allowed_policy=split cpus_allowed=0-95 [test] filename=/dev/nullb0 3) iops test result: without this patch: 23.9M with this patch: 24.1M Fixes: f60df4a0 ("blk-mq: fix kabi broken in struct request") Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 07 12月, 2022 22 次提交
-
-
由 Junhao He 提交于
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5KAX7 -------------------------------------------------------------------------- Fixed the issue that the kabi value changed when the HiSilicon PMU driver added the enum variable in "enum cpuhp_state{}". The hisi_pcie_pmu and hisi_cpa_pmu drivers to replace the explicit specify hotplug events with dynamic allocation hotplug events(CPUHP_AP_ONLINE_DYN). The states between *CPUHP_AP_ONLINE_DYN* and *CPUHP_AP_ONLINE_DYN_END* are reserved for the dynamic allocation. Signed-off-by: NJunhao He <hejunhao3@huawei.com> Reviewed-by: NYicong Yang <yangyicong@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jakub Sitnicki 提交于
mainline inclusion from mainline-v6.1-rc6 commit b68777d5 category: bugfix bugzilla: 188056, https://gitee.com/src-openeuler/kernel/issues/I62RNU CVE: CVE-2022-4129 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b68777d54fac21fc833ec26ea1a2a84f975ab035 -------------------------------- sk->sk_user_data has multiple users, which are not compatible with each other. Writers must synchronize by grabbing the sk->sk_callback_lock. l2tp currently fails to grab the lock when modifying the underlying tunnel socket fields. Fix it by adding appropriate locking. We err on the side of safety and grab the sk_callback_lock also inside the sk_destruct callback overridden by l2tp, even though there should be no refs allowing access to the sock at the time when sk_destruct gets called. v4: - serialize write to sk_user_data in l2tp sk_destruct v3: - switch from sock lock to sk_callback_lock - document write-protection for sk_user_data v2: - update Fixes to point to origin of the bug - use real names in Reported/Tested-by tags Cc: Tom Parkin <tparkin@katalix.com> Fixes: 3557baab ("[L2TP]: PPP over L2TP driver core") Reported-by: NHaowei Yan <g1042620637@gmail.com> Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NLu Wei <luwei32@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Qi Zheng 提交于
Offering: HULK mainline inclusion from mainline-v5.19-rc1 commit 3f913fc5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I610B5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3f913fc5f9745613088d3c569778c9813ab9c129 -------------------------------- We expect no warnings to be issued when we specify __GFP_NOWARN, but currently in paths like alloc_pages() and kmalloc(), there are still some warnings printed, fix it. But for some warnings that report usage problems, we don't deal with them. If such warnings are printed, then we should fix the usage problems. Such as the following case: WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1)); [zhengqi.arch@bytedance.com: v2] Link: https://lkml.kernel.org/r/20220511061951.1114-1-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/20220510113809.80626-1-zhengqi.arch@bytedance.comSigned-off-by: NQi Zheng <zhengqi.arch@bytedance.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflict: mm/internal.h mm/page_alloc.c Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- In order to be consistent with the vSVA technical route of the open source community, it is necessary to revert related patches and bugfixes. In the meantime, some necessary steps need to be taken to avoid kabi change. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit dbb4844d. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 9db83ab7. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 15700dc0. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 4b042357. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 04ba12c4. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit ac16d334. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 20b23b13. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 41e3175a. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 31ed6dc2. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit cbbf4b3a. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kunkun Jiang 提交于
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO CVE: NA -------------------------------- This reverts commit 3afa66c6. Signed-off-by: NKunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: NKeqian Zhu <zhukeqian1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Li Nan 提交于
hulk inclusion category: bugfix bugzilla: 187443, https://gitee.com/openeuler/kernel/issues/I5Z7O2 CVE: NA -------------------------------- Include additional files and add new function will cause kabi broken. So move changes to blk-mq.h. bio_issue_as_root_blkg() is needed by blk_cgroup_mergeable(), move it together. It is used by iocost, too, so add blk-mq.h to blk-iocost.c. Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-v5.18-rc1 commit 6b2b0459 category: bugfix bugzilla: 187443, https://gitee.com/openeuler/kernel/issues/I5Z7O2 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs?h=v6.0-rc5&id=6b2b04590b51aa4cf395fcd185ce439cab5961dc --------------------------- blk-iocost and iolatency are cgroup aware rq-qos policies but they didn't disable merges across different cgroups. This obviously can lead to accounting and control errors but more importantly to priority inversions - e.g. an IO which belongs to a higher priority cgroup or IO class may end up getting throttled incorrectly because it gets merged to an IO issued from a low priority cgroup. Fix it by adding blk_cgroup_mergeable() which is called from merge paths and rejects cross-cgroup and cross-issue_as_root merges. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: d7067512 ("block: introduce blk-iolatency io controller") Cc: stable@vger.kernel.org # v4.19+ Cc: Josef Bacik <jbacik@fb.com> Link: https://lore.kernel.org/r/Yi/eE/6zFNyWJ+qd@slm.duckdns.orgSigned-off-by: NJens Axboe <axboe@kernel.dk> conflicts: block/blk-merge.c include/linux/blk-cgroup.h Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lorenz Bauer 提交于
stable inclusion from stable-v5.10.135 commit 6d3fad2b44eb9d226a896d1c93909f0fd2e1b9ea category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZWFM Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6d3fad2b44eb9d226a896d1c93909f0fd2e1b9ea -------------------------------- commit 7c32e8f8 upstream. Allow to pass sk_lookup programs to PROG_TEST_RUN. User space provides the full bpf_sk_lookup struct as context. Since the context includes a socket pointer that can't be exposed to user space we define that PROG_TEST_RUN returns the cookie of the selected socket or zero in place of the socket pointer. We don't support testing programs that select a reuseport socket, since this would mean running another (unrelated) BPF program from the sk_lookup test handler. Signed-off-by: NLorenz Bauer <lmb@cloudflare.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.comSigned-off-by: NTianchen Ding <dtcccc@linux.alibaba.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Marco Elver 提交于
mainline inclusion from mainline-v5.18-rc1 commit 8cb37a59 category: featrue bugzilla: https://gitee.com/openeuler/kernel/issues/I5YQ6Z CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8cb37a5974a48569aab8a1736d21399fddbdbdb2 -------------------------------- The randomize_kstack_offset feature is unconditionally compiled in when the architecture supports it. To add constraints on compiler versions, we require a dedicated Kconfig variable. Therefore, introduce RANDOMIZE_KSTACK_OFFSET. Furthermore, this option is now also configurable by EXPERT kernels: while the feature is supposed to have zero performance overhead when disabled, due to its use of static branches, there are few cases where giving a distribution the option to disable the feature entirely makes sense. For example, in very resource constrained environments, which would never enable the feature to begin with, in which case the additional kernel code size increase would be redundant. Signed-off-by: NMarco Elver <elver@google.com> Reviewed-by: NNathan Chancellor <nathan@kernel.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NKees Cook <keescook@chromium.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220131090521.1947110-1-elver@google.comSigned-off-by: NYi Yang <yiyang13@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Nick Desaulniers 提交于
mainline inclusion from mainline-v5.13-rc1 commit 2515dd6c category: featrue bugzilla: https://gitee.com/openeuler/kernel/issues/I5YQ6Z CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2515dd6ce8e545b0b2eece84920048ef9ed846c4 -------------------------------- "o" isn't a common asm() constraint to use; it triggers an assertion in assert-enabled builds of LLVM that it's not recognized when targeting aarch64 (though it appears to fall back to "m"). It's fixed in LLVM 13 now, but there isn't really a good reason to use "o" in particular here. To avoid causing build issues for those using assert-enabled builds of earlier LLVM versions, the constraint needs changing. Instead, if the point is to retain the __builtin_alloca(), make ptr appear to "escape" via being an input to an empty inline asm block. This is preferable anyways, since otherwise this looks like a dead store. While the use of "r" was considered in https://lore.kernel.org/lkml/202104011447.2E7F543@keescook/ it was only tested as an output (which looks like a dead store, and wasn't sufficient). Use "r" as an input constraint instead, which behaves correctly across compilers and architectures. Fixes: 39218ff4 ("stack: Optionally randomize kernel stack offset each syscall") Signed-off-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NKees Cook <keescook@chromium.org> Tested-by: NNathan Chancellor <nathan@kernel.org> Reviewed-by: NNathan Chancellor <nathan@kernel.org> Link: https://reviews.llvm.org/D100412 Link: https://bugs.llvm.org/show_bug.cgi?id=49956 Link: https://lore.kernel.org/r/20210419231741.4084415-1-keescook@chromium.orgSigned-off-by: NYi Yang <yiyang13@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kees Cook 提交于
mainline inclusion from mainline-v5.13-rc1 commit 39218ff4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5YQ6Z CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39218ff4c625dbf2e68224024fe0acaa60bcd51a -------------------------------- This provides the ability for architectures to enable kernel stack base address offset randomization. This feature is controlled by the boot param "randomize_kstack_offset=on/off", with its default value set by CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. This feature is based on the original idea from the last public release of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt All the credit for the original idea goes to the PaX team. Note that the design and implementation of this upstream randomize_kstack_offset feature differs greatly from the RANDKSTACK feature (see below). Reasoning for the feature: This feature aims to make harder the various stack-based attacks that rely on deterministic stack structure. We have had many such attacks in past (just to name few): https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html As Linux kernel stack protections have been constantly improving (vmap-based stack allocation with guard pages, removal of thread_info, STACKLEAK), attackers have had to find new ways for their exploits to work. They have done so, continuing to rely on the kernel's stack determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT were not relevant. For example, the following recent attacks would have been hampered if the stack offset was non-deterministic between syscalls: https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf (page 70: targeting the pt_regs copy with linear stack overflow) https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html (leaked stack address from one syscall as a target during next syscall) The main idea is that since the stack offset is randomized on each system call, it is harder for an attack to reliably land in any particular place on the thread stack, even with address exposures, as the stack base will change on the next syscall. Also, since randomization is performed after placing pt_regs, the ptrace-based approach[1] to discover the randomized offset during a long-running syscall should not be possible. Design description: During most of the kernel's execution, it runs on the "thread stack", which is pretty deterministic in its structure: it is fixed in size, and on every entry from userspace to kernel on a syscall the thread stack starts construction from an address fetched from the per-cpu cpu_current_top_of_stack variable. The first element to be pushed to the thread stack is the pt_regs struct that stores all required CPU registers and syscall parameters. Finally the specific syscall function is called, with the stack being used as the kernel executes the resulting request. The goal of randomize_kstack_offset feature is to add a random offset after the pt_regs has been pushed to the stack and before the rest of the thread stack is used during the syscall processing, and to change it every time a process issues a syscall. The source of randomness is currently architecture-defined (but x86 is using the low byte of rdtsc()). Future improvements for different entropy sources is possible, but out of scope for this patch. Further more, to add more unpredictability, new offsets are chosen at the end of syscalls (the timing of which should be less easy to measure from userspace than at syscall entry time), and stored in a per-CPU variable, so that the life of the value does not stay explicitly tied to a single task. As suggested by Andy Lutomirski, the offset is added using alloca() and an empty asm() statement with an output constraint, since it avoids changes to assembly syscall entry code, to the unwinder, and provides correct stack alignment as defined by the compiler. In order to make this available by default with zero performance impact for those that don't want it, it is boot-time selectable with static branches. This way, if the overhead is not wanted, it can just be left turned off with no performance impact. The generated assembly for x86_64 with GCC looks like this: ... ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax # 12380 <kstack_offset> ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax ffffffff81003983: 48 83 c0 0f add $0xf,%rax ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax ffffffff8100398c: 48 29 c4 sub %rax,%rsp ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax ... As a result of the above stack alignment, this patch introduces about 5 bits of randomness after pt_regs is spilled to the thread stack on x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for stack alignment). The amount of entropy could be adjusted based on how much of the stack space we wish to trade for security. My measure of syscall performance overhead (on x86_64): lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null randomize_kstack_offset=y Simple syscall: 0.7082 microseconds randomize_kstack_offset=n Simple syscall: 0.7016 microseconds So, roughly 0.9% overhead growth for a no-op syscall, which is very manageable. And for people that don't want this, it's off by default. There are two gotchas with using the alloca() trick. First, compilers that have Stack Clash protection (-fstack-clash-protection) enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to any dynamic stack allocations. While the randomization offset is always less than a page, the resulting assembly would still contain (unreachable!) probing routines, bloating the resulting assembly. To avoid this, -fno-stack-clash-protection is unconditionally added to the kernel Makefile since this is the only dynamic stack allocation in the kernel (now that VLAs have been removed) and it is provably safe from Stack Clash style attacks. The second gotcha with alloca() is a negative interaction with -fstack-protector*, in that it sees the alloca() as an array allocation, which triggers the unconditional addition of the stack canary function pre/post-amble which slows down syscalls regardless of the static branch. In order to avoid adding this unneeded check and its associated performance impact, architectures need to carefully remove uses of -fstack-protector-strong (or -fstack-protector) in the compilation units that use the add_random_kstack() macro and to audit the resulting stack mitigation coverage (to make sure no desired coverage disappears). No change is visible for this on x86 because the stack protector is already unconditionally disabled for the compilation unit, but the change is required on arm64. There is, unfortunately, no attribute that can be used to disable stack protector for specific functions. Comparison to PaX RANDKSTACK feature: The RANDKSTACK feature randomizes the location of the stack start (cpu_current_top_of_stack), i.e. including the location of pt_regs structure itself on the stack. Initially this patch followed the same approach, but during the recent discussions[2], it has been determined to be of a little value since, if ptrace functionality is available for an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write different offsets in the pt_regs struct, observe the cache behavior of the pt_regs accesses, and figure out the random stack offset. Another difference is that the random offset is stored in a per-cpu variable, rather than having it be per-thread. As a result, these implementations differ a fair bit in their implementation details and results, though obviously the intent is similar. [1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/ [2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/ [3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.htmlCo-developed-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org conflict: Documentation/admin-guide/kernel-parameters.txt arch/Kconfig Signed-off-by: NYi Yang <yiyang13@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kees Cook 提交于
mainline inclusion from mainline-v5.13-rc1 commit 0d66ccc1 category: featrue bugzilla: https://gitee.com/openeuler/kernel/issues/I5YQ6Z CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d66ccc1627013c95f1e7ef10b95b8451cd7834e -------------------------------- As shown in the comment in jump_label.h, choosing the initial state of static branches changes the assembly layout. If the condition is expected to be likely it's inline, and if unlikely it is out of line via a jump. A few places in the kernel use (or could be using) a CONFIG to choose the default state, which would give a small performance benefit to their compile-time declared default. Provide the infrastructure to do this. Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210401232347.2791257-2-keescook@chromium.orgSigned-off-by: NYi Yang <yiyang13@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 29 11月, 2022 7 次提交
-
-
由 Huacai Chen 提交于
LoongArch inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OHOB -------------------------------- cc27b735 ("PCI/portdrv: Turn off PCIe services during shutdown") causes poweroff/reboot failure on systems with LS7A chipset. We found that if we remove "pci_command &= ~PCI_COMMAND_MASTER" in do_pci_disable _device(), it can work well. The hardware engineer says that the root cause is that CPU is still accessing PCIe devices while poweroff/reboot, and if we disable the Bus Master Bit at this time, the PCIe controller doesn't forward requests to downstream devices, and also does not send TIMEOUT to CPU, which causes CPU wait forever (hardware deadlock). This behavior is a PCIe protocol violation (Bus Master should not be involved in CPU MMIO transactions), and it will be fixed in new revisions of hardware (add timeout mechanism for CPU read request, whether or not Bus Master bit is cleared). On some x86 platforms, radeon/amdgpu devices can cause similar problems [1][2]. Once before I wanted to make a single patch to solve "all of these problems" together, but it seems unreasonable because maybe they are not exactly the same problem. So, this patch add a new function pcie_portdrv_shutdown(), a slight modified copy of pcie_portdrv_remove() dedicated for the shutdown path, and then add a quirk just for LS7A to avoid clearing Bus Master bit in pcie_portdrv_shutdown(). Leave other platforms behave as before. [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980 [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638Signed-off-by: NHuacai Chen <chenhuacai@loongson.cn>
-
由 Huacai Chen 提交于
LoongArch inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OHOB -------------------------------- In new revision of LS7A, some PCIe ports support larger value than 256, but their maximum supported MRRS values are not detectable. Moreover, the current loongson_mrrs_quirk() cannot avoid devices increasing its MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169) will actually set a big value in its driver. So the only possible way is configure MRRS of all devices in BIOS, and add a pci host bridge bit flag (i.e., no_inc_mrrs) to stop the increasing MRRS operations. However, according to PCIe Spec, it is legal for an OS to program any value for MRRS, and it is also legal for an endpoint to generate a Read Request with any size up to its MRRS. As the hardware engineers say, the root cause here is LS7A doesn't break up large read requests. In detail, LS7A PCIe port reports CA (Completer Abort) if it receives a Memory Read request with a size that's "too big" ("too big" means larger than the PCIe ports can handle, which means 256 for some ports and 4096 for the others, and of course this is a problem in the LS7A's hardware design). Signed-off-by: NHuacai Chen <chenhuacai@loongson.cn>
-
由 Huacai Chen 提交于
mainline inclusion from mainline-v6.0-rc1 commit cd89edda category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OHOB CVE: NA -------------------------------- Loongson PCH (LS7A chipset) will be used by both MIPS-based and LoongArch- based Loongson processors. MIPS-based Loongson uses FDT, while LoongArch- based Loongson uses ACPI. Add ACPI init support for the driver in pci-loongson.c because it is currently FDT-only. LoongArch is a new RISC ISA, mainline support will come soon, and documentations are here (in translation): https://github.com/loongson/LoongArch-Documentation Link: https://lore.kernel.org/r/20220714124216.1489304-4-chenhuacai@loongson.cnSigned-off-by: NHuacai Chen <chenhuacai@loongson.cn> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Jianmin Lv 提交于
mainline inclusion from mainline-v6.0-rc1 commit e8bba72b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OHOB CVE: NA -------------------------------- For LoongArch, ACPI_IRQ_MODEL_LPIC is introduced, and then the callback acpi_get_gsi_domain_id and acpi_gsi_to_irq_fallback are implemented. The acpi_get_gsi_domain_id callback returns related fwnode handle of irqdomain for different GSI range. The acpi_gsi_to_irq_fallback will create new mapping for gsi when the mapping of it is not found. Signed-off-by: NJianmin Lv <lvjianmin@loongson.cn> Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1658314292-35346-14-git-send-email-lvjianmin@loongson.cn
-
由 Marc Zyngier 提交于
mainline inclusion from mainline-v6.0-rc1 commit 744b9a0c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OHOB CVE: NA -------------------------------- It appears that the generic version of acpi_gsi_to_irq() doesn't fallback to establishing a mapping if there is no pre-existing one while the x86 version does. While arm64 seems unaffected by it, LoongArch is relying on the x86 behaviour. In an effort to prevent new architectures from reinventing the proverbial wheel, provide an optional callback that the arch code can set to restore the x86 behaviour. Hopefully we can eventually get rid of this in the future once the expected behaviour has been clarified. Reported-by: NJianmin Lv <lvjianmin@loongson.cn> Signed-off-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NJianmin Lv <lvjianmin@loongson.cn> Tested-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/1658314292-35346-4-git-send-email-lvjianmin@loongson.cn Change-Id: I38ccf8bef562439fd434b37de7da4acb4e07e3d9
-
由 Marc Zyngier 提交于
mainline inclusion from mainline-v6.0-rc1 commit 7327b16f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OHOB CVE: NA -------------------------------- In an unfortunate departure from the ACPI spec, the LoongArch architecture split its GSI space across multiple interrupt controllers. In order to be able to reuse the core code and prevent architectures from reinventing an already square wheel, offer the arch code the ability to register a dispatcher function that will return the domain fwnode for a given GSI. The ARM GIC drivers are updated to support this (with a single domain, as intended). Signed-off-by: NMarc Zyngier <maz@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NJianmin Lv <lvjianmin@loongson.cn> Tested-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/1658314292-35346-3-git-send-email-lvjianmin@loongson.cn
-
由 Huacai Chen 提交于
mainline inclusion from mainline-v5.19-rc1 commit 439057ec category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OHOB CVE: NA -------------------------------- LoongArch maintains cache coherency in hardware, but its WUC attribute (Weak-ordered UnCached, which is similar to WC) is out of the scope of cache coherency machanism. This means WUC can only used for write-only memory regions. Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Reviewed-by: NWANG Xuerui <git@xen0n.name> Reviewed-by: NJiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: NHuacai Chen <chenhuacai@loongson.cn>
-