- 10 May, 2022 40 commits
-
-
Committed by Guan Jing
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I52611 CVE: NA Signed-off-by: Guan Jing <guanjing6@huawei.com> Reviewed-by: Chen Hui <judy.chenhui@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Guan Jing
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I52611 CVE: NA -------------------------------- There are two cases in which we add a tracepoint: a) while an online task is running on the sibling cpu, the running offline task of the local cpu is set TIF_NEED_RESCHED; b) while an online task is running on the sibling cpu, the next picked offline task of the local cpu is expelled. Signed-off-by: Guan Jing <guanjing6@huawei.com> Reviewed-by: Chen Hui <judy.chenhui@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Guan Jing
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I52611 CVE: NA -------------------------------- We have added two statistics for the qos smt expeller: a) nr_qos_smt_send_ipi: the number of IPIs sent by online tasks to expel offline tasks; b) nr_qos_smt_expelled: the number of times an offline task was not picked. Signed-off-by: Guan Jing <guanjing6@huawei.com> Reviewed-by: Chen Hui <judy.chenhui@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Guan Jing
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I52611 CVE: NA -------------------------------- We implement the qos smt expeller through the following two points: a) when online tasks and offline tasks are running on the same physical cpu, online tasks send an IPI to expel offline tasks on the smt sibling cpus; b) while online tasks are running, the smt sibling cpus do not allow offline tasks to be selected. Signed-off-by: Guan Jing <guanjing6@huawei.com> Reviewed-by: Chen Hui <judy.chenhui@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
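The two points above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct cpu`, `enum task_class`, `smt_start_online()` and `smt_pick_next()` are hypothetical names invented for the illustration, a boolean flag stands in for the IPI and TIF_NEED_RESCHED, and the real scheduler code is considerably more involved. The counter names follow the statistics commit in this series.

```c
#include <stdbool.h>

/* Hypothetical task classes; the kernel marks online/offline tasks differently. */
enum task_class { TASK_NONE, TASK_ONLINE, TASK_OFFLINE };

/* Hypothetical per-CPU state for one SMT thread. */
struct cpu {
	enum task_class curr;              /* class of the task currently running */
	bool need_resched;                 /* stands in for TIF_NEED_RESCHED */
	unsigned long nr_qos_smt_send_ipi; /* IPIs sent to expel offline tasks */
	unsigned long nr_qos_smt_expelled; /* times an offline task was not picked */
	struct cpu *sibling;               /* the other SMT thread of the same core */
};

/* Point a): an online task starts running and expels an offline sibling task. */
void smt_start_online(struct cpu *c)
{
	c->curr = TASK_ONLINE;
	if (c->sibling->curr == TASK_OFFLINE) {
		c->sibling->need_resched = true; /* models the IPI */
		c->nr_qos_smt_send_ipi++;
	}
}

/* Point b): an offline candidate is refused while the sibling runs an online task. */
enum task_class smt_pick_next(struct cpu *c, enum task_class candidate)
{
	if (candidate == TASK_OFFLINE && c->sibling->curr == TASK_ONLINE) {
		c->nr_qos_smt_expelled++;
		candidate = TASK_NONE; /* the CPU goes idle instead */
	}
	c->curr = candidate;
	c->need_resched = false;
	return candidate;
}
```

In this model an offline task can be picked again as soon as the sibling stops running an online task.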
-
Committed by Guan Jing
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I52611 CVE: NA -------------------------------- We introduce the qos smt expeller, which lets online tasks expel offline tasks on the smt sibling cpus and exclusively occupy CPU resources. In this way we are able to improve the QoS of online tasks in co-location. Signed-off-by: Guan Jing <guanjing6@huawei.com> Reviewed-by: Chen Hui <judy.chenhui@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Ye Bin
mainline inclusion from mainline-v5.18-rc4 commit a2b0b205 category: bugfix bugzilla: 186450, https://gitee.com/openeuler/kernel/issues/I4YSJ7 CVE: NA ----------------------------------------------- We got the following issue: [home]# fsck.ext4 -fn ram0yb e2fsck 1.45.6 (20-Mar-2020) Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Symlink /p3/d14/d1a/l3d (inode #3494) is invalid. Clear? no Entry 'l3d' in /p3/d14/d1a (3383) has an incorrect filetype (was 7, should be 0). Fix? no This happens because the symlink file size does not match the file content. If the writeback of the symlink data block failed, ext4_finish_bio() handles the end of IO. However, this function fails to mark the buffer with BH_write_io_error, so when unmount does the journal checkpoint it cannot detect the writeback error and cleans up the journal. Thus we lose the correct data in the journal area. To solve this issue, mark the buffer as BH_write_io_error in ext4_finish_bio(). Cc: stable@kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220321144438.201685-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Ye Bin
mainline inclusion from mainline-v5.18-rc4 commit b98535d0 category: bugfix bugzilla: 186675, https://gitee.com/openeuler/kernel/issues/I55TUC CVE: NA ------------------------------------------------- We got issue as follows: ------------[ cut here ]------------ kernel BUG at fs/jbd2/transaction.c:389! invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 9 PID: 131 Comm: kworker/9:1 Not tainted 5.17.0-862.14.0.6.x86_64-00001-g23f87daf7d74-dirty #197 Workqueue: events flush_stashed_error_work RIP: 0010:start_this_handle+0x41c/0x1160 RSP: 0018:ffff888106b47c20 EFLAGS: 00010202 RAX: ffffed10251b8400 RBX: ffff888128dc204c RCX: ffffffffb52972ac RDX: 0000000000000200 RSI: 0000000000000004 RDI: ffff888128dc2050 RBP: 0000000000000039 R08: 0000000000000001 R09: ffffed10251b840a R10: ffff888128dc204f R11: ffffed10251b8409 R12: ffff888116d78000 R13: 0000000000000000 R14: dffffc0000000000 R15: ffff888128dc2000 FS: 0000000000000000(0000) GS:ffff88839d680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000001620068 CR3: 0000000376c0e000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> jbd2__journal_start+0x38a/0x790 jbd2_journal_start+0x19/0x20 flush_stashed_error_work+0x110/0x2b3 process_one_work+0x688/0x1080 worker_thread+0x8b/0xc50 kthread+0x26f/0x310 ret_from_fork+0x22/0x30 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- Above issue may happen as follows: umount read procfs error_work ext4_put_super flush_work(&sbi->s_error_work); ext4_mb_seq_groups_show ext4_mb_load_buddy_gfp ext4_mb_init_group ext4_mb_init_cache ext4_read_block_bitmap_nowait ext4_validate_block_bitmap ext4_error ext4_handle_error schedule_work(&EXT4_SB(sb)->s_error_work); ext4_unregister_sysfs(sb); jbd2_journal_destroy(sbi->s_journal); journal_kill_thread journal->j_flags |= JBD2_UNMOUNT; flush_stashed_error_work 
jbd2_journal_start start_this_handle BUG_ON(journal->j_flags & JBD2_UNMOUNT); To solve this issue, we call ext4_unregister_sysfs() before flushing s_error_work in ext4_put_super(). Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/20220322012419.725457-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> conflicts: fs/ext4/super.c Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Reviewed-by: yebin <yebin10@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Ye Bin
mainline inclusion from mainline-v5.18-rc4 commit c186f088 category: bugfix bugzilla: 186477, https://gitee.com/openeuler/kernel/issues/I55UHT CVE: NA ------------------------------------------------- We got issue as follows: EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue ================================================================== BUG: KASAN: use-after-free in ext4_search_dir fs/ext4/namei.c:1394 [inline] BUG: KASAN: use-after-free in search_dirblock fs/ext4/namei.c:1199 [inline] BUG: KASAN: use-after-free in __ext4_find_entry+0xdca/0x1210 fs/ext4/namei.c:1553 Read of size 1 at addr ffff8881317c3005 by task syz-executor117/2331 CPU: 1 PID: 2331 Comm: syz-executor117 Not tainted 5.10.0+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:83 [inline] dump_stack+0x144/0x187 lib/dump_stack.c:124 print_address_description+0x7d/0x630 mm/kasan/report.c:387 __kasan_report+0x132/0x190 mm/kasan/report.c:547 kasan_report+0x47/0x60 mm/kasan/report.c:564 ext4_search_dir fs/ext4/namei.c:1394 [inline] search_dirblock fs/ext4/namei.c:1199 [inline] __ext4_find_entry+0xdca/0x1210 fs/ext4/namei.c:1553 ext4_lookup_entry fs/ext4/namei.c:1622 [inline] ext4_lookup+0xb8/0x3a0 fs/ext4/namei.c:1690 __lookup_hash+0xc5/0x190 fs/namei.c:1451 do_rmdir+0x19e/0x310 fs/namei.c:3760 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x445e59 Code: 4d c7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b c7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff2277fac8 EFLAGS: 00000246 ORIG_RAX: 0000000000000054 RAX: ffffffffffffffda RBX: 0000000000400280 RCX: 0000000000445e59 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000200000c0 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000002 R10: 
00007fff2277f990 R11: 0000000000000246 R12: 0000000000000000 R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000 The buggy address belongs to the page: page:0000000048cd3304 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x1317c3 flags: 0x200000000000000() raw: 0200000000000000 ffffea0004526588 ffffea0004528088 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8881317c2f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8881317c2f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8881317c3000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff8881317c3080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8881317c3100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== ext4_search_dir: ... de = (struct ext4_dir_entry_2 *)search_buf; dlimit = search_buf + buf_size; while ((char *) de < dlimit) { ... if ((char *) de + de->name_len <= dlimit && ext4_match(dir, fname, de)) { ... } ... de_len = ext4_rec_len_from_disk(de->rec_len, dir->i_sb->s_blocksize); if (de_len <= 0) return -1; offset += de_len; de = (struct ext4_dir_entry_2 *) ((char *) de + de_len); } Assume: de=0xffff8881317c2fff dlimit=0x0xffff8881317c3000 If read 'de->name_len' which address is 0xffff8881317c3005, obviously is out of range, then will trigger use-after-free. To solve this issue, 'dlimit' must reserve 8 bytes, as we will read 'de->name_len' to judge if '(char *) de + de->name_len' out of range. 
Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220324064816.1209985-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Reviewed-by: yebin <yebin10@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
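The idea of reserving the fixed 8-byte entry header before reading 'name_len' can be sketched in user space as follows. This is a simplified illustration, not the ext4 patch: `search_dir()` and `DIRENT_HDR_LEN` are hypothetical names, the entry layout (4-byte inode, 2-byte little-endian rec_len, 1-byte name_len, 1-byte file type, then the name) is only modeled on ext4_dir_entry_2, and ext4's real rec_len decoding and unused-entry handling are ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of the simplified entry: inode(4) rec_len(2) name_len(1) type(1). */
#define DIRENT_HDR_LEN 8

/*
 * Return the byte offset of the entry whose name matches, or -1.
 * No header field is read unless the whole header lies inside the buffer.
 */
long search_dir(const unsigned char *buf, size_t buf_size,
		const char *name, size_t name_len)
{
	size_t off = 0;

	/* Reserve DIRENT_HDR_LEN bytes: stop before a header could cross the end. */
	while (off + DIRENT_HDR_LEN <= buf_size) {
		const unsigned char *de = buf + off;
		uint16_t rec_len = (uint16_t)(de[4] | (de[5] << 8));
		uint8_t de_name_len = de[6];

		/* The name itself must also fit before it is compared. */
		if (off + DIRENT_HDR_LEN + de_name_len <= buf_size &&
		    de_name_len == name_len &&
		    memcmp(de + DIRENT_HDR_LEN, name, name_len) == 0)
			return (long)off;

		/* A record shorter than its own header is treated as corrupt. */
		if (rec_len < DIRENT_HDR_LEN)
			return -1;
		off += rec_len;
	}
	return -1;
}
```

With the unguarded loop condition from the report, an entry starting one byte before the end of the buffer would still have its 'name_len' read from beyond the buffer; here the loop exits first.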
-
Committed by Janis Schoetterl-Glausch
stable inclusion from stable-v5.10.100 commit b62267b8b06e9b8bb429ae8f962ee431e6535d60 bugzilla: https://gitee.com/src-openeuler/kernel/issues/I4U746 CVE: CVE-2022-0516 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b62267b8b06e9b8bb429ae8f962ee431e6535d60 -------------------------------- commit 2c212e1b upstream. Refuse SIDA memops on guests which are not protected. For normal guests, the secure instruction data address designation, which determines the location we access, is not under control of KVM. Fixes: 19e12277 (KVM: S390: protvirt: Introduce instruction data area bounce buffer) Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Reviewed-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zhihao Cheng
hulk inclusion category: bugfix bugzilla: 185955, https://gitee.com/openeuler/kernel/issues/I55AKK CVE: NA backport: openEuler-22.03-LTS -------------------------------- Commit 505a666e ("writeback: plug writeback in wb_writeback() and writeback_inodes_wb()") has us holding a plug during wb_writeback, which may cause a potential ABBA dead lock: wb_writeback fat_file_fsync blk_start_plug(&plug) for (;;) { iter i-1: some reqs have been added into plug->mq_list // LOCK A iter i: progress = __writeback_inodes_wb(wb, work) . writeback_sb_inodes // fat's bdev . __writeback_single_inode . . generic_writepages . . __block_write_full_page . . . . __generic_file_fsync . . . . sync_inode_metadata . . . . writeback_single_inode . . . . __writeback_single_inode . . . . fat_write_inode . . . . __fat_write_inode . . . . sync_dirty_buffer // fat's bdev . . . . lock_buffer(bh) // LOCK B . . . . submit_bh . . . . blk_mq_get_tag // LOCK A . . . trylock_buffer(bh) // LOCK B . . . redirty_page_for_writepage . . . wbc->pages_skipped++ . . --wbc->nr_to_write . wrote += write_chunk - wbc.nr_to_write // wrote > 0 . requeue_inode . redirty_tail_locked if (progress) // progress > 0 continue; iter i+1: queue_io // similar process with iter i, infinite for-loop ! } blk_finish_plug(&plug) // flush plug won't be called Above process triggers a hungtask like: [ 399.044861] INFO: task bb:2607 blocked for more than 30 seconds. 
[ 399.046824] Not tainted 5.18.0-rc1-00005-gefae4d9eb6a2-dirty [ 399.051539] task:bb state:D stack: 0 pid: 2607 ppid: 2426 flags:0x00004000 [ 399.051556] Call Trace: [ 399.051570] __schedule+0x480/0x1050 [ 399.051592] schedule+0x92/0x1a0 [ 399.051602] io_schedule+0x22/0x50 [ 399.051613] blk_mq_get_tag+0x1d3/0x3c0 [ 399.051640] __blk_mq_alloc_requests+0x21d/0x3f0 [ 399.051657] blk_mq_submit_bio+0x68d/0xca0 [ 399.051674] __submit_bio+0x1b5/0x2d0 [ 399.051708] submit_bio_noacct+0x34e/0x720 [ 399.051718] submit_bio+0x3b/0x150 [ 399.051725] submit_bh_wbc+0x161/0x230 [ 399.051734] __sync_dirty_buffer+0xd1/0x420 [ 399.051744] sync_dirty_buffer+0x17/0x20 [ 399.051750] __fat_write_inode+0x289/0x310 [ 399.051766] fat_write_inode+0x2a/0xa0 [ 399.051783] __writeback_single_inode+0x53c/0x6f0 [ 399.051795] writeback_single_inode+0x145/0x200 [ 399.051803] sync_inode_metadata+0x45/0x70 [ 399.051856] __generic_file_fsync+0xa3/0x150 [ 399.051880] fat_file_fsync+0x1d/0x80 [ 399.051895] vfs_fsync_range+0x40/0xb0 [ 399.051929] __x64_sys_fsync+0x18/0x30 In my test, 'need_resched()' (which was introduced by 590dca3a "fs-writeback: unplug before cond_resched in writeback_sb_inodes") in function 'writeback_sb_inodes()' seldom becomes true, unless cond_resched() is deleted from write_cache_pages(). Fix it by correcting the wrote count according to the number of skipped pages in writeback_sb_inodes(). See the Link for a reproducer. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215837 Cc: stable@vger.kernel.org # v4.3 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
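The correction described above can be sketched as a small piece of arithmetic. This is an assumption-laden illustration rather than the patch itself: `struct wbc_sketch` and `pages_wrote()` are hypothetical names, and only the two writeback_control fields named in the commit message (nr_to_write and pages_skipped) are modeled. The point is that a page which was redirtied and skipped also decrements nr_to_write, so it has to be subtracted again before the result is used as progress.

```c
/* Hypothetical stand-in for the two writeback_control fields used here. */
struct wbc_sketch {
	long nr_to_write;   /* remaining budget after the writeback pass */
	long pages_skipped; /* pages that were redirtied instead of written */
};

/* Pages really written in one pass, never negative. */
long pages_wrote(long write_chunk, const struct wbc_sketch *wbc)
{
	long wrote = write_chunk - wbc->nr_to_write - wbc->pages_skipped;

	return wrote < 0 ? 0 : wrote;
}
```

With a single skipped page and nothing else written, the result is 0, so a loop of the `if (progress) continue;` kind no longer sees false progress.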
-
Committed by Zhihao Cheng
hulk inclusion category: bugfix bugzilla: 185955, https://gitee.com/openeuler/kernel/issues/I50DVI?from=project-issue -------------------------------- There at least 6 PEBs reserved on UBI device: 1. EBA_RESERVED_PEBS[1] 2. WL_RESERVED_PEBS[1] 3. UBI_LAYOUT_VOLUME_EBS[2] 4. MIN_FASTMAP_RESERVED_PEBS[2] When all ubi volumes take all their PEBs, there are 3 (EBA_RESERVED_PEBS + WL_RESERVED_PEBS + MIN_FASTMAP_RESERVED_PEBS - MIN_FASTMAP_TAKEN_PEBS[1]) free PEBs. Since f9c34bb5 ("ubi: Fix producing anchor PEBs") and 4b68bf9a ("ubi: Select fastmap anchor PEBs considering wear level rules") applied, there is only 1 (3 - FASTMAP_ANCHOR_PEBS[1] - FASTMAP_NEXT_ANCHOR_PEBS[1]) free PEB to fill pool and wl_pool, after filling pool, wl_pool is always empty. So, UBI could be stuck in an infinite loop: ubi_thread system_wq wear_leveling_worker <-------------------------------------------------- get_peb_for_wl | // fm_wl_pool, used = size = 0 | schedule_work(&ubi->fm_work) | | update_fastmap_work_fn | ubi_update_fastmap | ubi_refill_pools | // ubi->free_count - ubi->beb_rsvd_pebs < 5 | // wl_pool is not filled with any PEBs | schedule_erase(old_fm_anchor) | ubi_ensure_anchor_pebs | __schedule_ubi_work(wear_leveling_worker) | | __erase_worker | ensure_wear_leveling | __schedule_ubi_work(wear_leveling_worker) -------------------------- , which cause high cpu usage of ubi_bgt: top - 12:10:42 up 5 min, 2 users, load average: 1.76, 0.68, 0.27 Tasks: 123 total, 3 running, 54 sleeping, 0 stopped, 0 zombie PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1589 root 20 0 0 0 0 R 45.0 0.0 0:38.86 ubi_bgt0d 319 root 20 0 0 0 0 I 15.2 0.0 0:15.29 kworker/0:3-eve 371 root 20 0 0 0 0 I 14.9 0.0 0:12.85 kworker/3:3-eve 20 root 20 0 0 0 0 I 11.3 0.0 0:05.33 kworker/1:0-eve 202 root 20 0 0 0 0 I 11.3 0.0 0:04.93 kworker/2:3-eve In 4b68bf9a ("ubi: Select fastmap anchor PEBs considering wear level rules"), there are three key changes: 1) Choose the fastmap anchor when the most free PEBs are 
available. 2) Enable anchor move within the anchor area again as it is useful for distributing wear. 3) Import a candidate fm anchor and check this PEB's erase count during wear leveling. If the wear leveling limit is exceeded, use the used anchor area PEB with the lowest erase count to replace it. The anchor candidate can be removed, since we can check the fm_anchor PEB's erase count during wear leveling. Fix it by: 1) Removing 'fm_next_anchor' and checking 'fm_anchor' during wear leveling. 2) Preferentially filling one free PEB into fm_wl_pool when ubi->free_count > ubi->beb_rsvd_pebs, then trying to reserve enough free count for fastmap non-anchor PEBs after the above prerequisites are met. Then there is at least 1 PEB in pool and 1 PEB in wl_pool after calling ubi_refill_pools() with all erase works done. Fetch a reproducer in [Link]. Fixes: 4b68bf9a ("ubi: Select fastmap anchor PEBs ... rules") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215407 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> v1->v2: Update fm pool filling strategy, consider reserving enough free count for fastmap non-anchor PEBs while filling fm_wl_pool. v2->v3: Remove 'fm_next_anchor' and check 'fm_anchor' during wear leveling. v3->v4: Reserve the 'fm_next_anchor' member in 'ubi_device' to keep the kABI unchanged. Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
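The PEB accounting in this commit message can be checked with a few lines of arithmetic. The constant names and values below are copied from the bracketed numbers in the message, not from the kernel headers, and the three helper functions are hypothetical names used only for this sketch.

```c
/* Values as given in brackets in the commit message (assumed, not from headers). */
enum {
	EBA_RESERVED_PEBS         = 1,
	WL_RESERVED_PEBS          = 1,
	UBI_LAYOUT_VOLUME_EBS     = 2,
	MIN_FASTMAP_RESERVED_PEBS = 2,
	MIN_FASTMAP_TAKEN_PEBS    = 1,
	FASTMAP_ANCHOR_PEBS       = 1,
	FASTMAP_NEXT_ANCHOR_PEBS  = 1,
};

/* Minimum number of PEBs reserved on a UBI device. */
int ubi_min_reserved_pebs(void)
{
	return EBA_RESERVED_PEBS + WL_RESERVED_PEBS +
	       UBI_LAYOUT_VOLUME_EBS + MIN_FASTMAP_RESERVED_PEBS;
}

/* Free PEBs left once every volume has taken all of its PEBs. */
int ubi_free_pebs_when_full(void)
{
	return EBA_RESERVED_PEBS + WL_RESERVED_PEBS +
	       MIN_FASTMAP_RESERVED_PEBS - MIN_FASTMAP_TAKEN_PEBS;
}

/* Free PEBs left for pool and wl_pool after both anchors are set aside. */
int ubi_free_pebs_for_pools(void)
{
	return ubi_free_pebs_when_full() -
	       FASTMAP_ANCHOR_PEBS - FASTMAP_NEXT_ANCHOR_PEBS;
}
```

The three results are 6, 3 and 1, matching the message: a single free PEB cannot fill both pool and wl_pool, which is why wl_pool stays empty.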
-
Committed by Yang Jihong
maillist inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53L83 CVE: NA Reference: https://lore.kernel.org/all/20210104020930.GA4897@leoy-ThinkPad-X240s/ ------------------- Since the new display option 'all' has been introduced, this patch updates the documentation to reflect it. Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Yang Jihong
maillist inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53L83 CVE: NA Reference: https://lore.kernel.org/all/20210104020930.GA4897@leoy-ThinkPad-X240s/ ------------------- Except the existed three display options 'tot', 'rmt', 'lcl', this patch adds option 'all' so can sort on the all cache hit for load operation. This new introduced option can be a choice for profiling cache false sharing if the memory event doesn't contain HITM tags. For displaying with option 'all', the "Shared Data Cache Line Table" and "Shared Cache Line Distribution Pareto" both have difference comparing to other three display options. For the "Shared Data Cache Line Table", instead of sorting HITM metrics, it sorts with the metrics "tot_ld_hit" and "percent_tot_ld_hit". If without HITM metrics, users can analyze the load hit statistics for all cache levels, so the dimensions of total load hit is used to replace HITM dimensions. For Pareto, every single cache line shows the metrics "cl_tot_ld_hit" and "cl_tot_ld_miss" instead of "cl_rmt_hitm" and "percent_lcl_hitm", and the single cache line view is sorted by metrics "tot_ld_hit". As result, we can get the 'all' display as follows: # perf c2c report -d all --coalesce tid,pid,iaddr,dso --stdio [...] ================================================= Shared Data Cache Line Table ================================================= # # ----------- Cacheline ---------- Load Hit Load Hit Total Total Total ---- Stores ---- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ---- # Index Address Node PA cnt Pct Total records Loads Stores L1Hit L1Miss FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt # ..... .................. .... ...... ........ ........ ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........ 
# 0 0x556f25dff100 0 1895 75.73% 4591 7840 4591 3249 2633 616 849 2734 67 58 883 0 0 0 0 1 0x556f25dff080 0 1 13.10% 794 794 794 0 0 0 164 486 28 20 96 0 0 0 0 2 0x556f25dff0c0 0 1 10.01% 607 607 607 0 0 0 107 5 5 488 2 0 0 0 0 ================================================= Shared Cache Line Distribution Pareto ================================================= # # -- Load Refs -- -- Store Refs -- --------- Data address --------- ---------- cycles ---------- Total cpu Shared # Num Hit Miss L1 Hit L1 Miss Offset Node PA cnt Pid Tid Code address rmt hitm lcl hitm load records cnt Symbol Object Source:Line Node # ..... ....... ....... ....... ....... .................. .... ...... ....... .................. .................. ........ ........ ........ ....... ........ ................... ................. ........................... .... # ------------------------------------------------------------- 0 4591 0 2633 616 0x556f25dff100 ------------------------------------------------------------- 20.52% 0.00% 0.00% 0.00% 0x0 0 1 28079 28082:lock_th 0x556f25bfdc1d 0 2200 1276 942 1 [.] read_write_func false_sharing.exe false_sharing_example.c:146 0 19.82% 0.00% 38.06% 0.00% 0x0 0 1 28079 28082:lock_th 0x556f25bfdc16 0 2190 1130 1912 1 [.] read_write_func false_sharing.exe false_sharing_example.c:145 0 18.25% 0.00% 56.63% 0.00% 0x0 0 1 28079 28081:lock_th 0x556f25bfdc16 0 2173 1074 2329 1 [.] read_write_func false_sharing.exe false_sharing_example.c:145 0 18.23% 0.00% 0.00% 0.00% 0x0 0 1 28079 28081:lock_th 0x556f25bfdc1d 0 2013 1220 837 1 [.] read_write_func false_sharing.exe false_sharing_example.c:146 0 0.00% 0.00% 3.11% 59.90% 0x0 0 1 28079 28081:lock_th 0x556f25bfdc28 0 0 0 451 1 [.] read_write_func false_sharing.exe false_sharing_example.c:146 0 0.00% 0.00% 2.20% 40.10% 0x0 0 1 28079 28082:lock_th 0x556f25bfdc28 0 0 0 305 1 [.] 
read_write_func false_sharing.exe false_sharing_example.c:146 0 12.00% 0.00% 0.00% 0.00% 0x20 0 1 28079 28083:reader_thd 0x556f25bfdc73 0 159 107 551 1 [.] read_write_func false_sharing.exe false_sharing_example.c:155 0 11.17% 0.00% 0.00% 0.00% 0x20 0 1 28079 28084:reader_thd 0x556f25bfdc73 0 148 108 513 1 [.] read_write_func false_sharing.exe false_sharing_example.c:155 0 [...] Signed-off-by: Leo Yan <leo.yan@linaro.org> conflict: tools/perf/builtin-c2c.c Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
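In the "Shared Data Cache Line Table" sample above, the "Load Hit Total" value of each row appears to equal the sum of the per-level load hit columns (FB, L1, L2, LclHit, LclHitm, RmtHit, RmtHitm). The sketch below reproduces that sum; `struct ld_hits`, `tot_ld_hit()` and `percent()` are hypothetical names chosen after the column headers, and this is an inference from the sample numbers, not a copy of the perf implementation.

```c
/* Hypothetical per-cache-line load hit counters, named after the table columns. */
struct ld_hits {
	unsigned int fb;       /* fill buffer hits */
	unsigned int l1;       /* L1 hits */
	unsigned int l2;       /* L2 hits */
	unsigned int lcl_hit;  /* local LLC hits */
	unsigned int lcl_hitm; /* local LLC HITM */
	unsigned int rmt_hit;  /* remote hits */
	unsigned int rmt_hitm; /* remote HITM */
};

/* Total load hits over all cache levels, used as the sort key for 'all'. */
unsigned int tot_ld_hit(const struct ld_hits *h)
{
	return h->fb + h->l1 + h->l2 + h->lcl_hit +
	       h->lcl_hitm + h->rmt_hit + h->rmt_hitm;
}

/* Share of 'part' in 'total' as a percentage; 0 when total is 0. */
double percent(unsigned int part, unsigned int total)
{
	return total ? 100.0 * part / total : 0.0;
}
```

For the three rows shown, the sums come out as 4591, 794 and 607, the values printed in the "Load Hit Total" column. The percentages in the table are relative to all cache lines, including the ones elided by "[...]", so they cannot be recomputed from these three rows alone.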
-
Committed by Yang Jihong
maillist inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53L83 CVE: NA Reference: https://lore.kernel.org/all/20210104020930.GA4897@leoy-ThinkPad-X240s/ ------------------- The node header array contains 3 items; each item is used for one of the 3 flavors of node accessing info. To extend sorting to all load references and not always stick to HITMs, the second header string "Node{cpus %hitms %stores}" should be adjusted (e.g. changed to "Node{cpus %loads %stores}"). For this reason, this patch changes the node header array to three flat variables and uses switch-case in function setup_nodes_header(), which makes it easier to alter the header string. Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Yang Jihong
maillist inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53L83 CVE: NA Reference: https://lore.kernel.org/all/20210104020930.GA4897@leoy-ThinkPad-X240s/ ------------------- Add dimensions for load miss and its percentage calculation, which are to be displayed in the single cache line output. Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Yang Jihong
maillist inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53L83 CVE: NA Reference: https://lore.kernel.org/all/20210104020930.GA4897@leoy-ThinkPad-X240s/ ------------------- Add dimensions for load hit and its percentage calculation, which are to be displayed in the single cache line output. Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Yang Jihong
maillist inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53L83 CVE: NA Reference: https://lore.kernel.org/all/20210104020930.GA4897@leoy-ThinkPad-X240s/ ------------------- Arm SPE trace data doesn't support HITM, but we still want to use the "perf c2c" tool to analyze cache false sharing. Without the HITM tag, the tool cannot give accurate results for cache false sharing. A candidate solution is to sort by the total load operations and connect them with the thread info: e.g. if multiple threads hit the same cache line many times, this hints that the line is likely to cause a cache false sharing issue. Unlike the HITM tag, the proposed solution is not accurate and might introduce false positives, but it is a pragmatic approach for detecting false sharing if the memory event doesn't support HITM. To sort by cache line hits, this patch adds dimensions for total load hit and the associated percentage calculation. Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Thomas Gleixner
mainline inclusion from mainline-v5.17-rc1 commit 24ee940d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I53K0E CVE: NA ------------------------------------------------------------------------- While reporting a quiescent state for a given CPU, rcu_core() takes advantage of the freshly loaded grace period sequence number and the locked rnp to accelerate the callbacks whose sequence numbers have been assigned a stale value. This action is only necessary when the rdp isn't offloaded, otherwise the NOCB kthreads already take care of the callbacks progression. However, the check for the offloaded state is volatile because it is performed outside the IRQs disabled section. It's possible for the offloading process to preempt rcu_core() at that point on PREEMPT_RT. This is dangerous because rcu_core() may end up accelerating callbacks concurrently with NOCB kthreads without appropriate locking. Fix this by moving the offloaded check inside the rnp locking section. Reported-and-tested-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Conflicts: kernel/rcu/tree.c Move "const bool offloaded = ..." down, so that it is within the irq disabled protection range, with minimal changes.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Cheng Jian <cj.chengjian@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zheng Yejian
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZII6 -------------------------------- Kernel panic happened on 'arm64 big endian' board after calling function that has been live-patched. It can be reproduced as follows: 1. Insert 'livepatch-sample.ko' to patch function 'cmdline_proc_show'; 2. Enable patch by execute: echo 1 > /sys/kernel/livepatch/livepatch-sample/enabled 3. Call 'cmdline_proc_show' by execute: cat /proc/cmdline 4. Then we get following panic logs: > kernel BUG at arch/arm64/kernel/traps.c:408! > Internal error: Oops - BUG: 0 [#1] SMP > Modules linked in: dump_mem(OE) livepatch_cmdline1(OEK) > [last unloaded: dump_mem] > CPU: 3 PID: 1752 Comm: cat Session: 0 Tainted: G OE K > 5.10.0+ #2 > Hardware name: Hisilicon PhosphorHi1382 (DT) > pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--) > pc : do_undefinstr+0x23c/0x2b4 > lr : do_undefinstr+0x5c/0x2b4 > sp : ffffffc010ac3a80 > x29: ffffffc010ac3a80 x28: ffffff82eb0a8000 > x27: 0000000000000000 x26: 0000000000000001 > x25: 0000000000000000 x24: 0000000000001000 > x23: 0000000000000000 x22: ffffffd0e0f16000 > x21: ffffffd0e0ae7000 x20: ffffffc010ac3b00 > x19: 0000000000021fd6 x18: ffffffd0e04aad94 > x17: 0000000000000000 x16: 0000000000000000 > x15: ffffffd0e04b519c x14: 0000000000000000 > x13: 0000000000000000 x12: 0000000000000000 > x11: 0000000000000000 x10: 0000000000000000 > x9 : 0000000000000000 x8 : 0000000000000000 > x7 : 0000000000000000 x6 : ffffffd0e0f16100 > x5 : 0000000000000000 x4 : 00000000d5300000 > x3 : 0000000000000000 x2 : ffffffd0e0f160f0 > x1 : ffffffd0e0f16103 x0 : 0000000000000005 > Call trace: > do_undefinstr+0x23c/0x2b4 > el1_undef+0x2c/0x44 > el1_sync_handler+0xa4/0xb0 > el1_sync+0x74/0x100 > cmdline_proc_show+0xc/0x44 > proc_reg_read_iter+0xb0/0xc4 > new_sync_read+0x10c/0x15c > vfs_read+0x144/0x18c > ksys_read+0x78/0xe8 > __arm64_sys_read+0x24/0x30 We compare first 6 instructions of 'cmdline_proc_show' before and after patch (see below). 
Four instructions are modified, so this is the case where the offset between the old and new function exceeds 128M. We found that the instruction at 'cmdline_proc_show+0xc' seems incorrect (it is expected to be '00021fd6'). origin: patched: -------- -------- fd7bbea9 929ff7f0 21d500f0 f2a91b30 fd030091 f2d00010 211040f9 d61f0200 <-- cmdline_proc_show+0xc (expected '00021fd6') f30b00f9 f30b00f9 f30300aa f30300aa It is caused by an incorrect big-to-little endian conversion, which we correct. Fixes: e429c61d livepatch/arm64: Support livepatch without ftrace Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Reviewed-by: Kuohai Xu <xukuohai@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
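A64 instructions are stored in memory in little-endian byte order even when the kernel runs big-endian, which is why the bytes at 'cmdline_proc_show+0xc' were expected to read '00021fd6' rather than 'd61f0200'. The following user-space sketch shows a host-independent way to lay out an instruction word in that order; `put_insn_le()` is a hypothetical helper for illustration and not the function touched by the patch. In kernel code, cpu_to_le32() is the usual helper for this kind of conversion.

```c
#include <stdint.h>

/*
 * Write a 32-bit instruction word into memory in little-endian byte order,
 * independent of the endianness of the CPU running this code.
 */
void put_insn_le(uint8_t out[4], uint32_t insn)
{
	out[0] = (uint8_t)(insn & 0xff);         /* least significant byte first */
	out[1] = (uint8_t)((insn >> 8) & 0xff);
	out[2] = (uint8_t)((insn >> 16) & 0xff);
	out[3] = (uint8_t)((insn >> 24) & 0xff); /* most significant byte last */
}
```

Passing the word 0xd61f0200 yields the byte sequence 00 02 1f d6, the value the commit message expects at that address.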
-
Committed by Zheng Yejian
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53WZ9 -------------------------------- Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Reviewed-by: Kuohai Xu <xukuohai@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zheng Yejian
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53WZ9 -------------------------------- Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Reviewed-by: Kuohai Xu <xukuohai@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zheng Yejian
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53WZ9 -------------------------------- Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Reviewed-by: Kuohai Xu <xukuohai@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zheng Yejian
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I53WZ9

--------------------------------

Currently, when unpatching a function, we check whether 'func_stack' has only one item and then delete it:
> if (list_is_singular(&func_node->func_stack)) {
>     list_del_rcu(&func->stack_node);
>     ......
> } else {
>     list_del_rcu(&func->stack_node);
>     next_func = list_first_or_null_rcu(&func_node->func_stack);
>     ......
> }

We can simplify this by deleting first and then checking whether 'func_stack' is empty.

Suggested-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
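The "delete first, then check for empty" ordering described above can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct node`, `list_init`, `list_add_front`, `list_del_node`, `list_is_empty`, and `unpatch_top` are hypothetical stand-ins for the kernel's RCU list helpers, and no RCU or locking is modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, standing in for the kernel list API. */
struct node {
    struct node *prev, *next;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add_front(struct node *head, struct node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del_node(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

static bool list_is_empty(const struct node *head)
{
    return head->next == head;
}

/*
 * Remove 'func' from the stack unconditionally, then branch on emptiness.
 * Returns the new top of the stack, or NULL if the stack became empty
 * (the caller would then restore the original code).
 */
static struct node *unpatch_top(struct node *head, struct node *func)
{
    list_del_node(func);
    if (list_is_empty(head))
        return NULL;
    return head->next;
}
```

With this ordering the deletion appears once instead of in both branches, which is the duplication the commit message points out.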
-
Committed by Zheng Yejian
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I53WZ9

--------------------------------

Structure 'arch_klp_data' contains fields that save the instructions of a function before patching. On arm, they are 'old_insns' and 'old_insn' (depending on whether CONFIG_ARM_MODULE_PLTS is enabled):

    struct arch_klp_data {
    #ifdef CONFIG_ARM_MODULE_PLTS
        u32 old_insns[LJMP_INSN_SIZE];
    #else
        u32 old_insn;
    #endif
    };

We can use the array 'old_insns' in place of 'old_insn' so that the structure no longer depends on CONFIG_ARM_MODULE_PLTS. A similar scenario exists on arm64, so we apply the same optimization there.

Suggested-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
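A minimal sketch of the unified layout described above, assuming a user-space build: the structure always carries the array, and a short jump simply uses fewer entries. The value of `LJMP_INSN_SIZE` and the helper `save_old_insns` are assumptions made for illustration, not taken from the kernel source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LJMP_INSN_SIZE 4 /* assumed value, for illustration only */

/* Single layout, independent of any MODULE_PLTS configuration option. */
struct arch_klp_data {
    uint32_t old_insns[LJMP_INSN_SIZE];
};

/*
 * Save 'count' original instructions (hypothetical helper).
 * Returns 0 on success, -1 if 'count' does not fit in the array.
 */
static int save_old_insns(struct arch_klp_data *data,
                          const uint32_t *code, size_t count)
{
    if (count > LJMP_INSN_SIZE)
        return -1;
    memset(data->old_insns, 0, sizeof(data->old_insns));
    memcpy(data->old_insns, code, count * sizeof(code[0]));
    return 0;
}
```

The trade-off is a few unused words when only one instruction is saved, in exchange for removing the `#ifdef` from every user of the structure.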
-
Committed by Zheng Yejian
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I53WZ9

--------------------------------

Before commit ec7ce700674f ("[Huawei] livepatch: put memory alloc and free out stop machine"), the procedure for restoring the code of the old function in 'arch_klp_unpatch_func' was:
1. copy the old code saved in func_node into the array 'old_insns';
2. free the memory of func_node;
3. patch the text with the old code in the array 'old_insns'.

After that commit, the freeing of func_node's memory in step 2 is done after 'arch_klp_unpatch_func' succeeds. The copy of the old code in step 1 is therefore redundant, so we remove it.

Suggested-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zheng Yejian
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I53WZ9

--------------------------------

The code that patches text in 'arch_klp_patch_func' and 'arch_klp_unpatch_func' is duplicated, so we consolidate it.

In addition, on arm/arm64 the case where the 'offset' between the pc and the new function address is out of the valid range is NOT considered if MODULE_PLTS is not enabled (CONFIG_ARM_MODULE_PLTS on arm, CONFIG_ARM64_MODULE_PLTS on arm64). We fix it by always checking that 'offset'.

Fixes: 2fa9f353 livepatch/arm: Support livepatch without ftrace
Fixes: e429c61d livepatch/arm64: Support livepatch without ftrace
Suggested-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
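The range check on 'offset' mentioned above can be sketched as follows for the arm64 case, where a direct branch reaches roughly 128M in either direction. This is an assumption-laden illustration: `BRANCH_RANGE` and `offset_in_branch_range` are hypothetical names, and the arm range (which is different) is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed arm64 direct-branch reach: +/-128M around the branch instruction. */
#define BRANCH_RANGE (128LL * 1024 * 1024)

/*
 * Return true if a direct branch at 'pc' can reach 'target'
 * (hypothetical helper). When this returns false, a long jump
 * sequence is needed instead of a single branch.
 */
static bool offset_in_branch_range(uint64_t pc, uint64_t target)
{
    int64_t offset = (int64_t)(target - pc);

    return offset >= -BRANCH_RANGE && offset < BRANCH_RANGE;
}
```

Performing this check unconditionally, rather than only when a PLT option is enabled, is the behavior the commit message describes.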
-
Committed by Zhang Jian
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I53VVE
CVE: NA

-------------------------------------------------

Collect the processes that have the page mapped via collect_procs().

@page: if the page is part of a hugepage/compound page, we must use compound_head() to find its head page to prevent a kernel panic, and the page must be locked.

@to_kill: the function returns a linked list; once we have finished using this list, we must kfree it.

@force_early: to find all processes, this must be true. If it is false, the function only returns the processes that have the PF_MCE_PROCESS or PF_MCE_EARLY mark.

Limits: if force_early is true, sysctl_memory_failure_early_kill has no effect. If it is false, no process has the PF_MCE_PROCESS and PF_MCE_EARLY flags, and sysctl_memory_failure_early_kill is enabled, the function returns all tasks regardless of whether they have the PF_MCE_PROCESS and PF_MCE_EARLY flags.

Signed-off-by: Zhang Jian <zhangjian210@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Paul E. McKenney
mainline inclusion
from mainline-v5.17-rc1
commit 147f04b1
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I53ILL
CVE: NA

-------------------------------------------------------------------------

If an RCU expedited grace period starts just when a CPU is in the process of going offline, so that the outgoing CPU has completed its pass through stop-machine but has not yet completed its final dive into the idle loop, RCU will attempt to enable that CPU's scheduling-clock tick via a call to tick_dep_set_cpu(). For this to happen, that CPU has to have been online when the expedited grace period completed its CPU-selection phase.

This is pointless: The outgoing CPU has interrupts disabled, so it cannot take a scheduling-clock tick anyway. In addition, the tick_dep_set_cpu() function's eventual call to irq_work_queue_on() will splat as follows:

smpboot: CPU 1 is now offline
WARNING: CPU: 6 PID: 124 at kernel/irq_work.c:95 +irq_work_queue_on+0x57/0x60
Modules linked in:
CPU: 6 PID: 124 Comm: kworker/6:2 Not tainted 5.15.0-rc1+ #3
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS +rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
Workqueue: rcu_gp wait_rcu_exp_gp
RIP: 0010:irq_work_queue_on+0x57/0x60
Code: 8b 05 1d c7 ea 62 a9 00 00 f0 00 75 21 4c 89 ce 44 89 c7 e8 +9b 37 fa ff ba 01 00 00 00 89 d0 c3 4c 89 cf e8 3b ff ff ff eb ee <0f> 0b eb b7 +0f 0b eb db 90 48 c7 c0 98 2a 02 00 65 48 03 05 91 6f
RSP: 0000:ffffb12cc038fe48 EFLAGS: 00010282
RAX: 0000000000000001 RBX: 0000000000005208 RCX: 0000000000000020
RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff9ad01f45a680
RBP: 000000000004c990 R08: 0000000000000001 R09: ffff9ad01f45a680
R10: ffffb12cc0317db0 R11: 0000000000000001 R12: 00000000fffecee8
R13: 0000000000000001 R14: 0000000000026980 R15: ffffffff9e53ae00
FS: 0000000000000000(0000) GS:ffff9ad01f580000(0000) +knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000de0c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 tick_nohz_dep_set_cpu+0x59/0x70
 rcu_exp_wait_wake+0x54e/0x870
 ? sync_rcu_exp_select_cpus+0x1fc/0x390
 process_one_work+0x1ef/0x3c0
 ? process_one_work+0x3c0/0x3c0
 worker_thread+0x28/0x3c0
 ? process_one_work+0x3c0/0x3c0
 kthread+0x115/0x140
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x22/0x30
---[ end trace c5bf75eb6aa80bc6 ]---

This commit therefore avoids invoking tick_dep_set_cpu() on offlined CPUs to limit both futility and false-positive splats.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Eric Dumazet
stable inclusion
from stable-v5.10.97
commit 176356550cedc166f23a9ec43e4b95bc224a6313
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=176356550cedc166f23a9ec43e4b95bc224a6313

--------------------------------

commit b67985be upstream.

tcp_shift_skb_data() might collapse three packets into a larger one.

P_A, P_B, P_C -> P_ABC

Historically, it used a single tcp_skb_can_collapse_to(P_A) call, because it was enough.

In commit 85712484 ("tcp: coalesce/collapse must respect MPTCP extensions"), this call was replaced by a call to tcp_skb_can_collapse(P_A, P_B).

But the now needed test over P_C has been missed.

This probably broke MPTCP.

Then later, commit 9b65b17d ("net: avoid double accounting for pure zerocopy skbs") added an extra condition to tcp_skb_can_collapse(), but the missing call from tcp_shift_skb_data() is also breaking TCP zerocopy, because P_A and P_C might have different skb_zcopy_pure() status.

Fixes: 85712484 ("tcp: coalesce/collapse must respect MPTCP extensions")
Fixes: 9b65b17d ("net: avoid double accounting for pure zerocopy skbs")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
Cc: Talal Ahmad <talalahmad@google.com>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220201184640.756716-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Eric Dumazet
stable inclusion
from stable-v5.10.97
commit 32e179971085832f5335e308774a04dd1147a316
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=32e179971085832f5335e308774a04dd1147a316

--------------------------------

commit e42e70ad upstream.

When packet_setsockopt( PACKET_FANOUT_DATA ) reads po->fanout, no lock is held, meaning that another thread can change po->fanout.

Given that po->fanout can only be set once during the socket lifetime (it is only cleared from fanout_release()), we can use READ_ONCE()/WRITE_ONCE() to document the race.

BUG: KCSAN: data-race in packet_setsockopt / packet_setsockopt

write to 0xffff88813ae8e300 of 8 bytes by task 14653 on cpu 0:
 fanout_add net/packet/af_packet.c:1791 [inline]
 packet_setsockopt+0x22fe/0x24a0 net/packet/af_packet.c:3931
 __sys_setsockopt+0x209/0x2a0 net/socket.c:2180
 __do_sys_setsockopt net/socket.c:2191 [inline]
 __se_sys_setsockopt net/socket.c:2188 [inline]
 __x64_sys_setsockopt+0x62/0x70 net/socket.c:2188
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88813ae8e300 of 8 bytes by task 14654 on cpu 1:
 packet_setsockopt+0x691/0x24a0 net/packet/af_packet.c:3935
 __sys_setsockopt+0x209/0x2a0 net/socket.c:2180
 __do_sys_setsockopt net/socket.c:2191 [inline]
 __se_sys_setsockopt net/socket.c:2188 [inline]
 __x64_sys_setsockopt+0x62/0x70 net/socket.c:2188
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000000000000000 -> 0xffff888106f8c000

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 14654 Comm: syz-executor.3 Not tainted 5.16.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 47dceb8e ("packet: add classic BPF fanout mode")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220201022358.330621-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
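The READ_ONCE()/WRITE_ONCE() annotation described in this message can be approximated in user space as shown below. The two macros here are simplified stand-ins built on `volatile` and the GCC/Clang `__typeof__` extension, not the kernel's definitions; `struct sock_state`, `attach_fanout`, and `fanout_id_or` are hypothetical names, and writers are assumed to be serialized by a lock that is not shown.

```c
#include <stddef.h>

/* Simplified user-space approximations; the kernel macros are more involved. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct fanout {
    int id;
};

struct sock_state {
    struct fanout *fanout; /* set at most once during the socket lifetime */
};

/* Writer side: publish the pointer once. Returns -1 if it was already set. */
static int attach_fanout(struct sock_state *po, struct fanout *f)
{
    if (READ_ONCE(po->fanout))
        return -1;
    WRITE_ONCE(po->fanout, f);
    return 0;
}

/* Lockless reader: take one snapshot of the pointer and use only that copy. */
static int fanout_id_or(struct sock_state *po, int fallback)
{
    struct fanout *f = READ_ONCE(po->fanout);

    return f ? f->id : fallback;
}
```

The reader works on a single snapshot, so it never observes two different values of the pointer within one call, which is the property the annotation is meant to document.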
-
Committed by Tianchen Ding
stable inclusion
from stable-v5.10.97
commit aa9e96db3121c65f6459912108fe3d3f35eafd62
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=aa9e96db3121c65f6459912108fe3d3f35eafd62

--------------------------------

commit c80d401c upstream.

subparts_cpus should be limited to a subset of cpus_allowed, but it is updated wrongly by using cpumask_andnot(). Use cpumask_and() instead to fix it.

Fixes: ee8dde0c ("cpuset: Add new v2 cpuset.sched.partition flag")
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
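The difference between the two mask operations named above can be shown with plain 64-bit words standing in for CPU masks. This is a sketch only: `mask_and` and `mask_andnot` are hypothetical helpers that mirror the bitwise meaning of cpumask_and() and cpumask_andnot(), and the mask values are made up.

```c
#include <stdint.h>

/* Bits set in both masks: the result is always a subset of 'b'. */
static uint64_t mask_and(uint64_t a, uint64_t b)
{
    return a & b;
}

/* Bits set in 'a' but not in 'b': the result never overlaps 'b'. */
static uint64_t mask_andnot(uint64_t a, uint64_t b)
{
    return a & ~b;
}
```

With subparts = 0xE (CPUs 1-3) and allowed = 0x6 (CPUs 1-2), `mask_and` yields 0x6, which stays within the allowed set, while `mask_andnot` yields 0x8, which lies entirely outside it.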
-
Committed by Eric Dumazet
stable inclusion
from stable-v5.10.97
commit 3bbe2019dd12b8d13671ee6cda055d49637b4c39
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3bbe2019dd12b8d13671ee6cda055d49637b4c39

--------------------------------

commit c6f6f244 upstream.

While looking at one unrelated syzbot bug, I found the replay logic in __rtnl_newlink() to potentially trigger use-after-free.

It is better to clear master_dev and m_ops inside the loop, in case we have to replay it.

Fixes: ba7d49b1 ("rtnetlink: provide api for getting and setting slave info")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20220201012106.216495-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Dan Carpenter
stable inclusion
from stable-v5.10.97
commit 7b4741644cf718c422187e74fb07661ef1d68e85
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7b4741644cf718c422187e74fb07661ef1d68e85

--------------------------------

commit ee125951 upstream.

This code calls fd_install() which gives the userspace access to the fd. Then if copy_info_records_to_user() fails it calls put_unused_fd(fd) but that will not release it and leads to a stale entry in the file descriptor table.

Generally you can't trust the fd after a call to fd_install(). The fix is to delay the fd_install() until everything else has succeeded.

Fortunately it requires CAP_SYS_ADMIN to reach this code so the security impact is less.

Fixes: f644bc44 ("fanotify: fix copy_event_to_user() fid error clean up")
Link: https://lore.kernel.org/r/20220128195656.GA26981@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Shyam Sundar S K
stable inclusion
from stable-v5.10.97
commit 4d3fcfe8464838b3920bc2b939d888e0b792934e
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4d3fcfe8464838b3920bc2b939d888e0b792934e

--------------------------------

commit 5aac9108 upstream.

A BUG_ON() in include/linux/skbuff.h is triggered, leading to an intermittent kernel panic, when an skb length underflow is detected.

Fix this by dropping the packet if such length underflows are seen because of inconsistencies in the hardware descriptors.

Fixes: 622c36f1 ("amd-xgbe: Fix jumbo MTU processing on newer hardware")
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20220127092003.2812745-1-Shyam-sundar.S-k@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Raju Rangoju
stable inclusion
from stable-v5.10.97
commit cadfa7dce526334d7ae1425cdc66c626f8adfbf5
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=cadfa7dce526334d7ae1425cdc66c626f8adfbf5

--------------------------------

commit 7674b7b5 upstream.

Make sure to reset the tx_timer_active flag in xgbe_stop(); otherwise a port restart may result in a tx timeout due to the uncleared flag.

Fixes: c635eaac ("amd-xgbe: Remove Tx coalescing")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20220127060222.453371-1-Raju.Rangoju@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Georgi Valkov
stable inclusion
from stable-v5.10.97
commit 77534b114f240d8a3296cfc576f0608880d2e5ed
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=77534b114f240d8a3296cfc576f0608880d2e5ed

--------------------------------

commit 63e4b45c upstream.

When rx_buf is allocated we need to account for IPHETH_IP_ALIGN, which reduces the usable size by 2 bytes. Otherwise we have 1512 bytes usable instead of 1514, and if we receive more than 1512 bytes, ipheth_rcvbulk_callback is called with status -EOVERFLOW, after which the driver malfunctions and all communication stops.

Resolves ipheth 2-1:4.2: ipheth_rcvbulk_callback: urb status: -75

Fixes: f33d9e2b ("usbnet: ipheth: fix connectivity with iOS 14")
Signed-off-by: Georgi Valkov <gvalkov@abv.bg>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Link: https://lore.kernel.org/all/B60B8A4B-92A0-49B3-805D-809A2433B46C@abv.bg/
Link: https://lore.kernel.org/all/24851bd2769434a5fc24730dce8e8a984c5a4505.1643699778.git.jan.kiszka@siemens.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
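The size arithmetic in this message can be written out as a small sketch. The numbers 1514, 1512, and 2 come from the message; the names `FRAME_MAX`, `IP_ALIGN`, `usable_bytes`, and `rx_alloc_size` are hypothetical and are not the driver's identifiers.

```c
/* Values taken from the commit message; names are illustrative only. */
enum {
    FRAME_MAX = 1514, /* largest frame the receive path must hold */
    IP_ALIGN  = 2     /* bytes reserved at the start of the buffer for alignment */
};

/* Bytes left for frame data once the alignment padding is reserved. */
static int usable_bytes(int alloc_size)
{
    return alloc_size - IP_ALIGN;
}

/* Allocation size that leaves room for a full frame after the padding. */
static int rx_alloc_size(void)
{
    return FRAME_MAX + IP_ALIGN;
}
```

Allocating only `FRAME_MAX` bytes leaves 1512 usable, which is the overflow case described; allocating `FRAME_MAX + IP_ALIGN` leaves the full 1514.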
-
Committed by Maor Dickman
stable inclusion
from stable-v5.10.97
commit b4ced7a46d9f51d3b48ad7c024da288723afacaf
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b4ced7a46d9f51d3b48ad7c024da288723afacaf

--------------------------------

commit d8e5883d upstream.

The variable modact is not initialized before it is used in the modify header allocation command, which can cause the command to fail.

Fix by initializing modact with zeros.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 8f1e0b97 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Maher Sanalla
stable inclusion
from stable-v5.10.97
commit 502c37b033fab7cde3e95a570af4f073306be45e
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=502c37b033fab7cde3e95a570af4f073306be45e

--------------------------------

commit 3c5193a8 upstream.

Substitute del_timer() with del_timer_sync() in the fw reset polling deactivation flow, in order to prevent a race condition which occurs when del_timer() is called and the timer is deactivated while another process is handling the timer interrupt. This situation led to the following call trace:

RIP: 0010:run_timer_softirq+0x137/0x420
<IRQ>
 recalibrate_cpu_khz+0x10/0x10
 ktime_get+0x3e/0xa0
 ? sched_clock_cpu+0xb/0xc0
 __do_softirq+0xf5/0x2ea
 irq_exit_rcu+0xc1/0xf0
 sysvec_apic_timer_interrupt+0x9e/0xc0
 asm_sysvec_apic_timer_interrupt+0x12/0x20
</IRQ>

Fixes: 38b9f903 ("net/mlx5: Handle sync reset request event")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Maor Dickman
stable inclusion
from stable-v5.10.97
commit a01ee1b8165f4161459b5ec4e728bc7130fe8cd4
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a01ee1b8165f4161459b5ec4e728bc7130fe8cd4

--------------------------------

commit ec41332e upstream.

The current implementation of the bond netevent handler only checks whether the handled netdev is a VF representor; it is missing a check that the VF representor is on the same phys device as the bond handling the netevent.

Fix by adding the missing check and optimizing the check for whether the netdev is a VF representor, so that it does not access uninitialized private data and crash.

BUG: kernel NULL pointer dereference, address: 000000000000036c
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
Workqueue: eth3bond0 bond_mii_monitor [bonding]
RIP: 0010:mlx5e_is_uplink_rep+0xc/0x50 [mlx5_core]
RSP: 0018:ffff88812d69fd60 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8881cf800000 RCX: 0000000000000000
RDX: ffff88812d69fe10 RSI: 000000000000001b RDI: ffff8881cf800880
RBP: ffff8881cf800000 R08: 00000445cabccf2b R09: 0000000000000008
R10: 0000000000000004 R11: 0000000000000008 R12: ffff88812d69fe10
R13: 00000000fffffffe R14: ffff88820c0f9000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88846fb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000036c CR3: 0000000103d80006 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 mlx5e_eswitch_uplink_rep+0x31/0x40 [mlx5_core]
 mlx5e_rep_is_lag_netdev+0x94/0xc0 [mlx5_core]
 mlx5e_rep_esw_bond_netevent+0xeb/0x3d0 [mlx5_core]
 raw_notifier_call_chain+0x41/0x60
 call_netdevice_notifiers_info+0x34/0x80
 netdev_lower_state_changed+0x4e/0xa0
 bond_mii_monitor+0x56b/0x640 [bonding]
 process_one_work+0x1b9/0x390
 worker_thread+0x4d/0x3d0
 ? rescuer_thread+0x350/0x350
 kthread+0x124/0x150
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30

Fixes: 7e51891a ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Maxime Ripard
stable inclusion
from stable-v5.10.97
commit ac4ba79bb02881ed714adaa89faee601a18bff6d
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ac4ba79bb02881ed714adaa89faee601a18bff6d

--------------------------------

Commit 20b0dfa8 upstream.

The original commit depended on a rework commit (724fc856 ("drm/vc4: hdmi: Split the CEC disable / enable functions in two")) that (rightfully) didn't reach stable.

However, probably because the context changed, when the patch was applied to stable the pm_runtime_put call got moved from the end of the vc4_hdmi_cec_adap_enable function (that would have become vc4_hdmi_cec_disable with the rework) to vc4_hdmi_cec_init.

This means that at probe time, we now drop our reference to the clocks and power domains and thus end up with a CPU hang when the CPU tries to access registers.

The call to pm_runtime_resume_and_get() is also problematic since the .adap_enable CEC hook is called both to enable and to disable the controller. That means that we'll now call pm_runtime_resume_and_get() at disable time as well, messing with the reference counting.

The behaviour we should have though would be to have pm_runtime_resume_and_get() called when the CEC controller is enabled, and pm_runtime_put when it's disabled.

We need to move things around a bit to behave that way, but it aligns stable with upstream.

Cc: <stable@vger.kernel.org> # 5.10.x
Cc: <stable@vger.kernel.org> # 5.15.x
Cc: <stable@vger.kernel.org> # 5.16.x
Reported-by: Michael Stapelberg <michael+drm@stapelberg.ch>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-