- 27 12月, 2019 40 次提交
-
-
由 Peng Sun 提交于
mainline inclusion from mainline-5.0 commit 781e62823cb81b972dc8652c1827205cda2ac9ac category: bugfix bugzilla: 11101 CVE: NA ------------------------------------------------- In bpf/syscall.c, bpf_map_get_fd_by_id() use bpf_map_inc_not_zero() to increase the refcount, both map->refcnt and map->usercnt. Then, if bpf_map_new_fd() fails, should handle map->usercnt too. Fixes: bd5f5f4e ("bpf: Add BPF_MAP_GET_FD_BY_ID") Signed-off-by: NPeng Sun <sironhide0null@gmail.com> Acked-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> (cherry picked from commit 781e62823cb81b972dc8652c1827205cda2ac9ac) Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Greg Kroah-Hartman 提交于
[ Upstream commit 2c1cf00eeacb784781cf1c9896b8af001246d339 ] If create_buf_file() returns an error, don't try to reference it later as a valid dentry pointer. This problem was exposed when debugfs started to return errors instead of just NULL for some calls when they do not succeed properly. Also, the check for WARN_ON(dentry) was just wrong :) Reported-by: NKees Cook <keescook@chromium.org> Reported-and-tested-by: syzbot+16c3a70e1e9b29346c43@syzkaller.appspotmail.com Reported-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Fixes: ff9fb72bc077 ("debugfs: return error values, not NULL") Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Stephane Eranian 提交于
[ Upstream commit 1a51c5da5acc6c188c917ba572eebac5f8793432 ] The perf_proc_update_handler() handles /proc/sys/kernel/perf_event_max_sample_rate syctl variable. When the PMU IRQ handler timing monitoring is disabled, i.e, when /proc/sys/kernel/perf_cpu_time_max_percent is equal to 0 or 100, then no modification to sysctl_perf_event_sample_rate is allowed to prevent possible hang from wrong values. The problem is that the test to prevent modification is made after the sysctl variable is modified in perf_proc_update_handler(). You get an error: $ echo 10001 >/proc/sys/kernel/perf_event_max_sample_rate echo: write error: invalid argument But the value is still modified causing all sorts of inconsistencies: $ cat /proc/sys/kernel/perf_event_max_sample_rate 10001 This patch fixes the problem by moving the parsing of the value after the test. Committer testing: # echo 100 > /proc/sys/kernel/perf_cpu_time_max_percent # echo 10001 > /proc/sys/kernel/perf_event_max_sample_rate -bash: echo: write error: Invalid argument # cat /proc/sys/kernel/perf_event_max_sample_rate 10001 # Signed-off-by: NStephane Eranian <eranian@google.com> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1547169436-6266-1-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Daniel Borkmann 提交于
commit 3612af783cf52c74a031a2f11b82247b2599d3cd upstream. Marek reported that he saw an issue with the below snippet in that timing measurements where off when loaded as unpriv while results were reasonable when loaded as privileged: [...] uint64_t a = bpf_ktime_get_ns(); uint64_t b = bpf_ktime_get_ns(); uint64_t delta = b - a; if ((int64_t)delta > 0) { [...] Turns out there is a bug where a corner case is missing in the fix d3bd7413e0ca ("bpf: fix sanitation of alu op with pointer / scalar type from different paths"), namely fixup_bpf_calls() only checks whether aux has a non-zero alu_state, but it also needs to test for the case of BPF_ALU_NON_POINTER since in both occasions we need to skip the masking rewrite (as there is nothing to mask). Fixes: d3bd7413e0ca ("bpf: fix sanitation of alu op with pointer / scalar type from different paths") Reported-by: NMarek Majkowski <marek@cloudflare.com> Reported-by: NArthur Fabre <afabre@cloudflare.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/netdev/CAJPywTJqP34cK20iLM5YmUMz9KXQOdu1-+BZrGMAGgLuBWz7fg@mail.gmail.com/T/Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Pavel Tikhomirov 提交于
commit 6a072128d262d2b98d31626906a96700d1fc11eb upstream. Then tracing syscall exit event it is extremely useful to filter exit codes equal to some negative value, to react only to required errors. But negative numbers does not work: [root@snorch sys_exit_read]# echo "ret == -1" > filter bash: echo: write error: Invalid argument [root@snorch sys_exit_read]# cat filter ret == -1 ^ parse_error: Invalid value (did you forget quotes)? Similar thing happens when setting triggers. These is a regression in v4.17 introduced by the commit mentioned below, testing without these commit shows no problem with negative numbers. Link: http://lkml.kernel.org/r/20180823102534.7642-1-ptikhomirov@virtuozzo.com Cc: stable@vger.kernel.org Fixes: 80765597 ("tracing: Rewrite filter logic to be simpler and faster") Signed-off-by: NPavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Alexander Shishkin 提交于
euler inclusion category: bugfix bugzilla: 9513/11006/11050 CVE: NA -------------------------------------------------- [ Cheng Jian HULK-Syzkaller reported a problem which has been reported to mainline(lkml) by syzbot early, this patch comes from the reply form lkml. v1 https://lkml.org/lkml/2019/2/28/529 v2 https://lkml.org/lkml/2019/3/8/206 we merged v1 first but cause bugzilla #11050, it was because : we also use perf_remove_from_context() in perf_event_open() when we move events from a SW context to a HW context, so we can't destroy the event here. now v2 will not exhibit that warning. it's same to another patch at https://lkml.org/lkml/2019/3/8/536. but more clear than it.] First, we have a race between perf_event_release_kernel() and perf_free_event(), which happens when parent's event is released while the child's fork fails (because of a fatal signal, for example), that looks like this: cpu X cpu Y ----- ----- copy_process() error path perf_release(parent) +->perf_event_free_task() +-> lock(child_ctx->mutex) | | +-> remove_from_context(child) | | +-> unlock(child_ctx->mutex) | | | | +-> lock(child_ctx->mutex) | | +-> unlock(child_ctx->mutex) | +-> free_task(child_task) +-> put_task_struct(child_task) Technically, we're still holding a reference to the task via parent->hw.target, that's not stopping free_task(), so we end up poking at free'd memory, as is pointed out by KASAN in the syzkaller report (see Link below). The straightforward fix is to drop the hw.target reference while the task is still around. Therein lies the second problem: the users of hw.target (uprobe) assume that it's around at ->destroy() callback time, where they use it for context. So, in order to not break the uprobe teardown and avoid leaking stuff, we need to call ->destroy() at the same time. This patch fixes the race and the subsequent fallout by doing both these things at remove_from_context time. Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://syzkaller.appspot.com/bug?extid=a24c397a29ad22d86c98Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Andrea Arcangeli 提交于
euler inclusion category: bugfix bugzilla: 10989 CVE: NA ------------------------------------------------ MEMCG depends on the task structure not to be freed under rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences mm->owner. A better fix would be to avoid registering forked vmas in userfaultfd contexts reported to the monitor, if case fork ends up failing. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Signed-off-by: Nzhong jiang <zhongjiang@huawei.com> Reviewed-by: NMiao Xie <miaoxie@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Cheng Jian 提交于
euler inclusion category: bugfix bugzilla: 9513/11006 CVE: NA -------------------------------------------------- This reverts commit b772baf9a14ab4975e8884a399a4e0bab2fb6bf9. we merge the patch b772baf9a14a ("perf: Paper over the hw.target problems") to reslove an use-after-free issue (bugzilla #9513/#11006). but it cause some new problem (bugzilla #11050/#11049) in this version. So just revert it. Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Xiongfeng Wang 提交于
euler inclusion category: feature Bugzilla: 10876 CVE: N/A ---------------------------------------- When I ran Syzkaller testsuite, I got the following call trace. Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> ================================================================================ UBSAN: Undefined behaviour in ./include/linux/time64.h:120:27 signed integer overflow: 8243129037239968815 * 1000000000 cannot be represented in type 'long long int' CPU: 5 PID: 28854 Comm: syz-executor.1 Not tainted 4.19.24 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 handle_overflow+0x193/0x1e2 lib/ubsan.c:190 timespec64_to_ns include/linux/time64.h:120 [inline] posix_cpu_timer_set+0x95a/0xb70 kernel/time/posix-cpu-timers.c:687 do_timer_settime+0x198/0x2a0 kernel/time/posix-timers.c:892 __do_sys_timer_settime kernel/time/posix-timers.c:918 [inline] __se_sys_timer_settime kernel/time/posix-timers.c:904 [inline] __x64_sys_timer_settime+0x18d/0x260 kernel/time/posix-timers.c:904 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462eb9 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f14e4127c58 EFLAGS: 00000246 ORIG_RAX: 00000000000000df RAX: ffffffffffffffda RBX: 000000000073bfa0 RCX: 0000000000462eb9 RDX: 0000000020000080 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f14e41286bc R13: 00000000004c54cc R14: 0000000000704278 R15: 00000000ffffffff ================================================================================ It is because 'it_interval.tv_sec' is larger than 'KTIME_SEC_MAX' and 'it_interval.tv_sec * NSEC_PER_SEC' overflows in 'timespec64_to_ns()'. This patch checks whether 'it_interval.tv_sec' is larger than 'KTIME_SEC_MAX' and saturate if that is the case. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
euler inclusion category: feature Bugzilla: 11009 CVE: N/A ---------------------------------------- When I ran Syzkaller testsuite, I got the following call trace. Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> ================================================================================ UBSAN: Undefined behaviour in kernel/time/ntp.c:457:16 signed integer overflow: 9223372036854775807 + 500 cannot be represented in type 'long int' CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.25-dirty #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 handle_overflow+0x193/0x1e2 lib/ubsan.c:190 second_overflow+0x403/0x540 kernel/time/ntp.c:457 accumulate_nsecs_to_secs kernel/time/timekeeping.c:2002 [inline] logarithmic_accumulation kernel/time/timekeeping.c:2046 [inline] timekeeping_advance+0x2bb/0xec0 kernel/time/timekeeping.c:2114 tick_do_update_jiffies64.part.2+0x1a0/0x350 kernel/time/tick-sched.c:97 tick_do_update_jiffies64 kernel/time/tick-sched.c:1229 [inline] tick_nohz_update_jiffies kernel/time/tick-sched.c:499 [inline] tick_nohz_irq_enter kernel/time/tick-sched.c:1232 [inline] tick_irq_enter+0x1fd/0x240 kernel/time/tick-sched.c:1249 irq_enter+0xc4/0x100 kernel/softirq.c:353 entering_irq arch/x86/include/asm/apic.h:517 [inline] entering_ack_irq arch/x86/include/asm/apic.h:523 [inline] smp_apic_timer_interrupt+0x20/0x480 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:864 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58 Code: 01 f0 0f 82 bc fd ff ff 48 c7 c7 c0 21 b1 83 e8 a1 0a 02 ff e9 ab fd ff ff 4c 89 e7 e8 77 b6 a5 fe e9 6a ff ff ff 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffff888106307d20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000007 RBX: dffffc0000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881062e4f1c RBP: 0000000000000003 R08: ffffed107c5dc77b R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff848c78a0 R13: 0000000000000003 R14: 1ffff11020c60fae R15: 0000000000000000 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline] default_idle+0x24/0x2b0 arch/x86/kernel/process.c:561 cpuidle_idle_call kernel/sched/idle.c:153 [inline] do_idle+0x2ca/0x420 kernel/sched/idle.c:262 cpu_startup_entry+0xcb/0xe0 kernel/sched/idle.c:368 start_secondary+0x421/0x570 arch/x86/kernel/smpboot.c:271 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243 ================================================================================ It is because time_maxerror is set as 0x7FFFFFFFFFFFFFFF by user. It overflows when we add it with 'MAXFREQ / NSEC_PER_USEC' in 'second_overflow()'. This patch add a limit check and saturate it when the user set 'time_maxerror'. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alexei Starovoitov 提交于
mainline inclusion from mainline-5.0-rc8 commit 3defaf2f15b2 category: bugfix bugzilla: 10760 CVE: NA ------------------------------------------------- Lockdep warns about false positive: [ 11.211460] ------------[ cut here ]------------ [ 11.211936] DEBUG_LOCKS_WARN_ON(depth <= 0) [ 11.211985] WARNING: CPU: 0 PID: 141 at ../kernel/locking/lockdep.c:3592 lock_release+0x1ad/0x280 [ 11.213134] Modules linked in: [ 11.214954] RIP: 0010:lock_release+0x1ad/0x280 [ 11.223508] Call Trace: [ 11.223705] <IRQ> [ 11.223874] ? __local_bh_enable+0x7a/0x80 [ 11.224199] up_read+0x1c/0xa0 [ 11.224446] do_up_read+0x12/0x20 [ 11.224713] irq_work_run_list+0x43/0x70 [ 11.225030] irq_work_run+0x26/0x50 [ 11.225310] smp_irq_work_interrupt+0x57/0x1f0 [ 11.225662] irq_work_interrupt+0xf/0x20 since rw_semaphore is released in a different task vs task that locked the sema. It is expected behavior. Fix the warning with up_read_non_owner() and rwsem_release() annotation. Fixes: bae77c5e ("bpf: enable stackmap with build_id in nmi context") Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alexander Shishkin 提交于
euler inclusion category: bugfix bugzilla: 9513/11006 CVE: NA -------------------------------------------------- [ Cheng Jian HULK-Syzkaller reported a problem which has been reported to mainline(lkml) by syzbot early, this patch comes from the reply form lkml. https://lkml.org/lkml/2019/2/28/529 ] First, we have a race between perf_event_release_kernel() and perf_free_event(), which happens when parent's event is released while the child's fork fails (because of a fatal signal, for example), that looks like this: cpu X cpu Y ----- ----- copy_process() error path perf_release(parent) +->perf_event_free_task() +-> lock(child_ctx->mutex) | | +-> remove_from_context(child) | | +-> unlock(child_ctx->mutex) | | | | +-> lock(child_ctx->mutex) | | +-> unlock(child_ctx->mutex) | +-> free_task(child_task) +-> put_task_struct(child_task) Technically, we're still holding a reference to the task via parent->hw.target, that's not stopping free_task(), so we end up poking at free'd memory, as is pointed out by KASAN in the syzkaller report (see Link below). The straightforward fix is to drop the hw.target reference while the task is still around. Therein lies the second problem: the users of hw.target (uprobe) assume that it's around at ->destroy() callback time, where they use it for context. So, in order to not break the uprobe teardown and avoid leaking stuff, we need to call ->destroy() at the same time. This patch fixes the race and the subsequent fallout by doing both these things at remove_from_context time. Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://syzkaller.appspot.com/bug?extid=a24c397a29ad22d86c98 Reported-by: syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Peter Zijlstra 提交于
[ Upstream commit 4c4e3731564c8945ac5ac90fc2a1e1f21cb79c92 ] Notable cmpxchg() does not provide ordering when it fails, however wake_q_add() requires ordering in this specific case too. Without this it would be possible for the concurrent wakeup to not observe our prior state. Andrea Parri provided: C wake_up_q-wake_q_add { int next = 0; int y = 0; } P0(int *next, int *y) { int r0; /* in wake_up_q() */ WRITE_ONCE(*next, 1); /* node->next = NULL */ smp_mb(); /* implied by wake_up_process() */ r0 = READ_ONCE(*y); } P1(int *next, int *y) { int r1; /* in wake_q_add() */ WRITE_ONCE(*y, 1); /* wake_cond = true */ smp_mb__before_atomic(); r1 = cmpxchg_relaxed(next, 1, 2); } exists (0:r0=0 /\ 1:r1=0) This "exists" clause cannot be satisfied according to the LKMM: Test wake_up_q-wake_q_add Allowed States 3 0:r0=0; 1:r1=1; 0:r0=1; 1:r1=0; 0:r0=1; 1:r1=1; No Witnesses Positive: 0 Negative: 3 Condition exists (0:r0=0 /\ 1:r1=0) Observation wake_up_q-wake_q_add Never 0 3 Reported-by: NYongji Xie <elohimes@gmail.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Srinivas Ramana 提交于
[ Upstream commit bddda606ec76550dd63592e32a6e87e7d32583f7 ] If all CPUs in the irq_default_affinity mask are offline when an interrupt is initialized then irq_setup_affinity() can set an empty affinity mask for a newly allocated interrupt. Fix this by falling back to cpu_online_mask in case the resulting affinity mask is zero. Signed-off-by: NSrinivas Ramana <sramana@codeaurora.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: linux-arm-msm@vger.kernel.org Link: https://lkml.kernel.org/r/1545312957-8504-1-git-send-email-sramana@codeaurora.orgSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Long Li 提交于
[ Upstream commit e8da8794a7fd9eef1ec9a07f0d4897c68581c72b ] On large systems with multiple devices of the same class (e.g. NVMe disks, using managed interrupts), the kernel can affinitize these interrupts to a small subset of CPUs instead of spreading them out evenly. irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask of possible target CPUs which has the lowest number of interrupt vectors allocated. This is done by searching the CPU with the highest number of available vectors. While this is correct for non-managed CPUs it can select the wrong CPU for managed interrupts. Under certain constellations this results in affinitizing the managed interrupts of several devices to a single CPU in a set. The book keeping of available vectors works the following way: 1) Non-managed interrupts: available is decremented when the interrupt is actually requested by the device driver and a vector is assigned. It's incremented when the interrupt and the vector are freed. 2) Managed interrupts: Managed interrupts guarantee vector reservation when the MSI/MSI-X functionality of a device is enabled, which is achieved by reserving vectors in the bitmaps of the possible target CPUs. This reservation decrements the available count on each possible target CPU. When the interrupt is requested by the device driver then a vector is allocated from the reserved region. The operation is reversed when the interrupt is freed by the device driver. Neither of these operations affect the available count. The reservation persist up to the point where the MSI/MSI-X functionality is disabled and only this operation increments the available count again. For non-managed interrupts the available count is the correct selection criterion because the guaranteed reservations need to be taken into account. Using the allocated counter could lead to a failing allocation in the following situation (total vector space of 10 assumed): CPU0 CPU1 available: 2 0 allocated: 5 3 <--- CPU1 is selected, but available space = 0 managed reserved: 3 7 while available yields the correct result. For managed interrupts the available count is not the appropriate selection criterion because as explained above the available count is not affected by the actual vector allocation. The following example illustrates that. Total vector space of 10 assumed. The starting point is: CPU0 CPU1 available: 5 4 allocated: 2 3 managed reserved: 3 3 Allocating vectors for three non-managed interrupts will result in affinitizing the first two to CPU0 and the third one to CPU1 because the available count is adjusted with each allocation: CPU0 CPU1 available: 5 4 <- Select CPU0 for 1st allocation --> allocated: 3 3 available: 4 4 <- Select CPU0 for 2nd allocation --> allocated: 4 3 available: 3 4 <- Select CPU1 for 3rd allocation --> allocated: 4 4 But the allocation of three managed interrupts starting from the same point will affinitize all of them to CPU0 because the available count is not affected by the allocation (see above). So the end result is: CPU0 CPU1 available: 5 4 allocated: 5 3 Introduce a "managed_allocated" field in struct cpumap to track the vector allocation for managed interrupts separately. Use this information to select the target CPU when a vector is allocated for a managed interrupt, which results in more evenly distributed vector assignments. The above example results in the following allocations: CPU0 CPU1 managed_allocated: 0 0 <- Select CPU0 for 1st allocation --> allocated: 3 3 managed_allocated: 1 0 <- Select CPU1 for 2nd allocation --> allocated: 3 4 managed_allocated: 1 1 <- Select CPU0 for 3rd allocation --> allocated: 4 4 The allocation of non-managed interrupts is not affected by this change and is still evaluating the available count. The overall distribution of interrupt vectors for both types of interrupts might still not be perfectly even depending on the number of non-managed and managed interrupts in a system, but due to the reservation guarantee for managed interrupts this cannot be avoided. Expose the new field in debugfs as well. [ tglx: Clarified the background of the problem in the changelog and described it independent of NVME ] Signed-off-by: NLong Li <longli@microsoft.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Michael Kelley <mikelley@microsoft.com> Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Dou Liyang 提交于
[ Upstream commit 76f99ae5b54d48430d1f0c5512a84da0ff9761e0 ] Linux spreads out the non managed interrupt across the possible target CPUs to avoid vector space exhaustion. Managed interrupts are treated differently, as for them the vectors are reserved (with guarantee) when the interrupt descriptors are initialized. When the interrupt is requested a real vector is assigned. The assignment logic uses the first CPU in the affinity mask for assignment. If the interrupt has more than one CPU in the affinity mask, which happens when a multi queue device has less queues than CPUs, then doing the same search as for non managed interrupts makes sense as it puts the interrupt on the least interrupt plagued CPU. For single CPU affine vectors that's obviously a NOOP. Restructre the matrix allocation code so it does the 'best CPU' search, add the sanity check for an empty affinity mask and adapt the call site in the x86 vector management code. [ tglx: Added the empty mask check to the core and improved change log ] Signed-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Dou Liyang 提交于
[ Upstream commit 8ffe4e61c06a48324cfd97f1199bb9838acce2f2 ] Linux finds the CPU which has the lowest vector allocation count to spread out the non managed interrupts across the possible target CPUs, but does not do so for managed interrupts. Split out the CPU selection code into a helper function for reuse. No functional change. Signed-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20180908175838.14450-1-dou_liyang@163.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
euler inclusion category: feature Bugzilla: 10683 CVE: N/A ---------------------------------------- When I ran Syzkaller testsuite, I got the following call trace. Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> ================================================================================ UBSAN: Undefined behaviour in kernel/time/timekeeping.c:801:8 signed integer overflow: 500152103386 + 9223372036854775807 cannot be represented in type 'long long int' CPU: 6 PID: 13904 Comm: syz-executor.0 Not tainted 4.19.25 #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 handle_overflow+0x193/0x1e2 lib/ubsan.c:190 ktime_get_with_offset+0x26a/0x2d0 kernel/time/timekeeping.c:801 common_hrtimer_arm+0x14d/0x220 kernel/time/posix-timers.c:817 common_timer_set+0x337/0x530 kernel/time/posix-timers.c:863 do_timer_settime+0x198/0x290 kernel/time/posix-timers.c:892 __do_sys_timer_settime kernel/time/posix-timers.c:918 [inline] __se_sys_timer_settime kernel/time/posix-timers.c:904 [inline] __x64_sys_timer_settime+0x18d/0x260 kernel/time/posix-timers.c:904 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462eb9 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f7968072c58 EFLAGS: 00000246 ORIG_RAX: 00000000000000df RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462eb9 RDX: 00000000200000c0 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f79680736bc R13: 00000000004c54cc R14: 0000000000704278 R15: 00000000ffffffff ================================================================================ It it because global variable 'offsets' is set with a very large but still valid value. It overflows when we add 'tk->tkr_mono.base' with 'offsets'. This patch use 'ktime_add_safe()' to limit the result to 'KTIME_SEC_MAX' when it overflows. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawe.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alban Crequy 提交于
mainline inclusion from mainline-5.0 commit 7c0cdf0b3940 category: bugfix bugzilla: 10936 CVE: NA ------------------------------------------------- trie_delete_elem() was deleting an entry even though it was not matching if the prefixlen was correct. This patch adds a check on matchlen. Reproducer: $ sudo bpftool map create /sys/fs/bpf/mylpm type lpm_trie key 8 value 1 entries 128 name mylpm flags 1 $ sudo bpftool map update pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 aa bb cc dd value hex 01 $ sudo bpftool map dump pinned /sys/fs/bpf/mylpm key: 10 00 00 00 aa bb cc dd value: 01 Found 1 element $ sudo bpftool map delete pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 ff ff ff ff $ echo $? 0 $ sudo bpftool map dump pinned /sys/fs/bpf/mylpm Found 0 elements A similar reproducer is added in the selftests. Without the patch: $ sudo ./tools/testing/selftests/bpf/test_lpm_map test_lpm_map: test_lpm_map.c:485: test_lpm_delete: Assertion `bpf_map_delete_elem(map_fd, key) == -1 && errno == ENOENT' failed. Aborted With the patch: test_lpm_map runs without errors. Fixes: e454cf59 ("bpf: Implement map_delete_elem for BPF_MAP_TYPE_LPM_TRIE") Cc: Craig Gallek <kraig@google.com> Signed-off-by: NAlban Crequy <alban@kinvolk.io> Acked-by: NCraig Gallek <kraig@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Eric Dumazet 提交于
mainline inclusion from mainline-5.0 commit 7c0cdf0b3940 category: bugfix bugzilla: 10936 CVE: NA ------------------------------------------------- At LPC 2018 in Vancouver, Vlad Dumitrescu mentioned that longest_prefix_match() has a high cost [1]. One reason for that cost is a loop handling one byte at a time. We can handle more bytes at a time, if enough attention is paid to endianness. I was able to remove ~55 % of longest_prefix_match() cpu costs. [1] https://linuxplumbersconf.org/event/2/contributions/88/attachments/76/87/lpc-bpf-2018-shaping.pdfSigned-off-by: NEric Dumazet <edumazet@google.com> Cc: Vlad Dumitrescu <vladum@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Stanislav Fomichev 提交于
[ Upstream commit 4af396ae4836c4ecab61e975b8e61270c551894d ] When returning BPF_STACK_BUILD_ID_IP from stack_map_get_build_id_offset, make sure that build_id field is empty. Since we are using percpu free list, there is a possibility that we might reuse some previous bpf_stack_build_id with non-zero build_id. Fixes: 615755a7 ("bpf: extend stackmap to save binary_build_id+offset instead of address") Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NStanislav Fomichev <sdf@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Stanislav Fomichev 提交于
[ Upstream commit 0b698005a9d11c0e91141ec11a2c4918a129f703 ] Build-id length is not fixed to 20, it can be (`man ld` /--build-id): * 128-bit (uuid) * 160-bit (sha1) * any length specified in ld --build-id=0xhexstring To fix the issue of missing BPF_STACK_BUILD_ID_VALID for shorter build-ids, assume that build-id is somewhere in the range of 1 .. 20. Set the remaining bytes to zero. v2: * don't introduce new "len = min(BPF_BUILD_ID_SIZE, nhdr->n_descsz)", we already know that nhdr->n_descsz <= BPF_BUILD_ID_SIZE if we enter this 'if' condition Fixes: 615755a7 ("bpf: extend stackmap to save binary_build_id+offset instead of address") Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NStanislav Fomichev <sdf@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Song Liu 提交于
[ Upstream commit beaf3d1901f4ea46fbd5c9d857227d99751de469 ] As Naresh reported, test_stacktrace_build_id() causes panic on i386 and arm32 systems. This is caused by page_address() returns NULL in certain cases. This patch fixes this error by using kmap_atomic/kunmap_atomic instead of page_address. Fixes: 615755a7 (" bpf: extend stackmap to save binary_build_id+offset instead of address") Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Quentin Perret 提交于
commit 9e7382153f80ba45a0bbcd540fb77d4b15f6e966 upstream. The following commit 441dae8f ("tracing: Add support for display of tgid in trace output") removed the call to print_event_info() from print_func_help_header_irq() which results in the ftrace header not reporting the number of entries written in the buffer. As this wasn't the original intent of the patch, re-introduce the call to print_event_info() to restore the orginal behaviour. Link: http://lkml.kernel.org/r/20190214152950.4179-1-quentin.perret@arm.comAcked-by: NJoel Fernandes <joelaf@google.com> Cc: stable@vger.kernel.org Fixes: 441dae8f ("tracing: Add support for display of tgid in trace output") Signed-off-by: NQuentin Perret <quentin.perret@arm.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Eric W. Biederman 提交于
commit cf43a757fd49442bc38f76088b70c2299eed2c2f upstream. In the middle of do_exit() there is there is a call "ptrace_event(PTRACE_EVENT_EXIT, code);" That call places the process in TACKED_TRACED aka "(TASK_WAKEKILL | __TASK_TRACED)" and waits for for the debugger to release the task or SIGKILL to be delivered. Skipping past dequeue_signal when we know a fatal signal has already been delivered resulted in SIGKILL remaining pending and TIF_SIGPENDING remaining set. This in turn caused the scheduler to not sleep in PTACE_EVENT_EXIT as it figured a fatal signal was pending. This also caused ptrace_freeze_traced in ptrace_check_attach to fail because it left a per thread SIGKILL pending which is what fatal_signal_pending tests for. This difference in signal state caused strace to report strace: Exit of unknown pid NNNNN ignored Therefore update the signal handling state like dequeue_signal would when removing a per thread SIGKILL, by removing SIGKILL from the per thread signal mask and clearing TIF_SIGPENDING. Acked-by: NOleg Nesterov <oleg@redhat.com> Reported-by: NOleg Nesterov <oleg@redhat.com> Reported-by: NIvan Delalande <colona@arista.com> Cc: stable@vger.kernel.org Fixes: 35634ffa1751 ("signal: Always notice exiting tasks") Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Andreas Ziegler 提交于
commit 0722069a5374b904ec1a67f91249f90e1cfae259 upstream. When printing multiple uprobe arguments as strings the output for the earlier arguments would also include all later string arguments. This is best explained in an example: Consider adding a uprobe to a function receiving two strings as parameters which is at offset 0xa0 in strlib.so and we want to print both parameters when the uprobe is hit (on x86_64): $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \ /sys/kernel/debug/tracing/uprobe_events When the function is called as func("foo", "bar") and we hit the probe, the trace file shows a line like the following: [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar" Note the extra "bar" printed as part of arg1. This behaviour stacks up for additional string arguments. The strings are stored in a dynamically growing part of the uprobe buffer by fetch_store_string() after copying them from userspace via strncpy_from_user(). The return value of strncpy_from_user() is then directly used as the required size for the string. However, this does not take the terminating null byte into account as the documentation for strncpy_from_user() cleary states that it "[...] returns the length of the string (not including the trailing NUL)" even though the null byte will be copied to the destination. Therefore, subsequent calls to fetch_store_string() will overwrite the terminating null byte of the most recently fetched string with the first character of the current string, leading to the "accumulation" of strings in earlier arguments in the output. Fix this by incrementing the return value of strncpy_from_user() by one if we did not hit the maximum buffer size. Link: http://lkml.kernel.org/r/20190116141629.5752-1-andreas.ziegler@fau.de Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 5baaa59e ("tracing/probes: Implement 'memory' fetch method for uprobes") Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NAndreas Ziegler <andreas.ziegler@fau.de> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Jiri Olsa 提交于
commit 81ec3f3c4c4d78f2d3b6689c9816bfbdf7417dbb upstream. Vince (and later on Ravi) reported crashes in the BTS code during fuzzing with the following backtrace: general protection fault: 0000 [#1] SMP PTI ... RIP: 0010:perf_prepare_sample+0x8f/0x510 ... Call Trace: <IRQ> ? intel_pmu_drain_bts_buffer+0x194/0x230 intel_pmu_drain_bts_buffer+0x160/0x230 ? tick_nohz_irq_exit+0x31/0x40 ? smp_call_function_single_interrupt+0x48/0xe0 ? call_function_single_interrupt+0xf/0x20 ? call_function_single_interrupt+0xa/0x20 ? x86_schedule_events+0x1a0/0x2f0 ? x86_pmu_commit_txn+0xb4/0x100 ? find_busiest_group+0x47/0x5d0 ? perf_event_set_state.part.42+0x12/0x50 ? perf_mux_hrtimer_restart+0x40/0xb0 intel_pmu_disable_event+0xae/0x100 ? intel_pmu_disable_event+0xae/0x100 x86_pmu_stop+0x7a/0xb0 x86_pmu_del+0x57/0x120 event_sched_out.isra.101+0x83/0x180 group_sched_out.part.103+0x57/0xe0 ctx_sched_out+0x188/0x240 ctx_resched+0xa8/0xd0 __perf_event_enable+0x193/0x1e0 event_function+0x8e/0xc0 remote_function+0x41/0x50 flush_smp_call_function_queue+0x68/0x100 generic_smp_call_function_single_interrupt+0x13/0x30 smp_call_function_single_interrupt+0x3e/0xe0 call_function_single_interrupt+0xf/0x20 </IRQ> The reason is that while event init code does several checks for BTS events and prevents several unwanted config bits for BTS event (like precise_ip), the PERF_EVENT_IOC_PERIOD allows to create BTS event without those checks being done. Following sequence will cause the crash: If we create an 'almost' BTS event with precise_ip and callchains, and it into a BTS event it will crash the perf_prepare_sample() function because precise_ip events are expected to come in with callchain data initialized, but that's not the case for intel_pmu_drain_bts_buffer() caller. Adding a check_period callback to be called before the period is changed via PERF_EVENT_IOC_PERIOD. It will deny the change if the event would become BTS. Plus adding also the limit_period check as well. Reported-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20190204123532.GA4794@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ingo Molnar 提交于
commit 528871b456026e6127d95b1b2bd8e3a003dc1614 upstream. The following commit: 9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes") results in perf recording failures with larger mmap areas: root@skl:/tmp# perf record -g -a failed to mmap with 12 (Cannot allocate memory) The root cause is that the following condition is buggy: if (order_base_2(size) >= MAX_ORDER) goto fail; The problem is that @size is in bytes and MAX_ORDER is in pages, so the right test is: if (order_base_2(size) >= PAGE_SHIFT+MAX_ORDER) goto fail; Fix it. Reported-by: N"Jin, Yao" <yao.jin@linux.intel.com> Bisected-by: NBorislav Petkov <bp@alien8.de> Analyzed-by: NPeter Zijlstra <peterz@infradead.org> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Fixes: 9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes") Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Xie XiuQi 提交于
hulk inclusion category: feature feature: performance/latency upstream: never bugzilla: 2680,10641 CVE: NA The pervious patch "kmod: run usermodehelpers only on cpus allowed for kthreadd V2" allow to set usermodehelper thread's affinity. add a interface to disable/enable usermodhelper_affinity. Disabled by default. [XQ: backport patch from euleros 2.x: - kmod.c => umh.c ] Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Christoph Lameter 提交于
hulk inclusion category: feature feature: performance/latency upstream: never bugzilla: 2680,10641 CVE: NA isolate this usermodehelper kernel threads to other cpus, to avoid the latency issue. With this patch, the usermodehelper thread could inherit kthreadd's affinity. For example, if you want to isolate usermodehelper to cpu 1: 1) taskset -cp 1 2 # bind kthreadd task (pid = 2) to cpu 1 2) trigger call usermodhelper threads --------------------------------------------- usermodehelper() threads can currently run on all processors. This is an issue for low latency cores. Spawnig a new thread causes cpu holdoffs in the range of hundreds of microseconds to a few milliseconds. Not good for cores on which processes run that need to react as fast as possible. kthreadd threads can be restricted using taskset to a limited set of processors. Then the kernel thread pool will not fork processes on those anymore thereby protecting those processors from additional latencies. Make usermodehelper() threads obey the limitations that kthreadd is restricted to. Kthreadd is not the parent of usermodehelper threads so we need to explicitly get the allowed processors for kthreadd. Before this patch there is no way to limit the cpus that usermodehelper can run on since the affinity is set when the thread is spawned to all processors. [akpm@linux-foundation.org: set_cpus_allowed() doesn't exist when CONFIG_CPUMASK_OFFSTACK=y] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NChristoph Lameter <cl@linux.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <bitbucket@online.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Gilad Ben-Yossef <gilad@benyossef.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://patchwork.kernel.org/patch/3153671/Reported-and-tested-by: NXiangyou Xie <xiexiangyou@huawei.com> [ 1) kmod.c => umh.c 2) ____call_usermodehelper => call_usermodehelper_exec_async ] Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 zhongjiang 提交于
euler inclusion category: bugfix CVE: NA Bugzilla: 9580 --------------------------- The patch control the open/close of the feature by the /proc/sys/vm/cache_reclaim_enable, hence we can completely close all background thread for user demand. Signed-off-by: Nzhongjiang <zhongjiang@huawei.com> Reviewed-by: NJing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 zhongjiang 提交于
euler inclusion category: bugfix CVE: NA Bugzilla: 9580 --------------------------- Just add Kconfig to the feature. Signed-off-by: Nzhongjiang <zhongjiang@huawei.com> Reviewed-by: NJing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alexei Starovoitov 提交于
mainline inclusion from mainline-5.0 commit a89fac57b5d0 category: bugfix bugzilla: 9352 CVE: NA ------------------------------------------------- Lockdep warns about false positive: [ 12.492084] 00000000e6b28347 (&head->lock){+...}, at: pcpu_freelist_push+0x2a/0x40 [ 12.492696] but this lock was taken by another, HARDIRQ-safe lock in the past: [ 12.493275] (&rq->lock){-.-.} [ 12.493276] [ 12.493276] [ 12.493276] and interrupts could create inverse lock ordering between them. [ 12.493276] [ 12.494435] [ 12.494435] other info that might help us debug this: [ 12.494979] Possible interrupt unsafe locking scenario: [ 12.494979] [ 12.495518] CPU0 CPU1 [ 12.495879] ---- ---- [ 12.496243] lock(&head->lock); [ 12.496502] local_irq_disable(); [ 12.496969] lock(&rq->lock); [ 12.497431] lock(&head->lock); [ 12.497890] <Interrupt> [ 12.498104] lock(&rq->lock); [ 12.498368] [ 12.498368] *** DEADLOCK *** [ 12.498368] [ 12.498837] 1 lock held by dd/276: [ 12.499110] #0: 00000000c58cb2ee (rcu_read_lock){....}, at: trace_call_bpf+0x5e/0x240 [ 12.499747] [ 12.499747] the shortest dependencies between 2nd lock and 1st lock: [ 12.500389] -> (&rq->lock){-.-.} { [ 12.500669] IN-HARDIRQ-W at: [ 12.500934] _raw_spin_lock+0x2f/0x40 [ 12.501373] scheduler_tick+0x4c/0xf0 [ 12.501812] update_process_times+0x40/0x50 [ 12.502294] tick_periodic+0x27/0xb0 [ 12.502723] tick_handle_periodic+0x1f/0x60 [ 12.503203] timer_interrupt+0x11/0x20 [ 12.503651] __handle_irq_event_percpu+0x43/0x2c0 [ 12.504167] handle_irq_event_percpu+0x20/0x50 [ 12.504674] handle_irq_event+0x37/0x60 [ 12.505139] handle_level_irq+0xa7/0x120 [ 12.505601] handle_irq+0xa1/0x150 [ 12.506018] do_IRQ+0x77/0x140 [ 12.506411] ret_from_intr+0x0/0x1d [ 12.506834] _raw_spin_unlock_irqrestore+0x53/0x60 [ 12.507362] __setup_irq+0x481/0x730 [ 12.507789] setup_irq+0x49/0x80 [ 12.508195] hpet_time_init+0x21/0x32 [ 12.508644] x86_late_time_init+0xb/0x16 [ 12.509106] start_kernel+0x390/0x42a [ 12.509554] secondary_startup_64+0xa4/0xb0 [ 12.510034] IN-SOFTIRQ-W at: [ 12.510305] _raw_spin_lock+0x2f/0x40 [ 12.510772] try_to_wake_up+0x1c7/0x4e0 [ 12.511220] swake_up_locked+0x20/0x40 [ 12.511657] swake_up_one+0x1a/0x30 [ 12.512070] rcu_process_callbacks+0xc5/0x650 [ 12.512553] __do_softirq+0xe6/0x47b [ 12.512978] irq_exit+0xc3/0xd0 [ 12.513372] smp_apic_timer_interrupt+0xa9/0x250 [ 12.513876] apic_timer_interrupt+0xf/0x20 [ 12.514343] default_idle+0x1c/0x170 [ 12.514765] do_idle+0x199/0x240 [ 12.515159] cpu_startup_entry+0x19/0x20 [ 12.515614] start_kernel+0x422/0x42a [ 12.516045] secondary_startup_64+0xa4/0xb0 [ 12.516521] INITIAL USE at: [ 12.516774] _raw_spin_lock_irqsave+0x38/0x50 [ 12.517258] rq_attach_root+0x16/0xd0 [ 12.517685] sched_init+0x2f2/0x3eb [ 12.518096] start_kernel+0x1fb/0x42a [ 12.518525] secondary_startup_64+0xa4/0xb0 [ 12.518986] } [ 12.519132] ... key at: [<ffffffff82b7bc28>] __key.71384+0x0/0x8 [ 12.519649] ... acquired at: [ 12.519892] pcpu_freelist_pop+0x7b/0xd0 [ 12.520221] bpf_get_stackid+0x1d2/0x4d0 [ 12.520563] ___bpf_prog_run+0x8b4/0x11a0 [ 12.520887] [ 12.521008] -> (&head->lock){+...} { [ 12.521292] HARDIRQ-ON-W at: [ 12.521539] _raw_spin_lock+0x2f/0x40 [ 12.521950] pcpu_freelist_push+0x2a/0x40 [ 12.522396] bpf_get_stackid+0x494/0x4d0 [ 12.522828] ___bpf_prog_run+0x8b4/0x11a0 [ 12.523296] INITIAL USE at: [ 12.523537] _raw_spin_lock+0x2f/0x40 [ 12.523944] pcpu_freelist_populate+0xc0/0x120 [ 12.524417] htab_map_alloc+0x405/0x500 [ 12.524835] __do_sys_bpf+0x1a3/0x1a90 [ 12.525253] do_syscall_64+0x4a/0x180 [ 12.525659] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 12.526167] } [ 12.526311] ... key at: [<ffffffff838f7668>] __key.13130+0x0/0x8 [ 12.526812] ... acquired at: [ 12.527047] __lock_acquire+0x521/0x1350 [ 12.527371] lock_acquire+0x98/0x190 [ 12.527680] _raw_spin_lock+0x2f/0x40 [ 12.527994] pcpu_freelist_push+0x2a/0x40 [ 12.528325] bpf_get_stackid+0x494/0x4d0 [ 12.528645] ___bpf_prog_run+0x8b4/0x11a0 [ 12.528970] [ 12.529092] [ 12.529092] stack backtrace: [ 12.529444] CPU: 0 PID: 276 Comm: dd Not tainted 5.0.0-rc3-00018-g2fa53f892422 #475 [ 12.530043] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 [ 12.530750] Call Trace: [ 12.530948] dump_stack+0x5f/0x8b [ 12.531248] check_usage_backwards+0x10c/0x120 [ 12.531598] ? ___bpf_prog_run+0x8b4/0x11a0 [ 12.531935] ? mark_lock+0x382/0x560 [ 12.532229] mark_lock+0x382/0x560 [ 12.532496] ? print_shortest_lock_dependencies+0x180/0x180 [ 12.532928] __lock_acquire+0x521/0x1350 [ 12.533271] ? find_get_entry+0x17f/0x2e0 [ 12.533586] ? find_get_entry+0x19c/0x2e0 [ 12.533902] ? lock_acquire+0x98/0x190 [ 12.534196] lock_acquire+0x98/0x190 [ 12.534482] ? pcpu_freelist_push+0x2a/0x40 [ 12.534810] _raw_spin_lock+0x2f/0x40 [ 12.535099] ? pcpu_freelist_push+0x2a/0x40 [ 12.535432] pcpu_freelist_push+0x2a/0x40 [ 12.535750] bpf_get_stackid+0x494/0x4d0 [ 12.536062] ___bpf_prog_run+0x8b4/0x11a0 It has been explained that is a false positive here: https://lkml.org/lkml/2018/7/25/756 Recap: - stackmap uses pcpu_freelist - The lock in pcpu_freelist is a percpu lock - stackmap is only used by tracing bpf_prog - A tracing bpf_prog cannot be run if another bpf_prog has already been running (ensured by the percpu bpf_prog_active counter). Eric pointed out that this lockdep splats stops other legit lockdep splats in selftests/bpf/test_progs.c. Fix this by calling local_irq_save/restore for stackmap. Another false positive had also been worked around by calling local_irq_save in commit 89ad2fa3 ("bpf: fix lockdep splat"). That commit added unnecessary irq_save/restore to fast path of bpf hash map. irqs are already disabled at that point, since htab is holding per bucket spin_lock with irqsave. Let's reduce overhead for htab by introducing __pcpu_freelist_push/pop function w/o irqsave and convert pcpu_freelist_push/pop to irqsave to be used elsewhere (right now only in stackmap). It stops lockdep false positive in stackmap with a bit of acceptable overhead. Fixes: 557c0c6e ("bpf: convert stackmap to pre-allocation") Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Acked-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Martin KaFai Lau 提交于
mainline inclusion from mainline-5.0 commit 7c4cd051add3 category: bugfix bugzilla: 9355 CVE: NA ------------------------------------------------- The map_lookup_elem used to not acquiring spinlock in order to optimize the reader. It was true until commit 557c0c6e ("bpf: convert stackmap to pre-allocation") The syscall's map_lookup_elem(stackmap) calls bpf_stackmap_copy(). bpf_stackmap_copy() may find the elem no longer needed after the copy is done. If that is the case, pcpu_freelist_push() saves this elem for reuse later. This push requires a spinlock. If a tracing bpf_prog got run in the middle of the syscall's map_lookup_elem(stackmap) and this tracing bpf_prog is calling bpf_get_stackid(stackmap) which also requires the same pcpu_freelist's spinlock, it may end up with a dead lock situation as reported by Eric Dumazet in https://patchwork.ozlabs.org/patch/1030266/ The situation is the same as the syscall's map_update_elem() which needs to acquire the pcpu_freelist's spinlock and could race with tracing bpf_prog. Hence, this patch fixes it by protecting bpf_stackmap_copy() with this_cpu_inc(bpf_prog_active) to prevent tracing bpf_prog from running. A later syscall's map_lookup_elem commit f1a2e44a3aec ("bpf: add queue and stack maps") also acquires a spinlock and races with tracing bpf_prog similarly. Hence, this patch is forward looking and protects the majority of the map lookups. bpf_map_offload_lookup_elem() is the exception since it is for network bpf_prog only (i.e. never called by tracing bpf_prog). Fixes: 557c0c6e ("bpf: convert stackmap to pre-allocation") Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Prashant Bhole 提交于
mainline inclusion from mainline-4.20 commit 509db2833e0d category: bugfix bugzilla: 9355 CVE: NA ------------------------------------------------- The error value returned by map_lookup_elem doesn't differentiate whether lookup was failed because of invalid key or lookup is not supported. Lets add handling for -EOPNOTSUPP return value of map_lookup_elem() method of map, with expectation from map's implementation that it should return -EOPNOTSUPP if lookup is not supported. The errno for bpf syscall for BPF_MAP_LOOKUP_ELEM command will be set to EOPNOTSUPP if map lookup is not supported. Signed-off-by: NPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alexei Starovoitov 提交于
mainline inclusion from mainline-5.0 commit e16ec34039c7 category: bugfix bugzilla: 9347 CVE: NA ------------------------------------------------- Lockdep found a potential deadlock between cpu_hotplug_lock, bpf_event_mutex, and cpuctx_mutex: [ 13.007000] WARNING: possible circular locking dependency detected [ 13.007587] 5.0.0-rc3-00018-g2fa53f892422-dirty #477 Not tainted [ 13.008124] ------------------------------------------------------ [ 13.008624] test_progs/246 is trying to acquire lock: [ 13.009030] 0000000094160d1d (tracepoints_mutex){+.+.}, at: tracepoint_probe_register_prio+0x2d/0x300 [ 13.009770] [ 13.009770] but task is already holding lock: [ 13.010239] 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60 [ 13.010877] [ 13.010877] which lock already depends on the new lock. [ 13.010877] [ 13.011532] [ 13.011532] the existing dependency chain (in reverse order) is: [ 13.012129] [ 13.012129] -> #4 (bpf_event_mutex){+.+.}: [ 13.012582] perf_event_query_prog_array+0x9b/0x130 [ 13.013016] _perf_ioctl+0x3aa/0x830 [ 13.013354] perf_ioctl+0x2e/0x50 [ 13.013668] do_vfs_ioctl+0x8f/0x6a0 [ 13.014003] ksys_ioctl+0x70/0x80 [ 13.014320] __x64_sys_ioctl+0x16/0x20 [ 13.014668] do_syscall_64+0x4a/0x180 [ 13.015007] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.015469] [ 13.015469] -> #3 (&cpuctx_mutex){+.+.}: [ 13.015910] perf_event_init_cpu+0x5a/0x90 [ 13.016291] perf_event_init+0x1b2/0x1de [ 13.016654] start_kernel+0x2b8/0x42a [ 13.016995] secondary_startup_64+0xa4/0xb0 [ 13.017382] [ 13.017382] -> #2 (pmus_lock){+.+.}: [ 13.017794] perf_event_init_cpu+0x21/0x90 [ 13.018172] cpuhp_invoke_callback+0xb3/0x960 [ 13.018573] _cpu_up+0xa7/0x140 [ 13.018871] do_cpu_up+0xa4/0xc0 [ 13.019178] smp_init+0xcd/0xd2 [ 13.019483] kernel_init_freeable+0x123/0x24f [ 13.019878] kernel_init+0xa/0x110 [ 13.020201] ret_from_fork+0x24/0x30 [ 13.020541] [ 13.020541] -> #1 (cpu_hotplug_lock.rw_sem){++++}: [ 13.021051] static_key_slow_inc+0xe/0x20 [ 13.021424] tracepoint_probe_register_prio+0x28c/0x300 [ 13.021891] perf_trace_event_init+0x11f/0x250 [ 13.022297] perf_trace_init+0x6b/0xa0 [ 13.022644] perf_tp_event_init+0x25/0x40 [ 13.023011] perf_try_init_event+0x6b/0x90 [ 13.023386] perf_event_alloc+0x9a8/0xc40 [ 13.023754] __do_sys_perf_event_open+0x1dd/0xd30 [ 13.024173] do_syscall_64+0x4a/0x180 [ 13.024519] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.024968] [ 13.024968] -> #0 (tracepoints_mutex){+.+.}: [ 13.025434] __mutex_lock+0x86/0x970 [ 13.025764] tracepoint_probe_register_prio+0x2d/0x300 [ 13.026215] bpf_probe_register+0x40/0x60 [ 13.026584] bpf_raw_tracepoint_open.isra.34+0xa4/0x130 [ 13.027042] __do_sys_bpf+0x94f/0x1a90 [ 13.027389] do_syscall_64+0x4a/0x180 [ 13.027727] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.028171] [ 13.028171] other info that might help us debug this: [ 13.028171] [ 13.028807] Chain exists of: [ 13.028807] tracepoints_mutex --> &cpuctx_mutex --> bpf_event_mutex [ 13.028807] [ 13.029666] Possible unsafe locking scenario: [ 13.029666] [ 13.030140] CPU0 CPU1 [ 13.030510] ---- ---- [ 13.030875] lock(bpf_event_mutex); [ 13.031166] lock(&cpuctx_mutex); [ 13.031645] lock(bpf_event_mutex); [ 13.032135] lock(tracepoints_mutex); [ 13.032441] [ 13.032441] *** DEADLOCK *** [ 13.032441] [ 13.032911] 1 lock held by test_progs/246: [ 13.033239] #0: 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60 [ 13.033909] [ 13.033909] stack backtrace: [ 13.034258] CPU: 1 PID: 246 Comm: test_progs Not tainted 5.0.0-rc3-00018-g2fa53f892422-dirty #477 [ 13.034964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 [ 13.035657] Call Trace: [ 13.035859] dump_stack+0x5f/0x8b [ 13.036130] print_circular_bug.isra.37+0x1ce/0x1db [ 13.036526] __lock_acquire+0x1158/0x1350 [ 13.036852] ? lock_acquire+0x98/0x190 [ 13.037154] lock_acquire+0x98/0x190 [ 13.037447] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.037876] __mutex_lock+0x86/0x970 [ 13.038167] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.038600] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.039028] ? __mutex_lock+0x86/0x970 [ 13.039337] ? __mutex_lock+0x24a/0x970 [ 13.039649] ? bpf_probe_register+0x1d/0x60 [ 13.039992] ? __bpf_trace_sched_wake_idle_without_ipi+0x10/0x10 [ 13.040478] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.040906] tracepoint_probe_register_prio+0x2d/0x300 [ 13.041325] bpf_probe_register+0x40/0x60 [ 13.041649] bpf_raw_tracepoint_open.isra.34+0xa4/0x130 [ 13.042068] ? __might_fault+0x3e/0x90 [ 13.042374] __do_sys_bpf+0x94f/0x1a90 [ 13.042678] do_syscall_64+0x4a/0x180 [ 13.042975] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.043382] RIP: 0033:0x7f23b10a07f9 [ 13.045155] RSP: 002b:00007ffdef42fdd8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141 [ 13.045759] RAX: ffffffffffffffda RBX: 00007ffdef42ff70 RCX: 00007f23b10a07f9 [ 13.046326] RDX: 0000000000000070 RSI: 00007ffdef42fe10 RDI: 0000000000000011 [ 13.046893] RBP: 00007ffdef42fdf0 R08: 0000000000000038 R09: 00007ffdef42fe10 [ 13.047462] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 [ 13.048029] R13: 0000000000000016 R14: 00007f23b1db4690 R15: 0000000000000000 Since tracepoints_mutex will be taken in tracepoint_probe_register/unregister() there is no need to take bpf_event_mutex too. bpf_event_mutex is protecting modifications to prog array used in kprobe/perf bpf progs. bpf_raw_tracepoints don't need to take this mutex. Fixes: c4f6699d ("bpf: introduce BPF_RAW_TRACEPOINT") Acked-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Matt Mullins 提交于
mainline inclusion from mainline-5.0 commit a38d1107f937 category: bugfix bugzilla: 9347 CVE: NA ------------------------------------------------- Distributions build drivers as modules, including network and filesystem drivers which export numerous tracepoints. This enables bpf(BPF_RAW_TRACEPOINT_OPEN) to attach to those tracepoints. Signed-off-by: NMatt Mullins <mmullins@fb.com> Acked-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Zhang Xiaoxu 提交于
euler inclusion category: bugfix Bugzilla: 5380 CVE: N/A ---------------------------------------- Syzkaller report UBSAN bug: UBSAN: Undefined behaviour in kernel/time/timekeeping.c:98:17 signed integer overflow: 8589935550743139462 + 2147483647000000000 cannot be represented in type 'long long int' Use add_time_safe instead add_time in tk_set_wall_to_mono. Signed-off-by: NZhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 zhangyi (F) 提交于
euler inclusion category: bugfix bugzilla: 9292 CVE: NA --------------------------- Commit d716ff71 ("tracing: Remove taking of trace_types_lock in pipe files") use the current tracer instead of the copy in tracing_open_pipe(), but it forget to remove the freeing sentence in the error path. Fixes: d716ff71 ("tracing: Remove taking of trace_types_lock in pipe files") Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Sergey Senozhatsky 提交于
euler inclusion category: bugfix bugzilla: 9509 CVE: NA ------------------------------------------------- Make printk_safe_enter_irqsave()/etc macros available to the rest of the kernel. Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-