- 27 12月, 2019 31 次提交
-
-
由 Petr Mladek 提交于
commit d5b844a2cf507fc7642c9ae80a9d585db3065c28 upstream. The commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") causes a possible deadlock between register_kprobe() and ftrace_run_update_code() when ftrace is using stop_machine(). The existing dependency chain (in reverse order) is: -> #1 (text_mutex){+.+.}: validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 __mutex_lock+0x88/0x908 mutex_lock_nested+0x32/0x40 register_kprobe+0x254/0x658 init_kprobes+0x11a/0x168 do_one_initcall+0x70/0x318 kernel_init_freeable+0x456/0x508 kernel_init+0x22/0x150 ret_from_fork+0x30/0x34 kernel_thread_starter+0x0/0xc -> #0 (cpu_hotplug_lock.rw_sem){++++}: check_prev_add+0x90c/0xde0 validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 cpus_read_lock+0x62/0xd0 stop_machine+0x2e/0x60 arch_ftrace_update_code+0x2e/0x40 ftrace_run_update_code+0x40/0xa0 ftrace_startup+0xb2/0x168 register_ftrace_function+0x64/0x88 klp_patch_object+0x1a2/0x290 klp_enable_patch+0x554/0x980 do_one_initcall+0x70/0x318 do_init_module+0x6e/0x250 load_module+0x1782/0x1990 __s390x_sys_finit_module+0xaa/0xf0 system_call+0xd8/0x2d0 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); It is similar problem that has been solved by the commit 2d1e38f5 ("kprobes: Cure hotplug lock ordering issues"). Many locks are involved. To be on the safe side, text_mutex must become a low level lock taken after cpu_hotplug_lock.rw_sem. This can't be achieved easily with the current ftrace design. For example, arm calls set_all_modules_text_rw() already in ftrace_arch_code_modify_prepare(), see arch/arm/kernel/ftrace.c. This functions is called: + outside stop_machine() from ftrace_run_update_code() + without stop_machine() from ftrace_module_enable() Fortunately, the problematic fix is needed only on x86_64. It is the only architecture that calls set_all_modules_text_rw() in ftrace path and supports livepatching at the same time. Therefore it is enough to move text_mutex handling from the generic kernel/trace/ftrace.c into arch/x86/kernel/ftrace.c: ftrace_arch_code_modify_prepare() ftrace_arch_code_modify_post_process() This patch basically reverts the ftrace part of the problematic commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race"). And provides x86_64 specific-fix. Some refactoring of the ftrace code will be needed when livepatching is implemented for arm or nds32. These architectures call set_all_modules_text_rw() and use stop_machine() at the same time. Link: http://lkml.kernel.org/r/20190627081334.12793-1-pmladek@suse.com Fixes: 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") Acked-by: NThomas Gleixner <tglx@linutronix.de> Reported-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NPetr Mladek <pmladek@suse.com> [ As reviewed by Miroslav Benes <mbenes@suse.cz>, removed return value of ftrace_run_update_code() as it is a void function. ] Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Eiichi Tsukata 提交于
commit 46cc0b44428d0f0e81f11ea98217fc0edfbeab07 upstream. Current snapshot implementation swaps two ring_buffers even though their sizes are different from each other, that can cause an inconsistency between the contents of buffer_size_kb file and the current buffer size. For example: # cat buffer_size_kb 7 (expanded: 1408) # echo 1 > events/enable # grep bytes per_cpu/cpu0/stats bytes: 1441020 # echo 1 > snapshot // current:1408, spare:1408 # echo 123 > buffer_size_kb // current:123, spare:1408 # echo 1 > snapshot // current:1408, spare:123 # grep bytes per_cpu/cpu0/stats bytes: 1443700 # cat buffer_size_kb 123 // != current:1408 And also, a similar per-cpu case hits the following WARNING: Reproducer: # echo 1 > per_cpu/cpu0/snapshot # echo 123 > buffer_size_kb # echo 1 > per_cpu/cpu0/snapshot WARNING: WARNING: CPU: 0 PID: 1946 at kernel/trace/trace.c:1607 update_max_tr_single.part.0+0x2b8/0x380 Modules linked in: CPU: 0 PID: 1946 Comm: bash Not tainted 5.2.0-rc6 #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 RIP: 0010:update_max_tr_single.part.0+0x2b8/0x380 Code: ff e8 dc da f9 ff 0f 0b e9 88 fe ff ff e8 d0 da f9 ff 44 89 ee bf f5 ff ff ff e8 33 dc f9 ff 41 83 fd f5 74 96 e8 b8 da f9 ff <0f> 0b eb 8d e8 af da f9 ff 0f 0b e9 bf fd ff ff e8 a3 da f9 ff 48 RSP: 0018:ffff888063e4fca0 EFLAGS: 00010093 RAX: ffff888066214380 RBX: ffffffff99850fe0 RCX: ffffffff964298a8 RDX: 0000000000000000 RSI: 00000000fffffff5 RDI: 0000000000000005 RBP: 1ffff1100c7c9f96 R08: ffff888066214380 R09: ffffed100c7c9f9b R10: ffffed100c7c9f9a R11: 0000000000000003 R12: 0000000000000000 R13: 00000000ffffffea R14: ffff888066214380 R15: ffffffff99851060 FS: 00007f9f8173c700(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000714dc0 CR3: 0000000066fa6000 CR4: 00000000000006f0 Call Trace: ? trace_array_printk_buf+0x140/0x140 ? __mutex_lock_slowpath+0x10/0x10 tracing_snapshot_write+0x4c8/0x7f0 ? trace_printk_init_buffers+0x60/0x60 ? selinux_file_permission+0x3b/0x540 ? tracer_preempt_off+0x38/0x506 ? trace_printk_init_buffers+0x60/0x60 __vfs_write+0x81/0x100 vfs_write+0x1e1/0x560 ksys_write+0x126/0x250 ? __ia32_sys_read+0xb0/0xb0 ? do_syscall_64+0x1f/0x390 do_syscall_64+0xc1/0x390 entry_SYSCALL_64_after_hwframe+0x49/0xbe This patch adds resize_buffer_duplicate_size() to check if there is a difference between current/spare buffer sizes and resize a spare buffer if necessary. Link: http://lkml.kernel.org/r/20190625012910.13109-1-devel@etsukata.com Cc: stable@vger.kernel.org Fixes: ad909e21 ("tracing: Add internal tracing_snapshot() functions") Signed-off-by: NEiichi Tsukata <devel@etsukata.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Josh Poimboeuf 提交于
[ Upstream commit 9f255b632bf12c4dd7fc31caee89aa991ef75176 ] It's possible for livepatch and ftrace to be toggling a module's text permissions at the same time, resulting in the following panic: BUG: unable to handle page fault for address: ffffffffc005b1d9 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 3ea0c067 P4D 3ea0c067 PUD 3ea0e067 PMD 3cc13067 PTE 3b8a1061 Oops: 0003 [#1] PREEMPT SMP PTI CPU: 1 PID: 453 Comm: insmod Tainted: G O K 5.2.0-rc1-a188339ca5 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014 RIP: 0010:apply_relocate_add+0xbe/0x14c Code: fa 0b 74 21 48 83 fa 18 74 38 48 83 fa 0a 75 40 eb 08 48 83 38 00 74 33 eb 53 83 38 00 75 4e 89 08 89 c8 eb 0a 83 38 00 75 43 <89> 08 48 63 c1 48 39 c8 74 2e eb 48 83 38 00 75 32 48 29 c1 89 08 RSP: 0018:ffffb223c00dbb10 EFLAGS: 00010246 RAX: ffffffffc005b1d9 RBX: 0000000000000000 RCX: ffffffff8b200060 RDX: 000000000000000b RSI: 0000004b0000000b RDI: ffff96bdfcd33000 RBP: ffffb223c00dbb38 R08: ffffffffc005d040 R09: ffffffffc005c1f0 R10: ffff96bdfcd33c40 R11: ffff96bdfcd33b80 R12: 0000000000000018 R13: ffffffffc005c1f0 R14: ffffffffc005e708 R15: ffffffff8b2fbc74 FS: 00007f5f447beba8(0000) GS:ffff96bdff900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffc005b1d9 CR3: 000000003cedc002 CR4: 0000000000360ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: klp_init_object_loaded+0x10f/0x219 ? preempt_latency_start+0x21/0x57 klp_enable_patch+0x662/0x809 ? virt_to_head_page+0x3a/0x3c ? kfree+0x8c/0x126 patch_init+0x2ed/0x1000 [livepatch_test02] ? 0xffffffffc0060000 do_one_initcall+0x9f/0x1c5 ? kmem_cache_alloc_trace+0xc4/0xd4 ? do_init_module+0x27/0x210 do_init_module+0x5f/0x210 load_module+0x1c41/0x2290 ? fsnotify_path+0x3b/0x42 ? strstarts+0x2b/0x2b ? kernel_read+0x58/0x65 __do_sys_finit_module+0x9f/0xc3 ? __do_sys_finit_module+0x9f/0xc3 __x64_sys_finit_module+0x1a/0x1c do_syscall_64+0x52/0x61 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The above panic occurs when loading two modules at the same time with ftrace enabled, where at least one of the modules is a livepatch module: CPU0 CPU1 klp_enable_patch() klp_init_object_loaded() module_disable_ro() ftrace_module_enable() ftrace_arch_code_modify_post_process() set_all_modules_text_ro() klp_write_object_relocations() apply_relocate_add() *patches read-only code* - BOOM A similar race exists when toggling ftrace while loading a livepatch module. Fix it by ensuring that the livepatch and ftrace code patching operations -- and their respective permissions changes -- are protected by the text_mutex. Link: http://lkml.kernel.org/r/ab43d56ab909469ac5d2520c5d944ad6d4abd476.1560474114.git.jpoimboe@redhat.comReported-by: NJohannes Erdfelt <johannes@erdfelt.com> Fixes: 444d13ff ("modules: add ro_after_init support") Acked-by: NJessica Yu <jeyu@kernel.org> Reviewed-by: NPetr Mladek <pmladek@suse.com> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Vasily Gorbik 提交于
[ Upstream commit cbdaeaf050b730ea02e9ab4ff844ce54d85dbe1d ] Selecting HAVE_NOP_MCOUNT enables -mnop-mcount (if gcc supports it) and sets CC_USING_NOP_MCOUNT. Reuse __is_defined (which is suitable for testing CC_USING_* defines) to avoid conditional compilation and fix the following gcc 9 warning on s390: kernel/trace/ftrace.c:2514:1: warning: ‘ftrace_code_disable’ defined but not used [-Wunused-function] Link: http://lkml.kernel.org/r/patch.git-1a82d13f33ac.your-ad-here.call-01559732716-ext-6629@work.hours Fixes: 2f4df001 ("tracing: Add -mcount-nop option support") Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Matt Mullins 提交于
commit 9594dc3c7e71b9f52bee1d7852eb3d4e3aea9e99 upstream. BPF_PROG_TYPE_RAW_TRACEPOINTs can be executed nested on the same CPU, as they do not increment bpf_prog_active while executing. This enables three levels of nesting, to support - a kprobe or raw tp or perf event, - another one of the above that irq context happens to call, and - another one in nmi context (at most one of which may be a kprobe or perf event). Fixes: 20b9d7ac ("bpf: avoid excessive stack usage for perf_sample_data") Signed-off-by: NMatt Mullins <mmullins@fb.com> Acked-by: NAndrii Nakryiko <andriin@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Sasha Levin 提交于
This reverts commit 1a3188d737ceb922166d8fe78a5fc4f89907e31b, which was upstream commit 4a6c91fbdef846ec7250b82f2eeeb87ac5f18cf9. On Tue, Jun 25, 2019 at 09:39:45AM +0200, Sebastian Andrzej Siewior wrote: >Please backport commit e74deb11931ff682b59d5b9d387f7115f689698e to >stable _or_ revert the backport of commit 4a6c91fbdef84 ("x86/uaccess, >ftrace: Fix ftrace_likely_update() vs. SMAP"). It uses >user_access_{save|restore}() which has been introduced in the following >commit. Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Miguel Ojeda 提交于
commit 0c97bf863efce63d6ab7971dad811601e6171d2f upstream. Starting with GCC 9, -Warray-bounds detects cases when memset is called starting on a member of a struct but the size to be cleared ends up writing over further members. Such a call happens in the trace code to clear, at once, all members after and including `seq` on struct trace_iterator: In function 'memset', inlined from 'ftrace_dump' at kernel/trace/trace.c:8914:3: ./include/linux/string.h:344:9: warning: '__builtin_memset' offset [8505, 8560] from the object at 'iter' is out of the bounds of referenced subobject 'seq' with type 'struct trace_seq' at offset 4368 [-Warray-bounds] 344 | return __builtin_memset(p, c, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In order to avoid GCC complaining about it, we compute the address ourselves by adding the offsetof distance instead of referring directly to the member. Since there are two places doing this clear (trace.c and trace_kdb.c), take the chance to move the workaround into a single place in the internal header. Link: http://lkml.kernel.org/r/20190523124535.GA12931@gmail.comSigned-off-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com> [ Removed unnecessary parenthesis around "iter" ] Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Tom Zanussi 提交于
[ Upstream commit 55267c88c003a3648567beae7c90512d3e2ab15e ] hist_field_var_ref() is an implementation of hist_field_fn_t(), which can be called with a null tracing_map_elt elt param when assembling a key in event_hist_trigger(). In the case of hist_field_var_ref() this doesn't make sense, because a variable can only be resolved by looking it up using an already assembled key i.e. a variable can't be used to assemble a key since the key is required in order to access the variable. Upper layers should prevent the user from constructing a key using a variable in the first place, but in case one slips through, it shouldn't cause a NULL pointer dereference. Also if one does slip through, we want to know about it, so emit a one-time warning in that case. Link: http://lkml.kernel.org/r/64ec8dc15c14d305295b64cdfcc6b2b9dd14753f.1555597045.git.tom.zanussi@linux.intel.comReported-by: NVincent Bernat <vincent@bernat.ch> Signed-off-by: NTom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tomas Bortoli 提交于
commit dfb4a6f2191a80c8b790117d0ff592fd712d3296 upstream. In case of errors, predicate_parse() goes to the out_free label to free memory and to return an error code. However, predicate_parse() does not free the predicates of the temporary prog_stack array, thence leaking them. Link: http://lkml.kernel.org/r/20190528154338.29976-1-tomasbortoli@gmail.com Cc: stable@vger.kernel.org Fixes: 80765597 ("tracing: Rewrite filter logic to be simpler and faster") Reported-by: syzbot+6b8e0fb820e570c59e19@syzkaller.appspotmail.com Signed-off-by: NTomas Bortoli <tomasbortoli@gmail.com> [ Added protection around freeing prog_stack[i].pred ] Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Wei Li 提交于
hulk inclusion category: bugfix bugzilla: 16533 CVE: NA ------------------------------------------------- The mapper may be NULL when called from register_ftrace_function_probe() with probe->data == NULL. This issue can be reproduced as follow (it may be coverd by compiler optimization sometime): / # cat /sys/kernel/debug/tracing/set_ftrace_filter #### all functions enabled #### /sys/kernel/debug/tracing # echo do_trap:dump > set_ftrace_filter [ 559.030635] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 559.034017] Mem abort info: [ 559.034346] ESR = 0x96000006 [ 559.035154] Exception class = DABT (current EL), IL = 32 bits [ 559.038036] SET = 0, FnV = 0 [ 559.038403] EA = 0, S1PTW = 0 [ 559.041292] Data abort info: [ 559.041697] ISV = 0, ISS = 0x00000006 [ 559.042081] CM = 0, WnR = 0 [ 559.042655] user pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____) [ 559.043125] [0000000000000000] pgd=0000000429dc5003, pud=0000000429dc6003, pmd=0000000000000000 [ 559.046981] Internal error: Oops: 96000006 [#1] SMP [ 559.050625] Dumping ftrace buffer: [ 559.053384] (ftrace buffer empty) Entering kdb (current=0xffff8003e8c457c0, pid 232) on processor 6 Oops: (null) due to oops @ 0xffff000008370f14 CPU: 6 PID: 232 Comm: sh Not tainted 4.19.39+ #29 Hardware name: linux,dummy-virt (DT) pstate: 60000005 (nZCv daif -PAN -UAO) pc : free_ftrace_func_mapper+0x2c/0x118 lr : ftrace_count_free+0x68/0x80 sp : ffff00000eceba30 x29: ffff00000eceba30 x28: ffff8003e8d61480 x27: ffff00000adeedb0 x26: 0000000000000001 x25: ffff00000b5d0000 x24: 0000000000000000 x23: ffff00000b1ea000 x22: ffff00000adee000 x21: 0000000000000000 x20: ffff00000b5d05e8 x19: ffff00000b5e2e90 x18: ffff8003e8ee0380 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: ffff000009f63400 x13: 000000000026def4 x12: 0000000000002200 x11: 0000000000000000 x10: 5a5a5a5a5a5a5a5a x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff00000ad39748 x5 : ffff0000083a4cf8 x4 : 0000000000000001 x3 : 0000000000000001 x2 : ffff00000b5e2c88 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: free_ftrace_func_mapper+0x2c/0x118 ftrace_count_free+0x68/0x80 release_probe+0xfc/0x1d0 register_ftrace_function_probe+0x4b0/0x870 ftrace_trace_probe_callback.isra.4+0xb8/0x180 ftrace_dump_callback+0x50/0x70 ftrace_regex_write.isra.30+0x290/0x3a8 ftrace_filter_write+0x44/0x60 __vfs_write+0x78/0x320 vfs_write+0x14c/0x2d8 ksys_write+0xa0/0x168 __arm64_sys_write+0x3c/0x58 el0_svc_common+0x41c/0x610 el0_svc_handler+0x148/0x1d0 el0_svc+0x8/0xc Signed-off-by: NWei Li <liwei391@huawei.com> Reviewed-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Peter Zijlstra 提交于
[ Upstream commit 4a6c91fbdef846ec7250b82f2eeeb87ac5f18cf9 ] For CONFIG_TRACE_BRANCH_PROFILING=y the likely/unlikely things get overloaded and generate callouts to this code, and thus also when AC=1. Make it safe. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Elazar Leibovich 提交于
commit cbe08bcbbe787315c425dde284dcb715cfbf3f39 upstream. When reading only part of the id file, the ppos isn't tracked correctly. This is taken care by simple_read_from_buffer. Reading a single byte, and then the next byte would result EOF. While this seems like not a big deal, this breaks abstractions that reads information from files unbuffered. See for example https://github.com/golang/go/issues/29399 This code was mentioned as problematic in commit cd458ba9 ("tracing: Do not (ab)use trace_seq in event_id_read()") An example C code that show this bug is: #include <stdio.h> #include <stdint.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> int main(int argc, char **argv) { if (argc < 2) return 1; int fd = open(argv[1], O_RDONLY); char c; read(fd, &c, 1); printf("First %c\n", c); read(fd, &c, 1); printf("Second %c\n", c); } Then run with, e.g. sudo ./a.out /sys/kernel/debug/tracing/events/tcp/tcp_set_state/id You'll notice you're getting the first character twice, instead of the first two characters in the id file. Link: http://lkml.kernel.org/r/20181231115837.4932-1-elazar@lightbitslabs.com Cc: Orit Wasserman <orit.was@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 23725aee ("ftrace: provide an id file for each event") Signed-off-by: NElazar Leibovich <elazar@lightbitslabs.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Peter Zijlstra 提交于
commit d6097c9e4454adf1f8f2c9547c2fa6060d55d952 upstream. Unless the very next line is schedule(), or implies it, one must not use preempt_enable_no_resched(). It can cause a preemption to go missing and thereby cause arbitrary delays, breaking the PREEMPT=y invariant. Link: http://lkml.kernel.org/r/20190423200318.GY14281@hirez.programming.kicks-ass.net Cc: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: the arch/x86 maintainers <x86@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: huang ying <huang.ying.caritas@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: stable@vger.kernel.org Fixes: 2c2d7329 ("tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jann Horn 提交于
commit b987222654f84f7b4ca95b3a55eca784cb30235b upstream. This fixes multiple issues in buffer_pipe_buf_ops: - The ->steal() handler must not return zero unless the pipe buffer has the only reference to the page. But generic_pipe_buf_steal() assumes that every reference to the pipe is tracked by the page's refcount, which isn't true for these buffers - buffer_pipe_buf_get(), which duplicates a buffer, doesn't touch the page's refcount. Fix it by using generic_pipe_buf_nosteal(), which refuses every attempted theft. It should be easy to actually support ->steal, but the only current users of pipe_buf_steal() are the virtio console and FUSE, and they also only use it as an optimization. So it's probably not worth the effort. - The ->get() and ->release() handlers can be invoked concurrently on pipe buffers backed by the same struct buffer_ref. Make them safe against concurrency by using refcount_t. - The pointers stored in ->private were only zeroed out when the last reference to the buffer_ref was dropped. As far as I know, this shouldn't be necessary anyway, but if we do it, let's always do it. Link: http://lkml.kernel.org/r/20190404215925.253531-1-jannh@google.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Fixes: 73a757e6 ("ring-buffer: Return reader page back into existing ring buffer") Signed-off-by: NJann Horn <jannh@google.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Conflicts: kernel/trace/trace.c include/linux/pipe_fs_i.h [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Wenwen Wang 提交于
commit 91862cc7867bba4ee5c8fcf0ca2f1d30427b6129 upstream. In trace_pid_write(), the buffer for trace parser is allocated through kmalloc() in trace_parser_get_init(). Later on, after the buffer is used, it is then freed through kfree() in trace_parser_put(). However, it is possible that trace_pid_write() is terminated due to unexpected errors, e.g., ENOMEM. In that case, the allocated buffer will not be freed, which is a memory leak bug. To fix this issue, free the allocated buffer when an error is encountered. Link: http://lkml.kernel.org/r/1555726979-15633-1-git-send-email-wang6495@umn.edu Fixes: f4d34a87 ("tracing: Use pid bitmap instead of a pid array for set_event_pid") Cc: stable@vger.kernel.org Signed-off-by: NWenwen Wang <wang6495@umn.edu> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Masami Hiramatsu 提交于
commit fabe38ab6b2bd9418350284c63825f13b8a6abba upstream. Mark ftrace mcount handler functions nokprobe since probing on these functions with kretprobe pushes return address incorrectly on kretprobe shadow stack. Reported-by: NFrancis Deslauriers <francis.deslauriers@efficios.com> Tested-by: NAndrea Righi <righi.andrea@gmail.com> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devboxSigned-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Matthew Wilcox 提交于
mainline inclusion from mainline-5.1-rc5 commit 15fab63e1e57be9fdb5eec1bbc5916e9825e9acb category: 13690 bugzilla: NA CVE: CVE-2019-11487 There are four commits to fix this CVE: fs: prevent page refcount overflow in pipe_buf_get mm: prevent get_user_pages() from overflowing page refcount mm: add 'try_get_page()' helper function mm: make page ref count overflow check tighter and more explicit ------------------------------------------------- Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: NJann Horn <jannh@google.com> Signed-off-by: NMatthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: Nzhong jiang <zhongjiang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 luojiajun 提交于
euler inclusion category: bugfix bugzilla: 14007 CVE: NA ------------------------------------------------- In commit b8f8307ae1e9 ("pid_ns: Make pid_max per namespace"), pid_max is made per namespace, so global pid_max in trace.h is unused now. Remove it. Signed-off-by: Nluojiajun <luojiajun3@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Li Zefan 提交于
euler inclusion category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: Nluojiajun <luojiajun3@huawei.com> Reviewed-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Douglas Anderson 提交于
mainline inclusion from mainline-5.1-rc1 commit 31b265b3baaf category: bugfix bugzilla: 12725 CVE: NA ------------------------------------------------- As reported back in 2016-11 [1], the "ftdump" kdb command triggers a BUG for "sleeping function called from invalid context". kdb's "ftdump" command wants to call ring_buffer_read_prepare() in atomic context. A very simple solution for this is to add allocation flags to ring_buffer_read_prepare() so kdb can call it without triggering the allocation error. This patch does that. Note that in the original email thread about this, it was suggested that perhaps the solution for kdb was to either preallocate the buffer ahead of time or create our own iterator. I'm hoping that this alternative of adding allocation flags to ring_buffer_read_prepare() can be considered since it means I don't need to duplicate more of the core trace code into "trace_kdb.c" (for either creating my own iterator or re-preparing a ring allocator whose memory was already allocated). NOTE: another option for kdb is to actually figure out how to make it reuse the existing ftrace_dump() function and totally eliminate the duplication. This sounds very appealing and actually works (the "sr z" command can be seen to properly dump the ftrace buffer). The downside here is that ftrace_dump() fully consumes the trace buffer. Unless that is changed I'd rather not use it because it means "ftdump | grep xyz" won't be very useful to search the ftrace buffer since it will throw away the whole trace on the first grep. A future patch to dump only the last few lines of the buffer will also be hard to implement. [1] https://lkml.kernel.org/r/20161117191605.GA21459@google.com Link: http://lkml.kernel.org/r/20190308193205.213659-1-dianders@chromium.orgReported-by: NBrian Norris <briannorris@chromium.org> Signed-off-by: NDouglas Anderson <dianders@chromium.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jann Horn 提交于
commit 83540fbc8812a580b6ad8f93f4c29e62e417687e upstream. The first version of this method was missing the check for `ret == PATH_MAX`; then such a check was added, but it didn't call kfree() on error, so there was still a small memory leak in the error case. Fix it by using strndup_user() instead of open-coding it. Link: http://lkml.kernel.org/r/20190220165443.152385-1-jannh@google.com Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Fixes: 0eadcc7a ("perf/core: Fix perf_uprobe_init()") Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org> Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NJann Horn <jannh@google.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tom Zanussi 提交于
commit 9f0bbf3115ca9f91f43b7c74e9ac7d79f47fc6c2 upstream. Because there may be random garbage beyond a string's null terminator, it's not correct to copy the the complete character array for use as a hist trigger key. This results in multiple histogram entries for the 'same' string key. So, in the case of a string key, use strncpy instead of memcpy to avoid copying in the extra bytes. Before, using the gdbus entries in the following hist trigger as an example: # echo 'hist:key=comm' > /sys/kernel/debug/tracing/events/sched/sched_waking/trigger # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist ... { comm: ImgDecoder #4 } hitcount: 203 { comm: gmain } hitcount: 213 { comm: gmain } hitcount: 216 { comm: StreamTrans #73 } hitcount: 221 { comm: mozStorage #3 } hitcount: 230 { comm: gdbus } hitcount: 233 { comm: StyleThread#5 } hitcount: 253 { comm: gdbus } hitcount: 256 { comm: gdbus } hitcount: 260 { comm: StyleThread#4 } hitcount: 271 ... # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l 51 After: # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l 1 Link: http://lkml.kernel.org/r/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756.git.tom.zanussi@linux.intel.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Fixes: 79e577cb ("tracing: Support string type key properly") Signed-off-by: NTom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Pavel Tikhomirov 提交于
commit 6a072128d262d2b98d31626906a96700d1fc11eb upstream. Then tracing syscall exit event it is extremely useful to filter exit codes equal to some negative value, to react only to required errors. But negative numbers does not work: [root@snorch sys_exit_read]# echo "ret == -1" > filter bash: echo: write error: Invalid argument [root@snorch sys_exit_read]# cat filter ret == -1 ^ parse_error: Invalid value (did you forget quotes)? Similar thing happens when setting triggers. These is a regression in v4.17 introduced by the commit mentioned below, testing without these commit shows no problem with negative numbers. Link: http://lkml.kernel.org/r/20180823102534.7642-1-ptikhomirov@virtuozzo.com Cc: stable@vger.kernel.org Fixes: 80765597 ("tracing: Rewrite filter logic to be simpler and faster") Signed-off-by: NPavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Quentin Perret 提交于
commit 9e7382153f80ba45a0bbcd540fb77d4b15f6e966 upstream. The following commit 441dae8f ("tracing: Add support for display of tgid in trace output") removed the call to print_event_info() from print_func_help_header_irq() which results in the ftrace header not reporting the number of entries written in the buffer. As this wasn't the original intent of the patch, re-introduce the call to print_event_info() to restore the orginal behaviour. Link: http://lkml.kernel.org/r/20190214152950.4179-1-quentin.perret@arm.comAcked-by: NJoel Fernandes <joelaf@google.com> Cc: stable@vger.kernel.org Fixes: 441dae8f ("tracing: Add support for display of tgid in trace output") Signed-off-by: NQuentin Perret <quentin.perret@arm.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Andreas Ziegler 提交于
commit 0722069a upstream. When printing multiple uprobe arguments as strings the output for the earlier arguments would also include all later string arguments. This is best explained in an example: Consider adding a uprobe to a function receiving two strings as parameters which is at offset 0xa0 in strlib.so and we want to print both parameters when the uprobe is hit (on x86_64): $ echo 'p:func /lib/strlib.so:0xa0 +0(%di):string +0(%si):string' > \ /sys/kernel/debug/tracing/uprobe_events When the function is called as func("foo", "bar") and we hit the probe, the trace file shows a line like the following: [...] func: (0x7f7e683706a0) arg1="foobar" arg2="bar" Note the extra "bar" printed as part of arg1. This behaviour stacks up for additional string arguments. The strings are stored in a dynamically growing part of the uprobe buffer by fetch_store_string() after copying them from userspace via strncpy_from_user(). The return value of strncpy_from_user() is then directly used as the required size for the string. However, this does not take the terminating null byte into account as the documentation for strncpy_from_user() cleary states that it "[...] returns the length of the string (not including the trailing NUL)" even though the null byte will be copied to the destination. Therefore, subsequent calls to fetch_store_string() will overwrite the terminating null byte of the most recently fetched string with the first character of the current string, leading to the "accumulation" of strings in earlier arguments in the output. Fix this by incrementing the return value of strncpy_from_user() by one if we did not hit the maximum buffer size. Link: http://lkml.kernel.org/r/20190116141629.5752-1-andreas.ziegler@fau.de Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 5baaa59e ("tracing/probes: Implement 'memory' fetch method for uprobes") Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NAndreas Ziegler <andreas.ziegler@fau.de> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alexei Starovoitov 提交于
mainline inclusion from mainline-5.0 commit e16ec34039c7 category: bugfix bugzilla: 9347 CVE: NA ------------------------------------------------- Lockdep found a potential deadlock between cpu_hotplug_lock, bpf_event_mutex, and cpuctx_mutex: [ 13.007000] WARNING: possible circular locking dependency detected [ 13.007587] 5.0.0-rc3-00018-g2fa53f892422-dirty #477 Not tainted [ 13.008124] ------------------------------------------------------ [ 13.008624] test_progs/246 is trying to acquire lock: [ 13.009030] 0000000094160d1d (tracepoints_mutex){+.+.}, at: tracepoint_probe_register_prio+0x2d/0x300 [ 13.009770] [ 13.009770] but task is already holding lock: [ 13.010239] 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60 [ 13.010877] [ 13.010877] which lock already depends on the new lock. [ 13.010877] [ 13.011532] [ 13.011532] the existing dependency chain (in reverse order) is: [ 13.012129] [ 13.012129] -> #4 (bpf_event_mutex){+.+.}: [ 13.012582] perf_event_query_prog_array+0x9b/0x130 [ 13.013016] _perf_ioctl+0x3aa/0x830 [ 13.013354] perf_ioctl+0x2e/0x50 [ 13.013668] do_vfs_ioctl+0x8f/0x6a0 [ 13.014003] ksys_ioctl+0x70/0x80 [ 13.014320] __x64_sys_ioctl+0x16/0x20 [ 13.014668] do_syscall_64+0x4a/0x180 [ 13.015007] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.015469] [ 13.015469] -> #3 (&cpuctx_mutex){+.+.}: [ 13.015910] perf_event_init_cpu+0x5a/0x90 [ 13.016291] perf_event_init+0x1b2/0x1de [ 13.016654] start_kernel+0x2b8/0x42a [ 13.016995] secondary_startup_64+0xa4/0xb0 [ 13.017382] [ 13.017382] -> #2 (pmus_lock){+.+.}: [ 13.017794] perf_event_init_cpu+0x21/0x90 [ 13.018172] cpuhp_invoke_callback+0xb3/0x960 [ 13.018573] _cpu_up+0xa7/0x140 [ 13.018871] do_cpu_up+0xa4/0xc0 [ 13.019178] smp_init+0xcd/0xd2 [ 13.019483] kernel_init_freeable+0x123/0x24f [ 13.019878] kernel_init+0xa/0x110 [ 13.020201] ret_from_fork+0x24/0x30 [ 13.020541] [ 13.020541] -> #1 (cpu_hotplug_lock.rw_sem){++++}: [ 13.021051] static_key_slow_inc+0xe/0x20 [ 13.021424] tracepoint_probe_register_prio+0x28c/0x300 [ 13.021891] perf_trace_event_init+0x11f/0x250 [ 13.022297] perf_trace_init+0x6b/0xa0 [ 13.022644] perf_tp_event_init+0x25/0x40 [ 13.023011] perf_try_init_event+0x6b/0x90 [ 13.023386] perf_event_alloc+0x9a8/0xc40 [ 13.023754] __do_sys_perf_event_open+0x1dd/0xd30 [ 13.024173] do_syscall_64+0x4a/0x180 [ 13.024519] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.024968] [ 13.024968] -> #0 (tracepoints_mutex){+.+.}: [ 13.025434] __mutex_lock+0x86/0x970 [ 13.025764] tracepoint_probe_register_prio+0x2d/0x300 [ 13.026215] bpf_probe_register+0x40/0x60 [ 13.026584] bpf_raw_tracepoint_open.isra.34+0xa4/0x130 [ 13.027042] __do_sys_bpf+0x94f/0x1a90 [ 13.027389] do_syscall_64+0x4a/0x180 [ 13.027727] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.028171] [ 13.028171] other info that might help us debug this: [ 13.028171] [ 13.028807] Chain exists of: [ 13.028807] tracepoints_mutex --> &cpuctx_mutex --> bpf_event_mutex [ 13.028807] [ 13.029666] Possible unsafe locking scenario: [ 13.029666] [ 13.030140] CPU0 CPU1 [ 13.030510] ---- ---- [ 13.030875] lock(bpf_event_mutex); [ 13.031166] lock(&cpuctx_mutex); [ 13.031645] lock(bpf_event_mutex); [ 13.032135] lock(tracepoints_mutex); [ 13.032441] [ 13.032441] *** DEADLOCK *** [ 13.032441] [ 13.032911] 1 lock held by test_progs/246: [ 13.033239] #0: 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60 [ 13.033909] [ 13.033909] stack backtrace: [ 13.034258] CPU: 1 PID: 246 Comm: test_progs Not tainted 5.0.0-rc3-00018-g2fa53f892422-dirty #477 [ 13.034964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 [ 13.035657] Call Trace: [ 13.035859] dump_stack+0x5f/0x8b [ 13.036130] print_circular_bug.isra.37+0x1ce/0x1db [ 13.036526] __lock_acquire+0x1158/0x1350 [ 13.036852] ? lock_acquire+0x98/0x190 [ 13.037154] lock_acquire+0x98/0x190 [ 13.037447] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.037876] __mutex_lock+0x86/0x970 [ 13.038167] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.038600] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.039028] ? __mutex_lock+0x86/0x970 [ 13.039337] ? __mutex_lock+0x24a/0x970 [ 13.039649] ? bpf_probe_register+0x1d/0x60 [ 13.039992] ? __bpf_trace_sched_wake_idle_without_ipi+0x10/0x10 [ 13.040478] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.040906] tracepoint_probe_register_prio+0x2d/0x300 [ 13.041325] bpf_probe_register+0x40/0x60 [ 13.041649] bpf_raw_tracepoint_open.isra.34+0xa4/0x130 [ 13.042068] ? __might_fault+0x3e/0x90 [ 13.042374] __do_sys_bpf+0x94f/0x1a90 [ 13.042678] do_syscall_64+0x4a/0x180 [ 13.042975] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.043382] RIP: 0033:0x7f23b10a07f9 [ 13.045155] RSP: 002b:00007ffdef42fdd8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141 [ 13.045759] RAX: ffffffffffffffda RBX: 00007ffdef42ff70 RCX: 00007f23b10a07f9 [ 13.046326] RDX: 0000000000000070 RSI: 00007ffdef42fe10 RDI: 0000000000000011 [ 13.046893] RBP: 00007ffdef42fdf0 R08: 0000000000000038 R09: 00007ffdef42fe10 [ 13.047462] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 [ 13.048029] R13: 0000000000000016 R14: 00007f23b1db4690 R15: 0000000000000000 Since tracepoints_mutex will be taken in tracepoint_probe_register/unregister() there is no need to take bpf_event_mutex too. bpf_event_mutex is protecting modifications to prog array used in kprobe/perf bpf progs. bpf_raw_tracepoints don't need to take this mutex. Fixes: c4f6699d ("bpf: introduce BPF_RAW_TRACEPOINT") Acked-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Matt Mullins 提交于
mainline inclusion from mainline-5.0 commit a38d1107f937 category: bugfix bugzilla: 9347 CVE: NA ------------------------------------------------- Distributions build drivers as modules, including network and filesystem drivers which export numerous tracepoints. This enables bpf(BPF_RAW_TRACEPOINT_OPEN) to attach to those tracepoints. Signed-off-by: NMatt Mullins <mmullins@fb.com> Acked-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 zhangyi (F) 提交于
euler inclusion category: bugfix bugzilla: 9292 CVE: NA --------------------------- Commit d716ff71 ("tracing: Remove taking of trace_types_lock in pipe files") use the current tracer instead of the copy in tracing_open_pipe(), but it forget to remove the freeing sentence in the error path. Fixes: d716ff71 ("tracing: Remove taking of trace_types_lock in pipe files") Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com> Reviewed-by: NLi Bin <huawei.libin@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Andreas Ziegler 提交于
commit ea6eb5e7d15e1838de335609994b4546e2abcaaf upstream. The subsystem-specific message prefix for uprobes was also "trace_kprobe: " instead of "trace_uprobe: " as described in the original commit message. Link: http://lkml.kernel.org/r/20190117133023.19292-1-andreas.ziegler@fau.de Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Fixes: 72576341 ("tracing/probe: Show subsystem name in messages") Signed-off-by: NAndreas Ziegler <andreas.ziegler@fau.de> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Dan Carpenter 提交于
mainline inclusion from mainline-5.0 commit ca16b0fbb052 category: bugfix bugzilla: 5788 CVE: NA ------------------------------------------------- Dan Carpenter reviewed the trace_stack.c code and figured he found an off by one bug. "From reviewing the code, it seems possible for stack_trace_max.nr_entries to be set to .max_entries and in that case we would be reading one element beyond the end of the stack_dump_trace[] array. If it's not set to .max_entries then the bug doesn't affect runtime." Although it looks to be the case, it is not. Because we have: static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] = { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX }; struct stack_trace stack_trace_max = { .max_entries = STACK_TRACE_ENTRIES - 1, .entries = &stack_dump_trace[0], }; And: stack_trace_max.nr_entries = x; for (; x < i; x++) stack_dump_trace[x] = ULONG_MAX; Even if nr_entries equals max_entries, indexing with it into the stack_dump_trace[] array will not overflow the array. But if it is the case, the second part of the conditional that tests stack_dump_trace[nr_entries] to ULONG_MAX will always be true. By applying Dan's patch, it removes the subtle aspect of it and makes the if conditional slightly more efficient. Link: http://lkml.kernel.org/r/20180620110758.crunhd5bfep7zuiz@kili.mountainSigned-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NCheng Jian <cj.chengjian@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> -
由 Linus Torvalds 提交于
mainline inclusion from mainline-5.0-rc1 commit 96d4f267e40f9509e8a66e2b39e8b95655617693 category: cleanup bugzilla: 9284 CVE: NA It's a cleanup patch that prepare for applying CVE-2018-20669 patch 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'") ------------------------------------------------- Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Conflicts: drivers/media/v4l2-core/v4l2-compat-ioctl32.c drivers/infiniband/core/uverbs_main.c drivers/platform/goldfish/goldfish_pipe.c fs/namespace.c fs/select.c kernel/compat.c arch/powerpc/include/asm/uaccess.h arch/arm64/kernel/perf_callchain.c arch/arm64/include/asm/uaccess.h arch/ia64/kernel/signal.c arch/x86/entry/vsyscall/vsyscall_64.c Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 20 12月, 2018 3 次提交
-
-
由 Steven Rostedt (VMware) 提交于
commit 2840f84f74035e5a535959d5f17269c69fa6edc5 upstream. The following commands will cause a memory leak: # cd /sys/kernel/tracing # mkdir instances/foo # echo schedule > instance/foo/set_ftrace_filter # rmdir instances/foo The reason is that the hashes that hold the filters to set_ftrace_filter and set_ftrace_notrace are not freed if they contain any data on the instance and the instance is removed. Found by kmemleak detector. Cc: stable@vger.kernel.org Fixes: 591dffda ("ftrace: Allow for function tracing instance to filter functions") Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Steven Rostedt (VMware) 提交于
commit 3cec638b3d793b7cacdec5b8072364b41caeb0e1 upstream. When create_event_filter() fails in set_trigger_filter(), the filter may still be allocated and needs to be freed. The caller expects the data->filter to be updated with the new filter, even if the new filter failed (we could add an error message by setting set_str parameter of create_event_filter(), but that's another update). But because the error would just exit, filter was left hanging and nothing could free it. Found by kmemleak detector. Cc: stable@vger.kernel.org Fixes: bac5fb97 ("tracing: Add and use generic set_trigger_filter() implementation") Reviewed-by: NTom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Steven Rostedt (VMware) 提交于
commit b61c19209c2c35ea2a2fe502d484703686eba98c upstream. The create_filter() calls create_filter_start() which allocates a "parse_error" descriptor, but fails to call create_filter_finish() that frees it. The op_stack and inverts in predicate_parse() were also not freed. Found by kmemleak detector. Cc: stable@vger.kernel.org Fixes: 80765597 ("tracing: Rewrite filter logic to be simpler and faster") Reviewed-by: NTom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 17 12月, 2018 1 次提交
-
-
由 Martynas Pumputis 提交于
[ Upstream commit 1efb6ee3edea57f57f9fb05dba8dcb3f7333f61f ] A format string consisting of "%p" or "%s" followed by an invalid specifier (e.g. "%p%\n" or "%s%") could pass the check which would make format_decode (lib/vsprintf.c) to warn. Fixes: 9c959c86 ("tracing: Allow BPF programs to call bpf_trace_printk()") Reported-by: syzbot+1ec5c5ec949c4adaa0c4@syzkaller.appspotmail.com Signed-off-by: NMartynas Pumputis <m@lambda.lt> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 08 12月, 2018 1 次提交
-
-
由 Steven Rostedt (VMware) 提交于
commit 5cf99a0f3161bc3ae2391269d134d6bf7e26f00e upstream. The tracefs file set_graph_function is used to only function graph functions that are listed in that file (or all functions if the file is empty). The way this is implemented is that the function graph tracer looks at every function, and if the current depth is zero and the function matches something in the file then it will trace that function. When other functions are called, the depth will be greater than zero (because the original function will be at depth zero), and all functions will be traced where the depth is greater than zero. The issue is that when a function is first entered, and the handler that checks this logic is called, the depth is set to zero. If an interrupt comes in and a function in the interrupt handler is traced, its depth will be greater than zero and it will automatically be traced, even if the original function was not. But because the logic only looks at depth it may trace interrupts when it should not be. The recent design change of the function graph tracer to fix other bugs caused the depth to be zero while the function graph callback handler is being called for a longer time, widening the race of this happening. This bug was actually there for a longer time, but because the race window was so small it seldom happened. The Fixes tag below is for the commit that widen the race window, because that commit belongs to a series that will also help fix the original bug. Cc: stable@kernel.org Fixes: 39eb456dacb5 ("function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack") Reported-by: NJoe Lawrence <joe.lawrence@redhat.com> Tested-by: NJoe Lawrence <joe.lawrence@redhat.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 06 12月, 2018 4 次提交
-
-
由 Steven Rostedt (VMware) 提交于
commit 7c6ea35ef50810aa12ab26f21cb858d980881576 upstream. The function graph profiler uses the ret_stack to store the "subtime" and reuse it by nested functions and also on the return. But the current logic has the profiler callback called before the ret_stack is updated, and it is just modifying the ret_stack that will later be allocated (it's just lucky that the "subtime" is not touched when it is allocated). This could also cause a crash if we are at the end of the ret_stack when this happens. By reversing the order of the allocating the ret_stack and then calling the callbacks attached to a function being traced, the ret_stack entry is no longer used before it is allocated. Cc: stable@kernel.org Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Steven Rostedt (VMware) 提交于
commit 552701dd0fa7c3d448142e87210590ba424694a0 upstream. In the past, curr_ret_stack had two functions. One was to denote the depth of the call graph, the other is to keep track of where on the ret_stack the data is used. Although they may be slightly related, there are two cases where they need to be used differently. The one case is that it keeps the ret_stack data from being corrupted by an interrupt coming in and overwriting the data still in use. The other is just to know where the depth of the stack currently is. The function profiler uses the ret_stack to save a "subtime" variable that is part of the data on the ret_stack. If curr_ret_stack is modified too early, then this variable can be corrupted. The "max_depth" option, when set to 1, will record the first functions going into the kernel. To see all top functions (when dealing with timings), the depth variable needs to be lowered before calling the return hook. But by lowering the curr_ret_stack, it makes the data on the ret_stack still being used by the return hook susceptible to being overwritten. Now that there's two variables to handle both cases (curr_ret_depth), we can move them to the locations where they can handle both cases. Cc: stable@kernel.org Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Steven Rostedt (VMware) 提交于
commit b1b35f2e218a5b57d03bbc3b0667d5064570dc60 upstream. The profiler uses trace->depth to find its entry on the ret_stack, but the depth may not match the actual location of where its entry is (if an interrupt were to preempt the processing of the profiler for another function, the depth and the curr_ret_stack will be different). Have it use the curr_ret_stack as the index to find its ret_stack entry instead of using the depth variable, as that is no longer guaranteed to be the same. Cc: stable@kernel.org Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Steven Rostedt (VMware) 提交于
commit 39eb456d upstream. Currently, the depth of the ret_stack is determined by curr_ret_stack index. The issue is that there's a race between setting of the curr_ret_stack and calling of the callback attached to the return of the function. Commit 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") moved the calling of the callback to after the setting of the curr_ret_stack, even stating that it was safe to do so, when in fact, it was the reason there was a barrier() there (yes, I should have commented that barrier()). Not only does the curr_ret_stack keep track of the current call graph depth, it also keeps the ret_stack content from being overwritten by new data. The function profiler, uses the "subtime" variable of ret_stack structure and by moving the curr_ret_stack, it allows for interrupts to use the same structure it was using, corrupting the data, and breaking the profiler. To fix this, there needs to be two variables to handle the call stack depth and the pointer to where the ret_stack is being used, as they need to change at two different locations. Cc: stable@kernel.org Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-