- 28 June 2015, 1 commit
-
-
Committed by Rusty Russell
As Dan Streetman points out, the entire point of the locking is to stop sysfs accesses, so they're elided entirely in the !SYSFS case.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 23 June 2015, 1 commit
-
-
Committed by Dan Streetman
Add a "param_lock" mutex to each module, and update params.c to use the correct built-in or module mutex while locking kernel params. Remove the kparam_block_sysfs_r/w() macros, replacing them with direct calls to kernel_param_[un]lock(module).

The kernel param code currently uses a single mutex to protect modification of any and all kernel params. While this generally works, there is one specific problem with it: a module callback function cannot safely load another module, i.e. with request_module() or even with indirect calls such as crypto_has_alg(). If the module to be loaded has any of its params configured (e.g. with an /etc/modprobe.d/* config file), then the attempt will result in a deadlock between the first module param callback waiting for modprobe, and modprobe trying to lock the single kernel param mutex to set the new module's param.

This fixes that by using per-module mutexes, so that each individual module is protected against concurrent changes in its own kernel params, but is not blocked by changes to other module params. All built-in modules continue to use the built-in mutex, since they will always be loaded at runtime and references (e.g. request_module(), crypto_has_alg()) to them will never cause load-time param changing.

This also simplifies the interface used by modules to block sysfs access to their params: while there are currently functions to block and unblock sysfs param access, split up by read and write and expecting a single kernel param to be passed, their actual operation is identical and applies to all params, not just the one passed to them -- they simply lock and unlock the global param mutex. They are replaced with direct calls to kernel_param_[un]lock(THIS_MODULE), which locks THIS_MODULE's param_lock, or, if the module is built-in, the built-in mutex.

Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 28 May 2015, 8 commits
-
-
Committed by Luis R. Rodriguez
There's no need to require an #ifdef over the declaration of sig_enforce, as IS_ENABLED() can be used. While at it, there's no harm in exposing this kernel parameter outside of CONFIG_MODULE_SIG, as it'd be a no-op on non-MODULE_SIG kernels.

Now, technically we should in theory be able to remove the #ifdef'ery over the declaration of the module parameter, as we are also trusting the bool_enable_only code for CONFIG_MODULE_SIG kernels, but for now remain paranoid and keep it. With time, if no one can put a bullet through bool_enable_only, and if there are no technical requirements against exposing CONFIG_MODULE_SIG_FORCE with the measures in place by bool_enable_only, we could remove this last #ifdef.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: cocci@systeme.lip6.fr
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
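The transformation being described amounts to roughly this (a sketch, not the full diff):

    /* Before: nested #ifdefs picked the default. */
    #ifdef CONFIG_MODULE_SIG_FORCE
    static bool sig_enforce = true;
    #else
    static bool sig_enforce = false;
    #endif

    /* After: IS_ENABLED() folds to a 0/1 compile-time constant, so the
     * declaration stands on its own in any kernel configuration. */
    static bool sig_enforce = IS_ENABLED(CONFIG_MODULE_SIG_FORCE);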
-
Committed by Luis R. Rodriguez
This takes the bool_enable_only implementation out of the module loading code and generalizes it so that others can make use of it.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: cocci@systeme.lip6.fr
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Luis R. Rodriguez
We're directly checking and modifying sig_enforce when needed instead of using the generic helpers. This prevents us from generalizing this helper so that others can use it. Use indirect helpers to allow us to generalize this code a bit and to make it a bit clearer what it is doing.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: cocci@systeme.lip6.fr
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Peter Zijlstra
__module_address() does an initial bound check before doing the {list/tree} iteration to find the actual module. The bound variables are nowhere near the mod_tree cacheline; in fact, they're nowhere near one another: module_addr_min lives in .data while module_addr_max lives in .bss (smarty-pants GCC thinks the explicit 0 assignment is a mistake).

Rectify this by moving the two variables into a structure together with the latch_tree_root, to guarantee they all share the same cacheline and avoid hitting two extra cachelines for the lookup.

While reworking the bounds code, move the bound update from allocation to insertion time; this avoids updating the bounds for a few error paths.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
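The structure ends up with roughly this shape (names per the commit text; field details are a sketch):

    static struct mod_tree_root {
            struct latch_tree_root root;    /* the latched RB-tree */
            unsigned long addr_min;         /* was module_addr_min, in .data */
            unsigned long addr_max;         /* was module_addr_max, in .bss  */
    } mod_tree __cacheline_aligned = {
            .addr_min = -1UL,
    };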
-
Committed by Peter Zijlstra
Use the generic __module_address() addr-to-struct-module lookup instead of open-coding it once more.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Peter Zijlstra
Andrew worried about the overhead on small systems; only use the fancy code when either perf or tracing is enabled.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Peter Zijlstra
Currently __module_address() uses a linear search through all modules in order to find the module corresponding to the provided address. With a lot of modules this can take a long time.

One of the users of this is kernel_text_address(), which is employed by many stack unwinders, which in turn are used by perf-callchain and ftrace (possibly from NMI context). So by optimizing __module_address() we optimize many stack unwinders, which are used by both perf and tracing in performance-sensitive code.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
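For reference, the linear lookup being replaced looks roughly like this (simplified sketch):

    /* O(number of modules) -- the cost the optimization removes. */
    struct module *__module_address(unsigned long addr)
    {
            struct module *mod;

            list_for_each_entry_rcu(mod, &modules, list) {
                    if (mod->state == MODULE_STATE_UNFORMED)
                            continue;
                    if (within_module(addr, mod))
                            return mod;
            }
            return NULL;
    }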
-
Committed by Peter Zijlstra
Currently the RCU usage in module is an inconsistent mess of RCU and RCU-sched; this is broken for CONFIG_PREEMPT, where synchronize_rcu() does not imply synchronize_sched(). Most usage sites use preempt_{dis,en}able(), which is RCU-sched, but (most of) the modification sites use synchronize_rcu() -- with the exception of the module bug list, which actually uses RCU.

Convert everything over to RCU-sched. Furthermore, add lockdep asserts to all sites, because it's not at all clear to me that the required locking is observed, especially on exported functions.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
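The read-side convention everything converges on, sketched (the wrapper is illustrative; it mirrors what is_module_address() does):

    static bool addr_in_some_module(unsigned long addr)
    {
            bool ret;

            /* preempt_disable() marks an RCU-sched read-side critical
             * section; writers pair with synchronize_sched(). */
            preempt_disable();
            ret = __module_address(addr) != NULL;
            preempt_enable();

            return ret;
    }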
-
- 27 May 2015, 1 commit
-
-
Committed by Peter Zijlstra
Due to the new lockdep checks in the coming patch, we go:

[ 9.759380] ------------[ cut here ]------------
[ 9.759389] WARNING: CPU: 31 PID: 597 at ../kernel/module.c:216 each_symbol_section+0x121/0x130()
[ 9.759391] Modules linked in:
[ 9.759393] CPU: 31 PID: 597 Comm: modprobe Not tainted 4.0.0-rc1+ #65
[ 9.759393] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 9.759396] ffffffff817d8676 ffff880424567ca8 ffffffff8157e98b 0000000000000001
[ 9.759398] 0000000000000000 ffff880424567ce8 ffffffff8105fbc7 ffff880424567cd8
[ 9.759400] 0000000000000000 ffffffff810ec160 ffff880424567d40 0000000000000000
[ 9.759400] Call Trace:
[ 9.759407] [<ffffffff8157e98b>] dump_stack+0x4f/0x7b
[ 9.759410] [<ffffffff8105fbc7>] warn_slowpath_common+0x97/0xe0
[ 9.759412] [<ffffffff810ec160>] ? section_objs+0x60/0x60
[ 9.759414] [<ffffffff8105fc2a>] warn_slowpath_null+0x1a/0x20
[ 9.759415] [<ffffffff810ed9c1>] each_symbol_section+0x121/0x130
[ 9.759417] [<ffffffff810eda01>] find_symbol+0x31/0x70
[ 9.759420] [<ffffffff810ef5bf>] load_module+0x20f/0x2660
[ 9.759422] [<ffffffff8104ef10>] ? __do_page_fault+0x190/0x4e0
[ 9.759426] [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
[ 9.759427] [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
[ 9.759433] [<ffffffff810ae73d>] ? trace_hardirqs_on_caller+0x11d/0x1e0
[ 9.759437] [<ffffffff812fcc0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 9.759439] [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
[ 9.759441] [<ffffffff810f1ade>] SyS_init_module+0xce/0x100
[ 9.759443] [<ffffffff81587429>] system_call_fastpath+0x12/0x17
[ 9.759445] ---[ end trace 9294429076a9c644 ]---

As per the comment, this site should be fine, but let's wrap it in preempt_disable() anyhow to placate lockdep.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 09 April 2015, 1 commit
-
-
Committed by Linus Torvalds
Unlike most (all?) other copies from user space, kernel module loading is almost unlimited in size, so we do a potentially huge copy_from_user() when we copy the module data from user space to the kernel buffer. That can be a latency concern when preemption is disabled (or voluntary). Also, because copy_from_user() clears the tail of the kernel buffer on failure, even a *failed* copy can end up wasting a lot of time.

Normally neither of these is a concern in real life, but they do trigger when doing stress-testing with trinity. Running in a VM seems to add its own overhead, causing trinity module-load testing to even trigger the watchdog.

The simple fix is to just chunk up the module loading, so that it never tries to copy insanely big areas in one go. That bounds the latency, and also the amount of (unnecessarily, in this case) cleared memory for the failure case.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
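The chunking is essentially this (sketch; the chunk-size constant is an assumption for illustration):

    #define COPY_CHUNK_SIZE (16 * PAGE_SIZE)    /* illustrative bound */

    static int copy_chunked_from_user(void *dst, const void __user *usrc,
                                      unsigned long len)
    {
            do {
                    unsigned long n = min(len, (unsigned long)COPY_CHUNK_SIZE);

                    if (copy_from_user(dst, usrc, n) != 0)
                            return -EFAULT;
                    cond_resched();         /* bound the latency */
                    dst += n;
                    usrc += n;
                    len -= n;
            } while (len);

            return 0;
    }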
-
- 08 April 2015, 1 commit
-
-
Committed by Steven Rostedt (Red Hat)
Update the infrastructure such that modules that declare TRACE_DEFINE_ENUM() will have those enums converted into their values in the tracepoint print fmt strings.

Link: http://lkml.kernel.org/r/87vbhjp74q.fsf@rustcorp.com.au
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
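Usage, as a hedged sketch (the enum values are hypothetical): a module declares its enums in its trace header so the tracing core can substitute their numeric values into the print fmt:

    enum my_mod_state {             /* hypothetical enum */
            MY_MOD_IDLE,
            MY_MOD_BUSY,
    };

    TRACE_DEFINE_ENUM(MY_MOD_IDLE);
    TRACE_DEFINE_ENUM(MY_MOD_BUSY);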
-
- 24 March 2015, 2 commits
-
-
Committed by Kirill A. Shutemov
init_module(2) passes the user-specified buffer length directly to vmalloc(), which makes warn_alloc_failed() print a lot of info into dmesg if the user specified an insane size, like -1. Let's silence the warning: it doesn't add much value beyond the -ENOMEM return code. Without the patch, the syscall is prohibitively noisy for testing with trinity.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
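The shape of the fix, sketched against the era's three-argument __vmalloc() (the wrapper name is illustrative):

    static void *module_copy_buffer(unsigned long len)
    {
            /* A bogus user-supplied length (e.g. (unsigned long)-1) now
             * just yields NULL/-ENOMEM instead of a dmesg warning splat. */
            return __vmalloc(len, GFP_KERNEL | __GFP_HIGHMEM | __GFP_NOWARN,
                             PAGE_KERNEL);
    }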
-
Committed by Yannick Guerrini
Fix typos in the pr_warn message about unused symbols.

Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 23 March 2015, 1 commit
-
-
Committed by Peter Zijlstra
Module unload calls lockdep_free_key_range(), which removes entries from the data structures. Most of the lockdep code, OTOH, assumes the data structures are append-only; in specific, see the comments in add_lock_to_list() and look_up_lock_class(). Clearly this has only worked by accident; make it work properly.

The actual scenario to make it go boom would involve the memory freed by the module unload being re-allocated and re-used for a lock inside of an RCU-sched grace period. This is a very unlikely scenario; still, better plug the hole.

Use RCU list iteration in all places and amend the comments. Change lockdep_free_key_range() to issue a sync_sched() between removal from the lists and returning -- which results in the memory being freed. Further ensure the callers are placed correctly and comment the requirements.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Tsyvarev <tsyvarev@ispras.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 13 March 2015, 1 commit
-
-
Committed by Andrey Ryabinin
The current approach to handling shadow memory for modules is broken: shadow memory may be freed only after the memory it shadows is no longer used. vfree() called from interrupt context could use the memory it's freeing to store a 'struct llist_node' in it:

    void vfree(const void *addr)
    {
    ...
            if (unlikely(in_interrupt())) {
                    struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
                    if (llist_add((struct llist_node *)addr, &p->list))
                            schedule_work(&p->wq);

Later this list node is used in free_work(), which actually frees the memory. So module_memfree() called in interrupt context will currently free the shadow before freeing the module's memory, which could provoke a kernel crash.

Thus shadow memory should be freed after the module's memory. However, such a deallocation order could race with kasan_module_alloc() in module_alloc(). So free the shadow right before releasing the vm area: at this point the vfree()'d memory is no longer used, yet is not available for other allocations. A new VM_KASAN flag is used to indicate that a vm area has dynamically allocated shadow memory, so KASAN frees the shadow only if it was previously allocated.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 06 March 2015, 1 commit
-
-
Committed by Laura Abbott
When CONFIG_DEBUG_SET_MODULE_RONX is enabled, the sizes of module sections are aligned up so appropriate permissions can be applied. Adjusting for the symbol table may cause them to become unaligned; make sure to re-align the sizes afterward.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
- 18 February 2015, 1 commit
-
-
Committed by Jan Kiszka
This provides a reliable breakpoint target, required for automatic symbol loading via the gdb helper command 'lx-symbols'.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 14 February 2015, 1 commit
-
-
Committed by Andrey Ryabinin
This feature lets us detect out-of-bounds accesses to global variables. It works for globals in the kernel image as well as for globals in modules. Currently it won't work for symbols in user-specified sections (e.g. __init, __read_mostly, ...).

The idea is simple: the compiler pads each global variable with a redzone and adds constructors invoking the __asan_register_globals() function. Information about each global variable (address, size, size with redzone, ...) is passed to __asan_register_globals() so we can poison the variable's redzone.

This patch also forces module_alloc() to return 8*PAGE_SIZE-aligned addresses, making shadow memory handling (kasan_module_alloc()/kasan_module_free()) simpler. Such alignment guarantees that each shadow page backing the modules' address space corresponds to only one module_alloc() allocation.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 11 February 2015, 2 commits
-
-
Committed by Peter Zijlstra
Since the introduction of the nested-sleep warning, we've established that the occasional sleep inside a wait_event() is fine: wait_event() loops are invariant wrt. spurious wakeups, and the occasional sleep has a similar effect on them. As long as it's occasional, it's harmless.

Therefore replace the 'correct' but verbose wait_woken() thing with a simple annotation to shut up the warning.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Peter Zijlstra
Because wait_event() loops are safe vs. spurious wakeups, we can allow the occasional sleep -- which ends up being very similar.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
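Together with the previous commit, the blessed pattern looks roughly like this (names follow kernel/module.c; the condition is simplified):

    static bool finished_loading(const char *name)
    {
            struct module *mod;
            bool ret;

            /* Tell the nested-sleep checker the mutex below is fine:
             * wait_event() loops tolerate spurious wakeups anyway. */
            sched_annotate_sleep();
            mutex_lock(&module_mutex);
            mod = find_module_all(name, strlen(name), true);
            ret = !mod || mod->state == MODULE_STATE_LIVE;
            mutex_unlock(&module_mutex);

            return ret;
    }

    /* caller: */
    err = wait_event_interruptible(module_wq, finished_loading(name));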
-
- 06 February 2015, 2 commits
-
-
Committed by Marcel Holtmann
The warning message printed when loading modules with a wrong signature has a stray double space in it: "module verification failed: signature and/or required key missing".

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Andrey Tsyvarev
parse_args() calls the module parameters' .set handlers, which may use locks defined in the module. So these lock classes should be freed in case parse_args() returns an error (e.g. due to an incorrect parameter being passed).

Signed-off-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 22 January 2015, 1 commit
-
-
Committed by Rusty Russell
James Bottomley points out that it will be -1 during unload. It's only used for diagnostics, so let's not hide that, as it could be a clue as to what's gone wrong.

Cc: Jason Wessel <jason.wessel@windriver.com>
Acked-and-documentation-added-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 20 January 2015, 3 commits
-
-
Committed by Rusty Russell
The kallsyms routines (module_symbol_name, lookup_module_* etc) disable preemption to walk the modules rather than taking the module_mutex; this is because they are used for symbol resolution during oopses. It works because there are synchronize_sched() and synchronize_rcu() calls in the unload and failure paths.

However, there's one case which doesn't have that: the normal case where module loading succeeds and we free the init section. We don't want a synchronize_rcu() there, because it would slow down module loading; this bug was introduced in 2009 to speed up module loading in the first place.

Thus, we want to do the free in an RCU callback. We do this in the simplest possible way by allocating a new rcu_head: if we put it in the module structure, we'd have to worry about that getting freed.

Reported-by: Rui Xiang <rui.xiang@huawei.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
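The mechanism, sketched from the description above (the variable name freeinit is illustrative):

    struct mod_initfree {
            struct rcu_head rcu;    /* standalone, so the module struct
                                     * itself need not stay around */
            void *module_init;
    };

    static void do_free_init(struct rcu_head *head)
    {
            struct mod_initfree *m = container_of(head, struct mod_initfree, rcu);

            module_memfree(m->module_init);
            kfree(m);
    }

    /* in the load-success path, instead of freeing synchronously: */
    call_rcu_sched(&freeinit->rcu, do_free_init);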
-
Committed by Rusty Russell
Nothing needs the module pointer any more, and the next patch will call it from RCU, where the module itself might no longer exist. Removing the arg is the safest approach. This just codifies the use of the module_alloc/module_free pattern which ftrace and bpf use.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: x86@kernel.org
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: netdev@vger.kernel.org
-
Committed by Rusty Russell
Archs have been abusing module_free() to clean up their arch-specific allocations. Since module_free() is also (ab)used by BPF and trace code, let's keep it to simple allocations, and provide a hook called before that. This means that avr32, ia64, parisc and s390 no longer need to implement their own module_free() at all; avr32 doesn't need module_finalize() either.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-s390@vger.kernel.org
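The new hook is, roughly, a weak no-op that archs override (a sketch of the shape described):

    /* Called before module_memfree(); archs that make extra per-module
     * allocations (e.g. for unwind tables) clean them up here. */
    void __weak module_arch_freeing_init(struct module *mod)
    {
    }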
-
- 11 November 2014, 6 commits
-
-
Committed by Ionut Alexa
Fixed coding style errors and warnings: changed printk() to pr_debug()/pr_warn() and seq_printf() to seq_puts().

Signed-off-by: Ionut Alexa <ionut.m.alexa@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed bogus KERN_DEFAULT conversion)
-
Committed by Masami Hiramatsu
Remove stop_machine() from module unloading by adding a new reference-counting algorithm. This atomic refcounter works like a semaphore: it can be gotten (incremented) only while the counter is non-zero. When loading a module, the module subsystem sets the counter to MODULE_REF_BASE (= 1), and when unloading the module, it subtracts MODULE_REF_BASE from the counter. If no one refers to the module, the refcounter becomes 0 and we can remove the module safely. If someone does refer to it, we try to recover the counter by adding MODULE_REF_BASE back, unless the counter has become 0 in the meantime, because the referrer can put the module right before the recovery. If the recovery fails, the refcount is 0 and can never be incremented again, so the module can be removed safely too.

Note that __module_get() forcibly gets the module refcounter; users should use try_module_get() instead.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
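A sketch of the algorithm as described (function names are illustrative; error handling elided):

    #define MODULE_REF_BASE 1

    /* The semaphore-like "get": succeeds only while non-zero. */
    static bool try_get_module_ref(struct module *mod)
    {
            return atomic_inc_not_zero(&mod->refcnt);
    }

    /* Unload: drop the base reference; if others remain, try to put it
     * back. Returns 0 when the module is safe to remove. */
    static int try_release_module_ref(struct module *mod)
    {
            int ret;

            ret = atomic_sub_return(MODULE_REF_BASE, &mod->refcnt);
            if (ret)
                    ret = atomic_add_unless(&mod->refcnt, MODULE_REF_BASE, 0);

            return ret;
    }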
-
Committed by Masami Hiramatsu
Replace the complex per-cpu module_ref reference counter with a simple atomic_t refcnt. This is for code simplification.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Masami Hiramatsu
Actually, since module_bug_list should be used in BUG context, we may not need this. But for anyone who wants to use it from normal context, this makes module_bug_list an RCU list.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Masami Hiramatsu
Unlink the module from the module list with RCU synchronization instead of using stop_machine(). Since the module list is already protected by RCU, we don't need stop_machine() anymore.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
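Sketch of the replacement:

    static void unlink_module(struct module *mod)
    {
            mutex_lock(&module_mutex);
            /* Before: stop_machine(__unlink_module, mod, NULL); */
            list_del_rcu(&mod->list);
            mutex_unlock(&module_mutex);

            /* Readers walk the list under preempt_disable() (RCU-sched),
             * so wait one grace period before the memory can go away. */
            synchronize_sched();
    }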
-
Committed by Masami Hiramatsu
Wait for RCU synchronization on the failure path of module loading before releasing the struct module, because the memory of mod->list can still be accessed by list walkers (e.g. kallsyms).

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 28 October 2014, 1 commit
-
-
Committed by Peter Zijlstra
This is a genuine bug in add_unformed_module(): we cannot use blocking primitives inside a wait loop. So rewrite the wait_event_interruptible() usage to use the fresh wait_woken() stuff.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: tglx@linutronix.de
Cc: ilya.dryomov@inktank.com
Cc: umgwanakikbuti@gmail.com
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: oleg@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20140924082242.458562904@infradead.org
[ So this is probably complex to backport and the race wasn't reported AFAIK, so not marked for -stable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
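The wait_woken() pattern adopted here, sketched (the condition helper follows the module code; signal handling is simplified away):

    static void wait_finished_loading(const char *name)
    {
            DEFINE_WAIT_FUNC(wait, woken_wake_function);

            add_wait_queue(&module_wq, &wait);
            while (!finished_loading(name))
                    wait_woken(&wait, TASK_UNINTERRUPTIBLE,
                               MAX_SCHEDULE_TIMEOUT);
            remove_wait_queue(&module_wq, &wait);
    }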
-
- 15 October 2014, 1 commit
-
-
Committed by Prarit Bhargava
A panic was seen in the following situation. There are two threads running on the system: the first is a system monitoring thread reading /proc/modules, and the second is loading and unloading a module (in this example my simple dummy-module.ko; in the "real world" this occurred with the qlogic driver module). When doing this, the following panic occurred:

------------[ cut here ]------------
kernel BUG at kernel/module.c:3739!
invalid opcode: 0000 [#1] SMP
Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
CPU: 37 PID: 186343 Comm: cat Tainted: GF O-------------- 3.10.0+ #7
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
RIP: 0010:[<ffffffff810d64c5>] [<ffffffff810d64c5>] module_flags+0xb5/0xc0
RSP: 0018:ffff88080fa7fe18 EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
FS: 00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
 ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
 ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
Call Trace:
 [<ffffffff810d666c>] m_show+0x19c/0x1e0
 [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
 [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
 [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
 [<ffffffff811c1a58>] SyS_read+0x58/0xb0
 [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
RIP [<ffffffff810d64c5>] module_flags+0xb5/0xc0
RSP <ffff88080fa7fe18>

Consider the two processes running on the system:

CPU 0 (/proc/modules reader)
CPU 1 (loading/unloading module)

CPU 0 opens /proc/modules and starts displaying data for each module by traversing the modules list via fs/seq_file.c:seq_open() and fs/seq_file.c:seq_read(). For each module in the modules list, seq_read does:

    op->start() <-- this is a pointer to m_start()
    op->show()  <-- this is a pointer to m_show()
    op->stop()  <-- this is a pointer to m_stop()

The m_start(), m_show(), and m_stop() module functions are defined in kernel/module.c. The m_start() and m_stop() functions acquire and release the module_mutex respectively; i.e., when reading /proc/modules, the module_mutex is acquired and released for each module.

m_show() is called with the module_mutex held. It accesses the module struct data and attempts to write out module data. It is in this code path that the above BUG_ON() warning is encountered; specifically, m_show() calls:

    static char *module_flags(struct module *mod, char *buf)
    {
            int bx = 0;

            BUG_ON(mod->state == MODULE_STATE_UNFORMED);
    ...

The other thread, CPU 1, in unloading the module calls the syscall delete_module() defined in kernel/module.c. The module_mutex is acquired for a short time, and then released. free_module() is called without the module_mutex. free_module() then sets mod->state = MODULE_STATE_UNFORMED, also without the module_mutex. Some additional code is called, and then the module_mutex is reacquired to remove the module from the modules list:

    /* Now we can delete it from the lists */
    mutex_lock(&module_mutex);
    stop_machine(__unlink_module, mod, NULL);
    mutex_unlock(&module_mutex);

This is the sequence of events that leads to the panic. CPU 1 is removing dummy_module via delete_module(). It acquires the module_mutex and then releases it; CPU 1 has NOT yet set dummy_module->state to MODULE_STATE_UNFORMED. CPU 0, which is reading /proc/modules, acquires the module_mutex and acquires a pointer to dummy_module, which is still in the modules list. CPU 0 calls m_show() for dummy_module; the check in m_show() for MODULE_STATE_UNFORMED passes for dummy_module even though it is being torn down. Meanwhile CPU 1, which has been continuing to remove dummy_module without holding the module_mutex, now calls free_module() and sets dummy_module->state to MODULE_STATE_UNFORMED. CPU 0 then calls module_flags() with dummy_module, hits the BUG_ON() above, and BOOM.

Acquire and release the module_mutex lock around the setting of MODULE_STATE_UNFORMED in the teardown path, which should resolve the problem.

Testing: in the unpatched kernel I can panic the system within 1 minute by doing

    while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done

and

    while (true) do cat /proc/modules; done

in separate terminals. In the patched kernel I was able to run just over one hour without seeing any issues. I also verified the output of panic via sysrq-c, and the output of /proc/modules looks correct for all three states of the dummy_module:

    dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
    dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
    dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
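The fix itself is small; sketched:

    /* In the teardown path: the module stays on the list to prevent
     * duplicate loads, but the state change must be made visible under
     * the same lock that /proc/modules readers hold. */
    mutex_lock(&module_mutex);
    mod->state = MODULE_STATE_UNFORMED;
    mutex_unlock(&module_mutex);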
-
- 03 October 2014, 1 commit
-
-
Committed by Kyle McMartin
Similar to ARM, AArch64 is generating $x and $d syms... which isn't terribly helpful when looking at %pF output and the like. Filter those out in kallsyms, modpost and when looking at module symbols. Since none of these check EM_ARM anyway, it seems simplest to just add it to the strchr() used, rather than trying to make things overly complicated.

initcall_debug improves:

dmesg_before.txt: initcall $x+0x0/0x154 [sg] returned 0 after 26331 usecs
dmesg_after.txt: initcall init_sg+0x0/0x154 [sg] returned 0 after 15461 usecs

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
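The resulting filter is a small extension of the existing mapping-symbol check, roughly:

    /* '$a'/'$t'/'$d' are the ARM mapping symbols; AArch64 emits '$x'
     * and '$d' -- adding 'x' to the set covers both architectures. */
    static inline int is_arm_mapping_symbol(const char *str)
    {
            return str[0] == '$' && strchr("axtd", str[1]) &&
                   (str[2] == '\0' || str[2] == '.');
    }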
-
- 27 August 2014, 1 commit
-
-
Committed by Jani Nikula
Make it clear this is about kernel_param_ops, not kernel_param (which will soon have a flags field of its own). No functional changes.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Jon Mason <jon.mason@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 16 August 2014, 1 commit
-
-
Committed by Andy Lutomirski
Commit 4982223e ("module: set nx before marking module MODULE_STATE_COMING") introduced a regression: if a module fails to parse its arguments or if mod_sysfs_setup() fails, then the module's memory will be freed while still read-only. Anything that reuses that memory will crash as soon as it tries to write to it.

Cc: stable@vger.kernel.org # v3.16
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 27 July 2014, 1 commit
-
-
Committed by Russell King
Symbols starting with .L are ELF local symbols and should not appear in ELF symbol tables. However, unfortunately ARM binutils leaks the .LANCHOR symbols into the symbol table, which leads kallsyms to report these symbols rather than the real name. It is not very useful when %pf reports symbols against these leaked .LANCHOR symbols. Arrange for kallsyms to ignore these symbols using the same mechanism that is used for the ARM mapping symbols.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-