- 09 3月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
Workqueues are used extensively throughout the kernel but sometimes it's difficult to debug stalls involving work items because visibility into its inner workings is fairly limited. Although sysrq-t task dump annotates each active worker task with the information on the work item being executed, it is challenging to find out which work items are pending or delayed on which queues and how pools are being managed. This patch implements show_workqueue_state() which dumps all busy workqueues and pools and is called from the sysrq-t handler. At the end of sysrq-t dump, something like the following is printed. Showing busy workqueues and worker pools: ... workqueue filler_wq: flags=0x0 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256 in-flight: 491:filler_workfn, 507:filler_workfn pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 in-flight: 501:filler_workfn pending: filler_workfn ... workqueue test_wq: flags=0x8 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1 in-flight: 510(RESCUER):test_workfn BAR(69) BAR(500) delayed: test_workfn1 BAR(492), test_workfn2 ... pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 137 pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=3 manager: 469 pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 idle: 16 pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 62 The above shows that test_wq is executing test_workfn() on pid 510 which is the rescuer and also that there are two tasks 69 and 500 waiting for the work item to finish in flush_work(). As test_wq has max_active of 1, there are two work items for test_workfn1() and test_workfn2() which are delayed till the current work item is finished. In addition, pid 492 is flushing test_workfn1(). The work item for test_workfn() is being executed on pwq of pool 2 which is the normal priority per-cpu pool for CPU 1. The pool has three workers, two of which are executing filler_workfn() for filler_wq and the last one is assuming the manager role trying to create more workers. This extra workqueue state dump will hopefully help chasing down hangs involving workqueues. v3: cpulist_pr_cont() replaced with "%*pbl" printf formatting. v2: As suggested by Andrew, minor formatting change in pr_cont_work(), printk()'s replaced with pr_info()'s, and cpumask printing now uses cpulist_pr_cont(). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> CC: Ingo Molnar <mingo@redhat.com>
-
- 12 2月, 2015 2 次提交
-
-
由 Michal Hocko 提交于
Commit 5695be14 ("OOM, PM: OOM killed task shouldn't escape PM suspend") has left a race window when OOM killer manages to note_oom_kill after freeze_processes checks the counter. The race window is quite small and really unlikely and partial solution deemed sufficient at the time of submission. Tejun wasn't happy about this partial solution though and insisted on a full solution. That requires the full OOM and freezer's task freezing exclusion, though. This is done by this patch which introduces oom_sem RW lock and turns oom_killer_disable() into a full OOM barrier. oom_killer_disabled check is moved from the allocation path to the OOM level and we take oom_sem for reading for both the check and the whole OOM invocation. oom_killer_disable() takes oom_sem for writing so it waits for all currently running OOM killer invocations. Then it disable all the further OOMs by setting oom_killer_disabled and checks for any oom victims. Victims are counted via mark_tsk_oom_victim resp. unmark_oom_victim. The last victim wakes up all waiters enqueued by oom_killer_disable(). Therefore this function acts as the full OOM barrier. The page fault path is covered now as well although it was assumed to be safe before. As per Tejun, "We used to have freezing points deep in file system code which may be reacheable from page fault." so it would be better and more robust to not rely on freezing points here. Same applies to the memcg OOM killer. out_of_memory tells the caller whether the OOM was allowed to trigger and the callers are supposed to handle the situation. The page allocation path simply fails the allocation same as before. The page fault path will retry the fault (more on that later) and Sysrq OOM trigger will simply complain to the log. Normally there wouldn't be any unfrozen user tasks after try_to_freeze_tasks so the function will not block. But if there was an OOM killer racing with try_to_freeze_tasks and the OOM victim didn't finish yet then we have to wait for it. This should complete in a finite time, though, because - the victim cannot loop in the page fault handler (it would die on the way out from the exception) - it cannot loop in the page allocator because all the further allocation would fail and __GFP_NOFAIL allocations are not acceptable at this stage - it shouldn't be blocked on any locks held by frozen tasks (try_to_freeze expects lockless context) and kernel threads and work queues are not frozen yet Signed-off-by: NMichal Hocko <mhocko@suse.cz> Suggested-by: NTejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
While touching this area let's convert printk to pr_*. This also makes the printing of continuation lines done properly. Signed-off-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NTejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 8月, 2014 1 次提交
-
-
由 David Rientjes 提交于
With memoryless node support being worked on, it's possible that for optimizations that a node may not have a non-NULL zonelist. When CONFIG_NUMA is enabled and node 0 is memoryless, this means the zonelist for first_online_node may become NULL. The oom killer requires a zonelist that includes all memory zones for the sysrq trigger and pagefault out of memory handler. Ensure that a non-NULL zonelist is always passed to the oom killer. [akpm@linux-foundation.org: fix non-numa build] Signed-off-by: NDavid Rientjes <rientjes@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 6月, 2014 2 次提交
-
-
由 Rik van Riel 提交于
Some sysrq handlers can run for a long time, because they dump a lot of data onto a serial console. Having RCU stall warnings pop up in the middle of them only makes the problem worse. This patch temporarily disables RCU stall warnings while a sysrq request is handled. Signed-off-by: NRik van Riel <riel@redhat.com> Suggested-by: NPaul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Madper Xie <cxie@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rik van Riel 提交于
Echoing values into /proc/sysrq-trigger seems to be a popular way to get information out of the kernel. However, dumping information about thousands of processes, or hundreds of CPUs to serial console can result in IRQs being blocked for minutes, resulting in various kinds of cascade failures. The most common failure is due to interrupts being blocked for a very long time. This can lead to things like failed IO requests, and other things the system cannot easily recover from. This problem is easily fixable by making __handle_sysrq use RCU instead of spin_lock_irqsave. This leaves the warning that RCU grace periods have not elapsed for a long time, but the system will come back from that automatically. It also leaves sysrq-from-irq-context when the sysrq keys are pressed, but that is probably desired since people want that to work in situations where the system is already hosed. The callers of register_sysrq_key and unregister_sysrq_key appear to be capable of sleeping. Signed-off-by: NRik van Riel <riel@redhat.com> Reported-by: NMadper Xie <cxie@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 6月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
... instead of naked numbers. Stuff in sysrq.c used to set it to 8 which is supposed to mean above default level so set it to DEBUG instead as we're terminating/killing all tasks and we want to be verbose there. Also, correct the check in x86_64_start_kernel which should be >= as we're clearly issuing the string there for all debug levels, not only the magical 10. Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NKees Cook <keescook@chromium.org> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Joe Perches <joe@perches.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2013 1 次提交
-
-
由 Ben Hutchings 提交于
Turn the initial value of sysctl kernel.sysrq (SYSRQ_DEFAULT_ENABLE) into a Kconfig variable. Original version by Bastian Blank <waldi@debian.org>. Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 13 8月, 2013 1 次提交
-
-
由 Mathieu J. Poirier 提交于
Adding a simple device tree binding for the specification of key sequences. Definition of the keys found in the sequence are located in 'include/uapi/linux/input.h'. For the sysrq driver, holding the sequence of keys down for a specific amount of time will reset the system. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NGrant Likely <grant.likely@linaro.org> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
-
- 07 6月, 2013 1 次提交
-
-
由 Mathieu J. Poirier 提交于
Attempt to reboot the system gracefully when a key combo is detected. If the reste combination is pressed the 2nd time we assume that graceful reboot failed and perform emergency reboot. This fucntionality is useful when UI is stuck but the system is otherwise working fine. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
-
- 04 6月, 2013 1 次提交
-
-
由 Jingoo Han 提交于
The usage of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used. Signed-off-by: NJingoo Han <jg1.han@samsung.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 02 4月, 2013 1 次提交
-
-
由 Mathieu J. Poirier 提交于
Some devices have too few buttons, which it makes it hard to have a reset combo that won't trigger automatically. As such a timeout functionality that requires the combination to be held for a given amount of time before triggering is introduced. If a key combo is recognized and held for a 'timeout' amount of time, the system triggers a reset. If the timeout value is omitted the driver simply ignores the functionality. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
-
- 16 3月, 2013 1 次提交
-
-
由 zhangwei(Jovi) 提交于
Currently help message of /proc/sysrq-trigger highlight its upper-case characters, like below: SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) ... this would confuse user trigger sysrq by upper-case character, which is inconsistent with the real lower-case character registed key. This inconsistent help message will also lead more confused when 26 upper-case letters put into use in future. This patch fix it. Thanks the comments from Andrew and Randy. Signed-off-by: Nzhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 28 2月, 2013 1 次提交
-
-
由 Linus Torvalds 提交于
When taking an address of an extern array, gcc quite naturally should be able to say "an address of an object can never be NULL" and just optimize away the test entirely. However, the new alternate sysrq reset code (commit 154b7a48: "Input: sysrq - allow specifying alternate reset sequence") did exactly that, and declared platform_sysrq_reset_seq[] as a weak array, and expecting that testing the address of the array would show whether it actually got linked against something or not. And that doesn't work with all gcc versions. Clearly it works with *some* versions of gcc, and maybe it's even supposed to work, but it really is a very fragile concept. So instead of testing the address of the weak variable, just create a weak instance of that array that is empty. If some platform then has a real platform_sysrq_reset_seq[] that overrides our weak one, the linker will switch to that one, and it all works without any run-time conditionals at all. Reported-by: NDave Airlie <airlied@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 2月, 2013 1 次提交
-
-
由 Clark Williams 提交于
Move rt scheduler definitions out of include/linux/sched.h into new file include/linux/sched/rt.h Signed-off-by: NClark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lanSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 17 1月, 2013 1 次提交
-
-
由 Mathieu Poirier 提交于
This patch adds keyreset functionality to the sysrq driver. It allows certain button/key combinations to be used in order to trigger emergency reboots. Redefining the '__weak platform_sysrq_reset_seq' variable is required to trigger the feature. Alternatively keys can be passed to the driver via a module parameter. This functionality comes from the keyreset driver submitted by Arve Hjønnevåg in the Android kernel. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
-
- 16 11月, 2012 1 次提交
-
-
由 David Rientjes 提交于
With hotpluggable and memoryless nodes, it's possible that node 0 will not be online, so use the first online node's zonelist rather than hardcoding node 0 to pass a zonelist with all zones to the oom killer. Signed-off-by: NDavid Rientjes <rientjes@google.com> Reviewed-by: NMichal Hocko <mhocko@suse.cz> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 17 10月, 2012 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 4月, 2012 1 次提交
-
-
由 Anton Vorontsov 提交于
Change send_sig_all() to use do_send_sig_info(SEND_SIG_FORCED) instead of force_sig(SIGKILL). With the recent changes we do not need force_ to kill the CLONE_NEWPID tasks. And this is more correct. force_sig() can race with the exiting thread, while do_send_sig_info(group => true) kill the whole process. Some more notes from Oleg Nesterov: > Just one note. This change makes no difference for sysrq_handle_kill(). > But it obviously changes the behaviour sysrq_handle_term(). I think > this is fine, if you want to really kill the task which blocks/ignores > SIGTERM you can use sysrq_handle_kill(). > > Even ignoring the reasons why force_sig() is simply wrong here, > force_sig(SIGTERM) looks strange. The task won't be killed if it has > a handler, but SIG_IGN can't help. However if it has the handler > but blocks SIGTERM temporary (this is very common) it will be killed. Also, > force_sig() can't kill the process if the main thread has already > exited. IOW, it is trivial to create the process which can't be > killed by sysrq. So, this patch fixes the issue. Suggested-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NAnton Vorontsov <anton.vorontsov@linaro.org> Cc: Alan Cox <alan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 3月, 2012 1 次提交
-
-
由 David Rientjes 提交于
The oom killer chooses not to kill a thread if: - an eligible thread has already been oom killed and has yet to exit, and - an eligible thread is exiting but has yet to free all its memory and is not the thread attempting to currently allocate memory. SysRq+F manually invokes the global oom killer to kill a memory-hogging task. This is normally done as a last resort to free memory when no progress is being made or to test the oom killer itself. For both uses, we always want to kill a thread and never defer. This patch causes SysRq+F to always kill an eligible thread and can be used to force a kill even if another oom killed thread has failed to exit. Signed-off-by: NDavid Rientjes <rientjes@google.com> Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: NPekka Enberg <penberg@kernel.org> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 3月, 2012 1 次提交
-
-
由 Alan Cox 提交于
Keyboard struct lifetime is easy, but the locking is not and is completely ignored by the existing code. Tackle this one head on - Make the kbd_table private so we can run down all direct users - Hoick the relevant ioctl handlers into the keyboard layer - Lock them with the keyboard lock so they don't change mid keypress - Add helpers for things like console stop/start so we isolate the poking around properly - Tweak the braille console so it still builds There are a couple of FIXME locking cases left for ioctls that are so hideous they should be addressed in a later patch. After this patch the kbd_table is private and all the keyboard jiggery pokery is in one place. This update fixes speakup and also a memory leak in the original. Signed-off-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 2月, 2012 2 次提交
-
-
由 Anton Vorontsov 提交于
There's a real possibility of killing kernel threads that might have issued use_mm(), so kthread's mm might become non-NULL. This patch fixes the issue by checking for PF_KTHREAD (just as get_task_mm()). Suggested-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NAnton Vorontsov <anton.vorontsov@linaro.org> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Anton Vorontsov 提交于
sysrq should grab the tasklist lock, otherwise calling force_sig() is not safe, as it might race with exiting task, which ->sighand might be set to NULL already. Signed-off-by: NAnton Vorontsov <anton.vorontsov@linaro.org> Acked-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 04 1月, 2012 1 次提交
-
-
由 Al Viro 提交于
Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export kill_bdev as well, so brd doesn't have to open code it. Reduce buffer_head.h requirement accordingly. Removed a rather large comment from invalidate_bdev, as it looked a bit obsolete to bother moving. The small comment replacing it says enough. Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 25 3月, 2011 1 次提交
-
-
由 David Rientjes 提交于
Commit ddd588b5 ("oom: suppress nodes that are not allowed from meminfo on oom kill") moved lib/show_mem.o out of lib/lib.a, which resulted in build warnings on all architectures that implement their own versions of show_mem(): lib/lib.a(show_mem.o): In function `show_mem': show_mem.c:(.text+0x1f4): multiple definition of `show_mem' arch/sparc/mm/built-in.o:(.text+0xd70): first defined here The fix is to remove __show_mem() and add its argument to show_mem() in all implementations to prevent this breakage. Architectures that implement their own show_mem() actually don't do anything with the argument yet, but they could be made to filter nodes that aren't allowed in the current context in the future just like the generic implementation. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Reported-by: NJames Bottomley <James.Bottomley@hansenpartnership.com> Suggested-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 11月, 2010 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
The tty code should be in its own subdirectory and not in the char driver with all of the cruft that is currently there. Based on work done by Arnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 15 10月, 2010 1 次提交
-
-
由 Arnd Bergmann 提交于
All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
-
- 30 9月, 2010 1 次提交
-
-
由 Dmitry Torokhov 提交于
Similarly to the keyboard handler, we are called by different input devices and thus need to add spinlock if we want to maintain our state properly. Signed-off-by: NDmitry Torokhov <dtor@mail.ru>
-
- 21 8月, 2010 1 次提交
-
-
由 Dmitry Torokhov 提交于
Sysrq operations do not accept tty argument anymore so no need to pass it to us. [Stephen Rothwell <sfr@canb.auug.org.au>: fix build breakage in drm code caused by sysrq using bool but not including linux/types.h] [Sachin Sant <sachinp@in.ibm.com>: fix build breakage in s390 keyboadr driver] Acked-by: NAlan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: NJason Wessel <jason.wessel@windriver.com> Acked-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NDmitry Torokhov <dtor@mail.ru>
-
- 20 8月, 2010 1 次提交
-
-
由 Dmitry Torokhov 提交于
Noone is using tty argument so let's get rid of it. Acked-by: NAlan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: NJason Wessel <jason.wessel@windriver.com> Acked-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NDmitry Torokhov <dtor@mail.ru>
-
- 22 7月, 2010 1 次提交
-
-
由 Jason Wessel 提交于
The kdb code should not toggle the sysrq state in case an end user wants to try and resume the normal kernel execution. Signed-off-by: NJason Wessel <jason.wessel@windriver.com> Acked-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
-
- 09 6月, 2010 1 次提交
-
-
由 Dmitry Torokhov 提交于
This shoud fix the problem with SysRq mode staying half-way enabled and interfereing with normal PrtScrn operation after user presses ALT for the first time. Reported-and-tested-by: NÉric Piel <E.A.B.Piel@tudelft.nl> Signed-off-by: NDmitry Torokhov <dtor@mail.ru>
-
- 22 4月, 2010 1 次提交
-
-
由 Frederic Weisbecker 提交于
The ftrace_dump_on_oops kernel parameter, sysctl and sysrq let one dump every cpu buffers when an oops or panic happens. It's nice when you have few cpus but it may take ages if have many, plus you miss the real origin of the problem in all the cpu traces. Sometimes, all you need is to dump the cpu buffer that triggered the opps, most of the time it is our main interest. This patch modifies ftrace_dump_on_oops to handle this choice. The ftrace_dump_on_oops kernel parameter, when it comes alone, has the same behaviour than before. But ftrace_dump_on_oops=orig_cpu will only dump the buffer of the cpu that oops'ed. Similarly, sysctl kernel.ftrace_dump_on_oops=1 and echo 1 > /proc/sys/kernel/ftrace_dump_on_oops keep their previous behaviour. But setting 2 jumps into cpu origin dump mode. v2: Fix double setup v3: Fix spelling issues reported by Randy Dunlap v4: Also update __ftrace_dump in the selftests Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
-
- 14 4月, 2010 1 次提交
-
-
由 Dmitry Torokhov 提交于
Instead of keeping SysRq support inside of legacy keyboard driver split it out into a separate input handler (filter). This stops most SysRq input events from leaking into evdev clients (some events, such as first SysRq scancode - not keycode - event, are still leaked into both legacy keyboard and evdev). [martinez.javier@gmail.com: fix compile error when CONFIG_MAGIC_SYSRQ is not defined] Signed-off-by: NDmitry Torokhov <dtor@mail.ru>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: NTejun Heo <tj@kernel.org> Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 16 12月, 2009 1 次提交
-
-
由 KAMEZAWA Hiroyuki 提交于
Fix node-oriented allocation handling in oom-kill.c I myself think of this as a bugfix not as an ehnancement. In these days, things are changed as - alloc_pages() eats nodemask as its arguments, __alloc_pages_nodemask(). - mempolicy don't maintain its own private zonelists. (And cpuset doesn't use nodemask for __alloc_pages_nodemask()) So, current oom-killer's check function is wrong. This patch does - check nodemask, if nodemask && nodemask doesn't cover all node_states[N_HIGH_MEMORY], this is CONSTRAINT_MEMORY_POLICY. - Scan all zonelist under nodemask, if it hits cpuset's wall this faiulre is from cpuset. And - modifies the caller of out_of_memory not to call oom if __GFP_THISNODE. This doesn't change "current" behavior. If callers use __GFP_THISNODE it should handle "page allocation failure" by itself. - handle __GFP_NOFAIL+__GFP_THISNODE path. This is something like a FIXME but this gfpmask is not used now. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hioryu@jp.fujitsu.com> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 9月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: NStephane Eranian <eranian@google.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPaul Mackerras <paulus@samba.org> Reviewed-by: NArjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 03 8月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
As Andrew noted, my previous patch ("debug lockups: Improve lockup detection") broke/removed SysRq-L support from architecture that do not provide a __trigger_all_cpu_backtrace implementation. Restore a fallback path and clean up the SysRq-L machinery a bit: - Rename the arch method to arch_trigger_all_cpu_backtrace() - Simplify the define - Document the method a bit - in the hope of more architectures adding support for it. [ The patch touches Sparc code for the rename. ] Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> LKML-Reference: <20090802140809.7ec4bb6b.akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 8月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
When debugging a recent lockup bug i found various deficiencies in how our current lockup detection helpers work: - SysRq-L is not very efficient as it uses a workqueue, hence it cannot punch through hard lockups and cannot see through most soft lockups either. - The SysRq-L code depends on the NMI watchdog - which is off by default. - We dont print backtraces from the RCU code's built-in 'RCU state machine is stuck' debug code. This debug code tends to be one of the first (and only) mechanisms that show that a lockup has occured. This patch changes the code so taht we: - Trigger the NMI backtrace code from SysRq-L instead of using a workqueue (which cannot punch through hard lockups) - Trigger print-all-CPU-backtraces from the RCU lockup detection code Also decouple the backtrace printing code from the NMI watchdog: - Dont use variable size cpumasks (it might not be initialized and they are a bit more fragile anyway) - Trigger an NMI immediately via an IPI, instead of waiting for the NMI tick to occur. This is a lot faster and can produce more relevant backtraces. It will also work if the NMI watchdog is disabled. - Dont print the 'dazed and confused' message when we print a backtrace from the NMI - Do a show_regs() plus a dump_stack() to get maximum info out of the dump. Worst-case we get two stacktraces - which is not a big deal. Sometimes, if register content is corrupted, the precise stack walker in show_regs() wont give us a full backtrace - in this case dump_stack() will do it. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 7月, 2009 1 次提交
-
-
由 Hidetoshi Seto 提交于
commit d6580a9f ("kexec: sysrq: simplify sysrq-c handler") changed the behavior of sysrq-c to unconditional dereference of NULL pointer. So in cases with CONFIG_KEXEC, where crash_kexec() was directly called from sysrq-c before, now it can be said that a step of "real oops" was inserted before starting kdump. However, in contrast to oops via SysRq-c from keyboard which results in panic due to in_interrupt(), oops via "echo c > /proc/sysrq-trigger" will not become panic unless panic_on_oops=1. It means that even if dump is properly configured to be taken on panic, the sysrq-c from proc interface might not start crashdump while the sysrq-c from keyboard can start crashdump. This confuses traditional users of kdump, i.e. people who expect sysrq-c to do common behavior in both of the keyboard and proc interface. This patch brings the keyboard and proc interface behavior of sysrq-c in line, by forcing panic_on_oops=1 before oops in sysrq-c handler. And some updates in documentation are included, to clarify that there is no longer dependency with CONFIG_KEXEC, and that now the system can just crash by sysrq-c if no dump mechanism is configured. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Cc: Brayan Arraes <brayan@yack.com.br> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-