- 11 7月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
Dan reported that the recent changes to the broadcast code introduced a potential NULL dereference. Add the proper check. Fixes: e0454311 "tick/broadcast: Sanity check the shutdown of the local clock_event" Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 09 7月, 2015 2 次提交
-
-
由 Peter Zijlstra 提交于
The load_module() error path frees a module but forgot to take it out of the mod_tree, leaving a dangling entry in the tree, causing havoc. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reported-by: NArthur Marsh <arthur.marsh@internode.on.net> Tested-by: NArthur Marsh <arthur.marsh@internode.on.net> Fixes: 93c2e105 ("module: Optimize __module_address() using a latched RB-tree") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Linus Torvalds 提交于
The "fix" in commit 0b08c5e5 ("audit: Fix check of return value of strnlen_user()") didn't fix anything, it broke things. As reported by Steven Rostedt: "Yes, strnlen_user() returns 0 on fault, but if you look at what len is set to, than you would notice that on fault len would be -1" because we just subtracted one from the return value. So testing against 0 doesn't test for a fault condition, it tests against a perfectly valid empty string. Also fix up the usual braindamage wrt using WARN_ON() inside a conditional - make it part of the conditional and remove the explicit unlikely() (which is already part of the WARN_ON*() logic, exactly so that you don't have to write unreadable code. Reported-and-tested-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Paul Moore <pmoore@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 7月, 2015 10 次提交
-
-
由 Thomas Gleixner 提交于
When a cpu goes up some architectures (e.g. x86) have to walk the irq space to set up the vector space for the cpu. While this needs extra protection at the architecture level we can avoid a few race conditions by preventing the concurrent allocation/free of irq descriptors and the associated data. When a cpu goes down it moves the interrupts which are targeted to this cpu away by reassigning the affinities. While this happens interrupts can be allocated and freed, which opens a can of race conditions in the code which reassignes the affinities because interrupt descriptors might be freed underneath. Example: CPU1 CPU2 cpu_up/down irq_desc = irq_to_desc(irq); remove_from_radix_tree(desc); raw_spin_lock(&desc->lock); free(desc); We could protect the irq descriptors with RCU, but that would require a full tree change of all accesses to interrupt descriptors. But fortunately these kind of race conditions are rather limited to a few things like cpu hotplug. The normal setup/teardown is very well serialized. So the simpler and obvious solution is: Prevent allocation and freeing of interrupt descriptors accross cpu hotplug. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: xiao jin <jin.xiao@intel.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Borislav Petkov <bp@suse.de> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Link: http://lkml.kernel.org/r/20150705171102.063519515@linutronix.de
-
由 Thomas Gleixner 提交于
Andriy reported that on a virtual machine the warning about negative expiry time in the clock events programming code triggered: hpet: hpet0 irq 40 for MSI hpet: hpet1 irq 41 for MSI Switching to clocksource hpet WARNING: at kernel/time/clockevents.c:239 [<ffffffff810ce6eb>] clockevents_program_event+0xdb/0xf0 [<ffffffff810cf211>] tick_handle_periodic_broadcast+0x41/0x50 [<ffffffff81016525>] timer_interrupt+0x15/0x20 When the second hpet is installed as a per cpu timer the broadcast event is not longer required and stopped, which sets the next_evt of the broadcast device to KTIME_MAX. If after that a spurious interrupt happens on the broadcast device, then the current code blindly handles it and tries to reprogram the broadcast device afterwards, which adds the period to next_evt. KTIME_MAX + period results in a negative expiry value causing the WARN_ON in the clockevents code to trigger. Add a proper check for the state of the broadcast device into the interrupt handler and return if the interrupt is spurious. [ Folded in pointer fix from Sudeep ] Reported-by: NAndriy Gapon <avg@FreeBSD.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20150705205221.802094647@linutronix.de
-
由 Thomas Gleixner 提交于
If the current cpu is the one which has the hrtimer based broadcast queued then we better return busy immediately instead of going through loops and hoops to figure that out. [ Split out from a larger combo patch ] Tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
由 Thomas Gleixner 提交于
Tell the idle code not to go deep if the broadcast IPI is about to arrive. [ Split out from a larger combo patch ] Tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
由 Thomas Gleixner 提交于
If the system is in periodic mode and the broadcast device is hrtimer based, return busy as we have no proper handling for this. [ Split out from a larger combo patch ] Tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
由 Thomas Gleixner 提交于
We need to check more than the periodic mode for proper operation in all runtime combinations. To avoid code duplication move the check into the enter state handling. No functional change. [ Split out from a larger combo patch ] Reported-and-tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
由 Thomas Gleixner 提交于
Add a check for a installed broadcast device to the oneshot control function and return busy if not. [ Split out from a larger combo patch ] Reported-and-tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
由 Thomas Gleixner 提交于
Currently the broadcast busy check, which prevents the idle code from going into deep idle, works only in one shot mode. If NOHZ and HIGHRES are off (config or command line) there is no sanity check at all, so under certain conditions cpus are allowed to go into deep idle, where the local timer stops, and are not woken up again because there is no broadcast timer installed or a hrtimer based broadcast device is not evaluated. Move tick_broadcast_oneshot_control() into the common code and provide proper subfunctions for the various config combinations. The common check in tick_broadcast_oneshot_control() is for the C3STOP misfeature flag of the local clock event device. If its not set, idle can proceed. If set, further checks are necessary. Provide checks for the trivial cases: - If broadcast is disabled in the config, then return busy - If oneshot mode (NOHZ/HIGHES) is disabled in the config, return busy if the broadcast device is hrtimer based. - If oneshot mode is enabled in the config call the original tick_broadcast_oneshot_control() function. That function needs extra checks which will be implemented in seperate patches. [ Split out from a larger combo patch ] Reported-and-tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
由 Thomas Gleixner 提交于
The broadcast code shuts down the local clock event unconditionally even if no broadcast device is installed or if the broadcast device is hrtimer based. Add proper sanity checks. [ Split out from a larger combo patch ] Reported-and-tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
由 Thomas Gleixner 提交于
The hrtimer based broadcast vehicle can cause a hrtimer recursion which went unnoticed until we changed the hrtimer expiry code to keep track of the currently running timer. local_timer_interrupt() local_handler() hrtimer_interrupt() expire_hrtimers() broadcast_hrtimer() send_ipis() local_handler() hrtimer_interrupt() .... Solution is simple: Prevent the local handler call from the broadcast code when the broadcast 'device' is hrtimer based. [ Split out from a larger combo patch ] Tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Suzuki Poulose <Suzuki.Poulose@arm.com> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Cc: Catalin Marinas <Catalin.Marinas@arm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
-
- 07 7月, 2015 1 次提交
-
-
由 Viresh Kumar 提交于
Its mandatory for the drivers to provide set_state_{oneshot|periodic}() (only if related modes are supported) and set_state_shutdown() callbacks today, if they are implementing the new set-state interface. But this leads to unnecessary noop callbacks for drivers which don't want to implement them. Over that, it will lead to a full function call for nothing really useful. Lets make all set-state callbacks optional. Suggested-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Link: http://lkml.kernel.org/r/1436256875-15562-1-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 06 7月, 2015 1 次提交
-
-
由 Peter Zijlstra 提交于
Its currently possible to drop the last refcount to the aux buffer from NMI context, which results in the expected fireworks. The refcounting needs a bigger overhaul, but to cure the immediate problem, delay the freeing by using an irq_work. Reviewed-and-tested-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Reported-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150618103249.GK19282@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 04 7月, 2015 5 次提交
-
-
由 Srikar Dronamraju 提交于
Commit 44dba3d5 ("sched: Refactor task_struct to use numa_faults instead of numa_* pointers") modified the way tsk->numa_faults stats are accounted. However that commit never touched show_numa_stats() that is displayed in /proc/pid/sched and thus the numbers displayed in /proc/pid/sched don't match the actual numbers. Fix it by making sure that /proc/pid/sched reflects the task fault numbers. Also add group fault stats too. Also couple of more modifications are added here: 1. Format changes: - Previously we would list two entries per node, one for private and one for shared. Also the home node info was listed in each entry. - Now preferred node, total_faults and current node are displayed separately. - Now there is one entry per node, that lists private,shared task and group faults. 2. Unit changes: - p->numa_pages_migrated was getting reset after every read of /proc/pid/sched. It's more useful to have absolute numbers since differential migrations between two accesses can be more easily calculated. Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: NRik van Riel <riel@redhat.com> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1435252903-1081-4-git-send-email-srikar@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Srikar Dronamraju 提交于
Having the numa group ID in /proc/sched_debug helps to see how the numa groups have spread across the system. Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: NRik van Riel <riel@redhat.com> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1435252903-1081-3-git-send-email-srikar@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Srikar Dronamraju 提交于
Currently print_cfs_rq() is declared in include/linux/sched.h. However it's not used outside kernel/sched. Hence move the declaration to kernel/sched/sched.h Also some functions are only available for CONFIG_SCHED_DEBUG=y. Hence move the declarations to within the #ifdef. Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: NRik van Riel <riel@redhat.com> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1435252903-1081-2-git-send-email-srikar@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Naveen N. Rao 提交于
Both CONFIG_SCHEDSTATS=y and CONFIG_TASK_DELAY_ACCT=y track task sched_info, which results in ugly #if clauses. Simplify the code by introducing a synthethic CONFIG_SCHED_INFO switch, selected by both. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: a.p.zijlstra@chello.nl Cc: ricklind@us.ibm.com Link: http://lkml.kernel.org/r/8d19eef800811a94b0f91bcbeb27430a884d7433.1435255405.git.naveen.n.rao@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Commit 1cde2930 ("sched/preempt: Add static_key() to preempt_notifiers") had two problems. First, the preempt-notifier API needs to sleep with the addition of the static_key, we do however need to hold off preemption while modifying the preempt notifier list, otherwise a preemption could observe an inconsistent list state. KVM correctly registers and unregisters preempt notifiers with preemption disabled, so the sleep caused dmesg splats. Second, KVM registers and unregisters preemption notifiers very often (in vcpu_load/vcpu_put). With a single uniprocessor guest the static key would move between 0 and 1 continuously, hitting the slow path on every userspace exit. To fix this, wrap the static_key inc/dec in a new API, and call it from KVM. Fixes: 1cde2930 ("sched/preempt: Add static_key() to preempt_notifiers") Reported-by: NPontus Fuchs <pontus.fuchs@gmail.com> Reported-by: NTakashi Iwai <tiwai@suse.de> Tested-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 03 7月, 2015 1 次提交
-
-
由 Linus Torvalds 提交于
It's a bug in our Makefile rules, make it show what the changing certificate list was, and make it a warning so that people actually see it. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 7月, 2015 8 次提交
-
-
由 Eric W. Biederman 提交于
This allows for better documentation in the code and it allows for a simpler and fully correct version of fs_fully_visible to be written. The mount points converted and their filesystems are: /sys/hypervisor/s390/ s390_hypfs /sys/kernel/config/ configfs /sys/kernel/debug/ debugfs /sys/firmware/efi/efivars/ efivarfs /sys/fs/fuse/connections/ fusectl /sys/fs/pstore/ pstore /sys/kernel/tracing/ tracefs /sys/fs/cgroup/ cgroup /sys/kernel/security/ securityfs /sys/fs/selinux/ selinuxfs /sys/fs/smackfs/ smackfs Cc: stable@vger.kernel.org Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
Add a magic sysctl table sysctl_mount_point that when used to create a directory forces that directory to be permanently empty. Update the code to use make_empty_dir_inode when accessing permanently empty directories. Update the code to not allow adding to permanently empty directories. Update /proc/sys/fs/binfmt_misc to be a permanently empty directory. Cc: stable@vger.kernel.org Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Thomas Gleixner 提交于
time.o gets rebuilt unconditionally due to a leftover Makefile rule which was placed there for development purposes. Remove it along with the commented out always rule in the toplevel Kbuild file. Fixes: 0a227985 'time: Move timeconst.h into include/generated' Reported-by; Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Nicholas Mc Guire <der.herr@hofr.at>
-
由 Pekka Enberg 提交于
Use kvfree() instead of open-coding it. Signed-off-by: NPekka Enberg <penberg@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Antonio Ospite 提交于
The comment about /dev/kmsg does not mention the additional values which may actually be exported, fix that. Also move up the part of the comment instructing the users to ignore these additional values, this way the reading is more fluent and logically compact. Signed-off-by: NAntonio Ospite <ao2@ao2.it> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lorenzo Stoakes 提交于
Fix kernel gcov support for GCC 5.1. Similar to commit a992bf83 ("gcov: add support for GCC 4.9"), this patch takes into account the existence of a new gcov counter (see gcc's gcc/gcov-counter.def.) Firstly, it increments GCOV_COUNTERS (to 10), which makes the data structure struct gcov_info compatible with GCC 5.1. Secondly, a corresponding counter function __gcov_merge_icall_topn (Top N value tracking for indirect calls) is included in base.c with the other gcov counters unused for kernel profiling. Signed-off-by: NLorenzo Stoakes <lstoakes@gmail.com> Cc: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Yuan Pengfei <coolypf@qq.com> Tested-by: NPeter Oberparleiter <oberpar@linux.vnet.ibm.com> Reviewed-by: NPeter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 HATAYAMA Daisuke 提交于
Commit f06e5153 ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers") introduced "crash_kexec_post_notifiers" kernel boot option, which toggles wheather panic() calls crash_kexec() before panic_notifiers and dump kmsg or after. The problem is that the commit overlooks panic_on_oops kernel boot option. If it is enabled, crash_kexec() is called directly without going through panic() in oops path. To fix this issue, this patch adds a check to "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). Also, put a comment in kexec_should_crash() to explain not obvious things on this patch. Signed-off-by: NHATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Acked-by: NBaoquan He <bhe@redhat.com> Tested-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Reviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 HATAYAMA Daisuke 提交于
For compatibility with the behaviour before the commit f06e5153 ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers"), the 2nd crash_kexec() should be called only if crash_kexec_post_notifiers is enabled. Note that crash_kexec() returns immediately if kdump crash kernel is not loaded, so in this case, this patch makes no functionality change, but the point is to make it explicit, from the caller panic() side, that the 2nd crash_kexec() does nothing. Signed-off-by: NHATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Suggested-by: NIngo Molnar <mingo@kernel.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 6月, 2015 2 次提交
-
-
由 Stephen Rothwell 提交于
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
As Dan Streetman points out, the entire point of locking for is to stop sysfs accesses, so they're elided entirely in the !SYSFS case. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 27 6月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
The recent timer wheel rework removed the get/put_cpu_var() pair in the hotplug migration code, which results in: BUG: using smp_processor_id() in preemptible [00000000] code: hib.sh/2845 ... [<ffffffff810d4fa3>] timer_cpu_notify+0x53/0x12 That hunk is a leftover from an earlier iteration and went unnoticed so far. Restore the previous code which was obviously correct. Fixes: 0eeda71b 'timer: Replace timer base by a cpu index' Reported-and_tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 26 6月, 2015 8 次提交
-
-
由 Rik van Riel 提交于
There is a helpful comment in do_exit() that states we sync the mm's RSS info before statistics gathering. The function that does the statistics gathering is called right above that comment. Change the code to obey the comment. Signed-off-by: NRik van Riel <riel@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rasmus Villemoes 提交于
Part of the disassembly of do_blk_trace_setup: 231b: e8 00 00 00 00 callq 2320 <do_blk_trace_setup+0x50> 231c: R_X86_64_PC32 strlen+0xfffffffffffffffc 2320: eb 0a jmp 232c <do_blk_trace_setup+0x5c> 2322: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 2328: 48 83 c3 01 add $0x1,%rbx 232c: 48 39 d8 cmp %rbx,%rax 232f: 76 47 jbe 2378 <do_blk_trace_setup+0xa8> 2331: 41 80 3c 1c 2f cmpb $0x2f,(%r12,%rbx,1) 2336: 75 f0 jne 2328 <do_blk_trace_setup+0x58> 2338: 41 c6 04 1c 5f movb $0x5f,(%r12,%rbx,1) 233d: 4c 89 e7 mov %r12,%rdi 2340: e8 00 00 00 00 callq 2345 <do_blk_trace_setup+0x75> 2341: R_X86_64_PC32 strlen+0xfffffffffffffffc 2345: eb e1 jmp 2328 <do_blk_trace_setup+0x58> Yep, that's right: gcc isn't smart enough to realize that replacing '/' by '_' cannot change the strlen(), so we call it again and again (at least when a '/' is found). Even if gcc were that smart, this construction would still loop over the string twice, once for the initial strlen() call and then the open-coded loop. Let's simply use strreplace() instead. Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Liked-by: NJens Axboe <axboe@fb.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rasmus Villemoes 提交于
There's no point in starting over every time we see a ','... Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasily Averin 提交于
Patch fixes drawbacks in heck_syslog_permissions() noticed by AKPM: "from_file handling makes me cry. That's not a boolean - it's an enumerated value with two values currently defined. But the code in check_syslog_permissions() treats it as a boolean and also hardwires the knowledge that SYSLOG_FROM_PROC == 1 (or == `true`). And the name is wrong: it should be called from_proc to match SYSLOG_FROM_PROC." Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasily Averin 提交于
The final version of commit 637241a9 ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg") lost few hooks, as result security_syslog() are processed incorrectly: - open of /dev/kmsg checks syslog access permissions by using check_syslog_permissions() where security_syslog() is not called if dmesg_restrict is set. - syslog syscall and /proc/kmsg calls do_syslog() where security_syslog can be executed twice (inside check_syslog_permissions() and then directly in do_syslog()) With this patch security_syslog() is called once only in all syslog-related operations regardless of dmesg_restrict value. Fixes: 637241a9 ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg") Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
printk log_buf keeps various metadata for each message including its sequence number and timestamp. The metadata is currently available only through /dev/kmsg and stripped out before passed onto console drivers. We want this metadata to be available to console drivers too so that console consumers can get full information including the metadata and dictionary, which among other things can be used to detect whether messages got lost in transit. This patch implements support for extended console drivers. Consoles can indicate that they want extended messages by setting the new CON_EXTENDED flag and they'll be fed messages formatted the same way as /dev/kmsg. "<level>,<sequnum>,<timestamp>,<contflag>;<message text>\n" If extended consoles exist, in-kernel fragment assembly is disabled. This ensures that all messages emitted to consoles have full metadata including sequence number. The contflag carries enough information to reassemble the fragments from the reader side trivially. Note that this only affects /dev/kmsg. Regular console and /proc/kmsg outputs are not affected by this change. * Extended message formatting for console drivers is enabled iff there are registered extended consoles. * Comment describing /dev/kmsg message format updated to add missing contflag field and help distinguishing variable from verbatim terms. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Miller <davem@davemloft.net> Cc: Kay Sievers <kay@vrfy.org> Reviewed-by: NPetr Mladek <pmladek@suse.cz> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
The extended message formatting used for /dev/kmsg will be used implement extended consoles. Factor out msg_print_ext_header() and msg_print_ext_body() from devkmsg_read(). This is pure restructuring. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Miller <davem@davemloft.net> Cc: Kay Sievers <kay@vrfy.org> Reviewed-by: NPetr Mladek <pmladek@suse.cz> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
This patchset updates netconsole so that it can emit messages with the same header as used in /dev/kmsg which gives neconsole receiver full log information which enables things like structured logging and detection of lost messages. This patch (of 7): devkmsg_read() uses 8k buffer and assumes that the formatted output message won't overrun which seems safe given LOG_LINE_MAX, the current use of dict and the escaping method being used; however, we're planning to use devkmsg formatting wider and accounting for the buffer size properly isn't that complicated. This patch defines CONSOLE_EXT_LOG_MAX as 8192 and updates devkmsg_read() so that it limits output accordingly. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Miller <davem@davemloft.net> Cc: Kay Sievers <kay@vrfy.org> Reviewed-by: NPetr Mladek <pmladek@suse.cz> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-