1. 15 7月, 2013 2 次提交
    • P
      kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      
      [1] https://lkml.org/lkml/2013/5/20/589Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      0db0628d
    • P
      rcu: delete __cpuinit usage from all rcu files · 49fb4c62
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the drivers/rcu uses of the __cpuinit macros
      from all C files.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@freedesktop.org>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      49fb4c62
  2. 14 7月, 2013 1 次提交
  3. 12 7月, 2013 6 次提交
    • P
      sched: Fix HRTICK · 971ee28c
      Peter Zijlstra 提交于
      David reported that the HRTICK sched feature was borken; which was enough
      motivation for me to finally fix it ;-)
      
      We should not allow hrtimer code to do softirq wakeups while holding scheduler
      locks. The hrtimer code only needs this when we accidentally try to program an
      expired time. We don't much care about those anyway since we have the regular
      tick to fall back to.
      Reported-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130628091853.GE29209@dyad.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      971ee28c
    • S
      tick: broadcast: Check broadcast mode on CPU hotplug · a272dcca
      Stephen Boyd 提交于
      On ARM systems the dummy clockevent is registered with the cpu
      hotplug notifier chain before any other per-cpu clockevent. This
      has the side-effect of causing the dummy clockevent to be
      registered first in every hotplug sequence. Because the dummy is
      first, we'll try to turn the broadcast source on but the code in
      tick_device_uses_broadcast() assumes the broadcast source is in
      periodic mode and calls tick_broadcast_start_periodic()
      unconditionally.
      
      On boot this isn't a problem because we typically haven't
      switched into oneshot mode yet (if at all). During hotplug, if
      the broadcast source isn't in periodic mode we'll replace the
      broadcast oneshot handler with the broadcast periodic handler and
      start emulating oneshot mode when we shouldn't. Due to the way
      the broadcast oneshot handler programs the next_event it's
      possible for it to contain KTIME_MAX and cause us to hang the
      system when the periodic handler tries to program the next tick.
      Fix this by using the appropriate function to start the broadcast
      source.
      Reported-by: NStephen Warren <swarren@nvidia.com>
      Tested-by: NStephen Warren <swarren@nvidia.com>
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Cc: Mark Rutland <Mark.Rutland@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: ARM kernel mailing list <linux-arm-kernel@lists.infradead.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Joseph Lo <josephl@nvidia.com>
      Link: http://lkml.kernel.org/r/20130711140059.GA27430@codeaurora.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      a272dcca
    • M
      mutex: Move ww_mutex definitions to ww_mutex.h · 1b375dc3
      Maarten Lankhorst 提交于
      Move the definitions for wound/wait mutexes out to a separate
      header, ww_mutex.h. This reduces clutter in mutex.h, and
      increases readability.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NRik van Riel <riel@redhat.com>
      Acked-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com>
      Cc: Dave Airlie <airlied@gmail.com>
      Link: http://lkml.kernel.org/r/51D675DC.3000907@canonical.com
      [ Tidied up the code a bit. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      1b375dc3
    • P
      perf: Fix perf_lock_task_context() vs RCU · 058ebd0e
      Peter Zijlstra 提交于
      Jiri managed to trigger this warning:
      
       [] ======================================================
       [] [ INFO: possible circular locking dependency detected ]
       [] 3.10.0+ #228 Tainted: G        W
       [] -------------------------------------------------------
       [] p/6613 is trying to acquire lock:
       []  (rcu_node_0){..-...}, at: [<ffffffff810ca797>] rcu_read_unlock_special+0xa7/0x250
       []
       [] but task is already holding lock:
       []  (&ctx->lock){-.-...}, at: [<ffffffff810f2879>] perf_lock_task_context+0xd9/0x2c0
       []
       [] which lock already depends on the new lock.
       []
       [] the existing dependency chain (in reverse order) is:
       []
       [] -> #4 (&ctx->lock){-.-...}:
       [] -> #3 (&rq->lock){-.-.-.}:
       [] -> #2 (&p->pi_lock){-.-.-.}:
       [] -> #1 (&rnp->nocb_gp_wq[1]){......}:
       [] -> #0 (rcu_node_0){..-...}:
      
      Paul was quick to explain that due to preemptible RCU we cannot call
      rcu_read_unlock() while holding scheduler (or nested) locks when part
      of the read side critical section was preemptible.
      
      Therefore solve it by making the entire RCU read side non-preemptible.
      
      Also pull out the retry from under the non-preempt to play nice with RT.
      Reported-by: NJiri Olsa <jolsa@redhat.com>
      Helped-out-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      058ebd0e
    • J
      perf: Remove WARN_ON_ONCE() check in __perf_event_enable() for valid scenario · 06f41796
      Jiri Olsa 提交于
      The '!ctx->is_active' check has a valid scenario, so
      there's no need for the warning.
      
      The reason is that there's a time window between the
      'ctx->is_active' check in the perf_event_enable() function
      and the __perf_event_enable() function having:
      
        - IRQs on
        - ctx->lock unlocked
      
      where the task could be killed and 'ctx' deactivated by
      perf_event_exit_task(), ending up with the warning below.
      
      So remove the WARN_ON_ONCE() check and add comments to
      explain it all.
      
      This addresses the following warning reported by Vince Weaver:
      
      [  324.983534] ------------[ cut here ]------------
      [  324.984420] WARNING: at kernel/events/core.c:1953 __perf_event_enable+0x187/0x190()
      [  324.984420] Modules linked in:
      [  324.984420] CPU: 19 PID: 2715 Comm: nmi_bug_snb Not tainted 3.10.0+ #246
      [  324.984420] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
      [  324.984420]  0000000000000009 ffff88043fce3ec8 ffffffff8160ea0b ffff88043fce3f00
      [  324.984420]  ffffffff81080ff0 ffff8802314fdc00 ffff880231a8f800 ffff88043fcf7860
      [  324.984420]  0000000000000286 ffff880231a8f800 ffff88043fce3f10 ffffffff8108103a
      [  324.984420] Call Trace:
      [  324.984420]  <IRQ>  [<ffffffff8160ea0b>] dump_stack+0x19/0x1b
      [  324.984420]  [<ffffffff81080ff0>] warn_slowpath_common+0x70/0xa0
      [  324.984420]  [<ffffffff8108103a>] warn_slowpath_null+0x1a/0x20
      [  324.984420]  [<ffffffff81134437>] __perf_event_enable+0x187/0x190
      [  324.984420]  [<ffffffff81130030>] remote_function+0x40/0x50
      [  324.984420]  [<ffffffff810e51de>] generic_smp_call_function_single_interrupt+0xbe/0x130
      [  324.984420]  [<ffffffff81066a47>] smp_call_function_single_interrupt+0x27/0x40
      [  324.984420]  [<ffffffff8161fd2f>] call_function_single_interrupt+0x6f/0x80
      [  324.984420]  <EOI>  [<ffffffff816161a1>] ? _raw_spin_unlock_irqrestore+0x41/0x70
      [  324.984420]  [<ffffffff8113799d>] perf_event_exit_task+0x14d/0x210
      [  324.984420]  [<ffffffff810acd04>] ? switch_task_namespaces+0x24/0x60
      [  324.984420]  [<ffffffff81086946>] do_exit+0x2b6/0xa40
      [  324.984420]  [<ffffffff8161615c>] ? _raw_spin_unlock_irq+0x2c/0x30
      [  324.984420]  [<ffffffff81087279>] do_group_exit+0x49/0xc0
      [  324.984420]  [<ffffffff81096854>] get_signal_to_deliver+0x254/0x620
      [  324.984420]  [<ffffffff81043057>] do_signal+0x57/0x5a0
      [  324.984420]  [<ffffffff8161a164>] ? __do_page_fault+0x2a4/0x4e0
      [  324.984420]  [<ffffffff8161665c>] ? retint_restore_args+0xe/0xe
      [  324.984420]  [<ffffffff816166cd>] ? retint_signal+0x11/0x84
      [  324.984420]  [<ffffffff81043605>] do_notify_resume+0x65/0x80
      [  324.984420]  [<ffffffff81616702>] retint_signal+0x46/0x84
      [  324.984420] ---[ end trace 442ec2f04db3771a ]---
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1373384651-6109-2-git-send-email-jolsa@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      06f41796
    • J
      perf: Clone child context from parent context pmu · 734df5ab
      Jiri Olsa 提交于
      Currently when the child context for inherited events is
      created, it's based on the pmu object of the first event
      of the parent context.
      
      This is wrong for the following scenario:
      
        - HW context having HW and SW event
        - HW event got removed (closed)
        - SW event stays in HW context as the only event
          and its pmu is used to clone the child context
      
      The issue starts when the cpu context object is touched
      based on the pmu context object (__get_cpu_context). In
      this case the HW context will work with SW cpu context
      ending up with following WARN below.
      
      Fixing this by using parent context pmu object to clone
      from child context.
      
      Addresses the following warning reported by Vince Weaver:
      
      [ 2716.472065] ------------[ cut here ]------------
      [ 2716.476035] WARNING: at kernel/events/core.c:2122 task_ctx_sched_out+0x3c/0x)
      [ 2716.476035] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs locn
      [ 2716.476035] CPU: 0 PID: 3164 Comm: perf_fuzzer Not tainted 3.10.0-rc4 #2
      [ 2716.476035] Hardware name: AOpen   DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BI2
      [ 2716.476035]  0000000000000000 ffffffff8102e215 0000000000000000 ffff88011fc18
      [ 2716.476035]  ffff8801175557f0 0000000000000000 ffff880119fda88c ffffffff810ad
      [ 2716.476035]  ffff880119fda880 ffffffff810af02a 0000000000000009 ffff880117550
      [ 2716.476035] Call Trace:
      [ 2716.476035]  [<ffffffff8102e215>] ? warn_slowpath_common+0x5b/0x70
      [ 2716.476035]  [<ffffffff810ab2bd>] ? task_ctx_sched_out+0x3c/0x5f
      [ 2716.476035]  [<ffffffff810af02a>] ? perf_event_exit_task+0xbf/0x194
      [ 2716.476035]  [<ffffffff81032a37>] ? do_exit+0x3e7/0x90c
      [ 2716.476035]  [<ffffffff810cd5ab>] ? __do_fault+0x359/0x394
      [ 2716.476035]  [<ffffffff81032fe6>] ? do_group_exit+0x66/0x98
      [ 2716.476035]  [<ffffffff8103dbcd>] ? get_signal_to_deliver+0x479/0x4ad
      [ 2716.476035]  [<ffffffff810ac05c>] ? __perf_event_task_sched_out+0x230/0x2d1
      [ 2716.476035]  [<ffffffff8100205d>] ? do_signal+0x3c/0x432
      [ 2716.476035]  [<ffffffff810abbf9>] ? ctx_sched_in+0x43/0x141
      [ 2716.476035]  [<ffffffff810ac2ca>] ? perf_event_context_sched_in+0x7a/0x90
      [ 2716.476035]  [<ffffffff810ac311>] ? __perf_event_task_sched_in+0x31/0x118
      [ 2716.476035]  [<ffffffff81050dd9>] ? mmdrop+0xd/0x1c
      [ 2716.476035]  [<ffffffff81051a39>] ? finish_task_switch+0x7d/0xa6
      [ 2716.476035]  [<ffffffff81002473>] ? do_notify_resume+0x20/0x5d
      [ 2716.476035]  [<ffffffff813654f5>] ? retint_signal+0x3d/0x78
      [ 2716.476035] ---[ end trace 827178d8a5966c3d ]---
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1373384651-6109-1-git-send-email-jolsa@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      734df5ab
  4. 11 7月, 2013 1 次提交
  5. 10 7月, 2013 11 次提交
  6. 06 7月, 2013 1 次提交
  7. 05 7月, 2013 5 次提交
    • T
      hrtimers: Move SMP function call to thread context · 5ec2481b
      Thomas Gleixner 提交于
      smp_call_function_* must not be called from softirq context.
      
      But clock_was_set() which calls on_each_cpu() is called from softirq
      context to implement a delayed clock_was_set() for the timer interrupt
      handler. Though that almost never gets invoked. A recent change in the
      resume code uses the softirq based delayed clock_was_set to support
      Xens resume mechanism.
      
      linux-next contains a new warning which warns if smp_call_function_*
      is called from softirq context which gets triggered by that Xen
      change.
      
      Fix this by moving the delayed clock_was_set() call to a work context.
      Reported-and-tested-by: NArtem Savkov <artem.savkov@gmail.com>
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>,
      Cc: Konrad Wilk <konrad.wilk@oracle.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: xen-devel@lists.xen.org
      Cc: stable@vger.kernel.org
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      5ec2481b
    • A
      genirq: generic chip: Use DIV_ROUND_UP to calculate numchips · 002fca5d
      Axel Lin 提交于
      The number of interrupts in a domain may be not divisible by the
      number of interrupts each chip handles. Integer division may truncate
      the result, thus use DIV_ROUND_UP to count numchips.
      
      Seems all users of irq_alloc_domain_generic_chips() in current code do
      not have this issue. I just found the issue while reading the code.
      Signed-off-by: NAxel Lin <axel.lin@ingics.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Link: http://lkml.kernel.org/r/1373015592.18252.2.camel@phoenixSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      002fca5d
    • T
      clocksource: Reselect clocksource when watchdog validated high-res capability · 332962f2
      Thomas Gleixner 提交于
      Up to commit 5d33b883 (clocksource: Always verify highres capability)
      we had no sanity check when selecting a clocksource, which prevented
      that a non highres capable clocksource is used when the system already
      switched to highres/nohz mode.
      
      The new sanity check works as Alex and Tim found out. It prevents the
      TSC from being used. This happens because on x86 the boot process
      looks like this:
      
       tsc_start_freqency_validation(TSC);
       clocksource_register(HPET);
       clocksource_done_booting();
      	clocksource_select()
      		Selects HPET which is valid for high-res
      
       switch_to_highres();
      
       clocksource_register(TSC);
       	TSC is not selected, because it is not yet
      	flagged as VALID_HIGH_RES
      
       clocksource_watchdog()
      	Validates TSC for highres, but that does not make TSC
      	the current clocksource.
      
      Before the sanity check was added, we installed TSC unvalidated which
      worked most of the time. If the TSC was really detected as unstable,
      then the unstable logic removed it and installed HPET again.
      
      The sanity check is correct and needed. So the watchdog needs to kick
      a reselection of the clocksource, when it qualifies TSC as a valid
      high res clocksource.
      
      To solve this, we mark the clocksource which got the flag
      CLOCK_SOURCE_VALID_FOR_HRES set by the watchdog with an new flag
      CLOCK_SOURCE_RESELECT and trigger the watchdog thread. The watchdog
      thread evaluates the flag and invokes clocksource_select() when set.
      
      To avoid that the clocksource_done_booting() code, which is about to
      install the first real clocksource anyway, needs to go through
      clocksource_select and tick_oneshot_notify() pointlessly, split out
      the clocksource_watchdog_kthread() list walk code and invoke the
      select/notify only when called from clocksource_watchdog_kthread().
      
      So clocksource_done_booting() can utilize the same splitout code
      without the select/notify invocation and the clocksource_mutex
      unlock/relock dance.
      Reported-and-tested-by: NAlex Shi <alex.shi@intel.com>
      Cc: Hans Peter Anvin <hpa@linux.intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <andi.kleen@intel.com>
      Tested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307042239150.11637@ionos.tec.linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      332962f2
    • S
      perf: Fix interrupt handler timing harness · e5302920
      Stephane Eranian 提交于
      This patch fixes a serious bug in:
      
        14c63f17 perf: Drop sample rate when sampling is too slow
      
      There was an misunderstanding on the API of the do_div()
      macro. It returns the remainder of the division and this
      was not what the function expected leading to disabling the
      interrupt latency watchdog.
      
      This patch also remove a duplicate assignment in
      perf_sample_event_took().
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: dave.hansen@linux.intel.com
      Cc: ak@linux.intel.com
      Cc: jolsa@redhat.com
      Link: http://lkml.kernel.org/r/20130704223010.GA30625@quadSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e5302920
    • K
      posix-cpu-timers: don't account cpu timer after stopped thread runtime accounting · fa18f7bd
      KOSAKI Motohiro 提交于
      When tsk->signal->cputimer->running is 1, signal->cputimer (i.e. per process
      timer account) and tsk->sum_sched_runtime (i.e. per thread timer account)
      increase at the same pace because update_curr() increases both accounting.
      
      However, there is one exception. When thread exiting, __exit_signal() turns
      over task's sum_shced_runtime to sig->sum_sched_runtime, but it doesn't stop
      signal->cputimer accounting.
      
      This inconsistency makes POSIX timer wake up too early. This patch fixes it.
      Original-patch-by: NOlivier Langlois <olivier@trillion01.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NOlivier Langlois <olivier@trillion01.com>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      fa18f7bd
  8. 04 7月, 2013 13 次提交