1. 29 May, 2013 5 commits
    •
      genirq: irqchip: Add a mask calculation function · d0051816
      Thomas Gleixner committed
      Some chips have weird bit mask access patterns instead of the linear
      one you would expect. Allow them to calculate the cached mask themselves.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jean-Francois Moine <moinejf@free.fr>
      Cc: devicetree-discuss@lists.ozlabs.org
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Gregory Clement <gregory.clement@free-electrons.com>
      Cc: Gerlando Falauto <gerlando.falauto@keymile.com>
      Cc: Rob Landley <rob@landley.net>
      Acked-by: Grant Likely <grant.likely@linaro.org>
      Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
      Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Link: http://lkml.kernel.org/r/20130506142539.302898834@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
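      The idea of an optional per-chip mask calculation hook can be sketched in
      userspace C. This is only a model under stated assumptions: struct
      chip_model, linear_mask(), reversed_mask() and setup_cached_mask() are
      hypothetical names, not the kernel's, and the "reversed" layout is an
      invented example of a non-linear bit assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; names do not match the kernel's structs. */
struct chip_model;
typedef uint32_t (*mask_calc_fn)(const struct chip_model *chip,
                                 unsigned int hwirq);

struct chip_model {
    unsigned int irq_base;   /* first hwirq handled by this chip */
    mask_calc_fn calc_mask;  /* optional override, NULL means linear */
};

/* Default linear layout: bit n belongs to the n-th interrupt of the chip. */
uint32_t linear_mask(const struct chip_model *chip, unsigned int hwirq)
{
    assert(hwirq >= chip->irq_base && hwirq - chip->irq_base < 32);
    return UINT32_C(1) << (hwirq - chip->irq_base);
}

/* Assumed irregular layout: bits are assigned from the top down. */
uint32_t reversed_mask(const struct chip_model *chip, unsigned int hwirq)
{
    assert(hwirq >= chip->irq_base && hwirq - chip->irq_base < 32);
    return UINT32_C(1) << (31 - (hwirq - chip->irq_base));
}

/* Computed once at setup time, then stored per interrupt as the cached mask. */
uint32_t setup_cached_mask(const struct chip_model *chip, unsigned int hwirq)
{
    return chip->calc_mask ? chip->calc_mask(chip, hwirq)
                           : linear_mask(chip, hwirq);
}
```

      A chip with the linear layout leaves calc_mask as NULL; a chip with an
      unusual layout supplies its own function and the setup path stays the same.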
    •
      genirq: Generic chip: Cache per irq bit mask · 966dc736
      Thomas Gleixner committed
      Cache the per irq bit mask instead of recalculating it over and over.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jean-Francois Moine <moinejf@free.fr>
      Cc: devicetree-discuss@lists.ozlabs.org
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Gregory Clement <gregory.clement@free-electrons.com>
      Cc: Gerlando Falauto <gerlando.falauto@keymile.com>
      Cc: Rob Landley <rob@landley.net>
      Acked-by: Grant Likely <grant.likely@linaro.org>
      Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
      Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Link: http://lkml.kernel.org/r/20130506142539.227119865@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    •
      genirq: Generic chip: Handle separate mask registers · af80b0fe
      Gerlando Falauto committed
      There are cases where all irq_chip_type instances have separate mask
      registers, making a shared mask register cache unsuitable.

      Introduce a new flag IRQ_GC_MASK_CACHE_PER_TYPE. If set, point the per
      chip type mask pointer to the per chip type private mask cache instead.

      [ tglx: Simplified code, renamed flag and massaged changelog ]
      Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Joey Oravec <joravec@drewtech.com>
      Cc: Lennert Buytenhek <kernel@wantstofly.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Holger Brunck <Holger.Brunck@keymile.com>
      Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
      Acked-by: Grant Likely <grant.likely@linaro.org>
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: devicetree-discuss@lists.ozlabs.org
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Ben Dooks <ben-linux@fluff.org>
      Cc: Gregory Clement <gregory.clement@free-electrons.com>
      Cc: Simon Guinot <simon@sequanux.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Jean-Francois Moine <moinejf@free.fr>
      Cc: Nicolas Pitre <nico@fluxnic.net>
      Cc: Rob Landley <rob@landley.net>
      Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
      Link: http://lkml.kernel.org/r/20130506142539.152569748@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
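      The effect of the flag can be illustrated with a small userspace model.
      This is a sketch, not the kernel code: MASK_CACHE_PER_TYPE stands in for
      IRQ_GC_MASK_CACHE_PER_TYPE, and struct type_model, struct generic_model
      and setup_mask_pointers() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_CACHE_PER_TYPE 0x1u  /* stand-in for IRQ_GC_MASK_CACHE_PER_TYPE */
#define MAX_TYPES 4

struct type_model {
    uint32_t *mask_cache;      /* what the mask/unmask paths dereference */
    uint32_t mask_cache_priv;  /* private cache, used only with the flag */
};

struct generic_model {
    uint32_t mask_cache;       /* shared cache, the pre-existing behaviour */
    unsigned int num_types;
    struct type_model types[MAX_TYPES];
};

/* Point every chip type either at the shared cache or at its own one. */
void setup_mask_pointers(struct generic_model *gc, unsigned int flags)
{
    assert(gc->num_types <= MAX_TYPES);
    for (unsigned int i = 0; i < gc->num_types; i++) {
        struct type_model *ct = &gc->types[i];

        ct->mask_cache = (flags & MASK_CACHE_PER_TYPE)
                             ? &ct->mask_cache_priv
                             : &gc->mask_cache;
    }
}
```

      Because callers only ever go through the pointer, the mask and unmask
      paths do not need to know which of the two caches they are updating.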
    •
      genirq: Generic chip: Add support for per chip type mask cache · 899f0e66
      Gerlando Falauto committed
      Today the same interrupt mask cache (stored within struct irq_chip_generic)
      is shared between all the irq_chip_type instances. As there are instances
      where each irq_chip_type uses a distinct mask register (as is the case
      for Orion SoCs), sharing a single mask cache may be incorrect.
      So add a distinct pointer for each irq_chip_type, which for now
      points to the original mask cache within irq_chip_generic.
      No functional changes here.
      
      [ tglx: Minor cosmetic tweaks ]
      Reported-by: Joey Oravec <joravec@drewtech.com>
      Signed-off-by: Simon Guinot <sguinot@lacie.com>
      Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
      Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Lennert Buytenhek <kernel@wantstofly.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Holger Brunck <Holger.Brunck@keymile.com>
      Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
      Acked-by: Grant Likely <grant.likely@linaro.org>
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: devicetree-discuss@lists.ozlabs.org
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Ben Dooks <ben-linux@fluff.org>
      Cc: Gregory Clement <gregory.clement@free-electrons.com>
      Cc: Simon Guinot <simon@sequanux.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Jean-Francois Moine <moinejf@free.fr>
      Cc: Nicolas Pitre <nico@fluxnic.net>
      Cc: Rob Landley <rob@landley.net>
      Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
      Link: http://lkml.kernel.org/r/20130506142539.082226607@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    •
      genirq: Generic chip: Remove the local cur_regs() function · cfeaa93f
      Gerlando Falauto committed
      Since we already have an irq_data_get_chip_type() function which returns
      a pointer to irq_chip_type, use that instead of cur_regs().
      Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Joey Oravec <joravec@drewtech.com>
      Cc: Lennert Buytenhek <kernel@wantstofly.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Holger Brunck <Holger.Brunck@keymile.com>
      Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
      Acked-by: Grant Likely <grant.likely@linaro.org>
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: devicetree-discuss@lists.ozlabs.org
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Ben Dooks <ben-linux@fluff.org>
      Cc: Gregory Clement <gregory.clement@free-electrons.com>
      Cc: Simon Guinot <simon@sequanux.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Jean-Francois Moine <moinejf@free.fr>
      Cc: Nicolas Pitre <nico@fluxnic.net>
      Cc: Rob Landley <rob@landley.net>
      Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
      Link: http://lkml.kernel.org/r/20130506142539.010164766@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  2. 25 May, 2013 1 commit
  3. 17 May, 2013 3 commits
  4. 16 May, 2013 7 commits
    •
      tracing: Return -EBUSY when event_enable_func() fails to get module · 6ed01066
      Masami Hiramatsu committed
      Since try_module_get() returns false (= 0) when it fails to
      pin down a module, event_enable_func() returns 0, which means
      "success". This can cause a kernel panic when the entry
      is removed, because the event is already released.

      This fixes the bug by returning -EBUSY, because the reason
      why it fails is that the module is being removed at that time.
      
      Link: http://lkml.kernel.org/r/20130516114848.13508.97899.stgit@mhiramat-M0-7522
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tom Zanussi <tom.zanussi@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
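      The error-code pitfall described in this commit can be modelled in
      userspace. The sketch below is an illustration only: struct module_model,
      try_pin_module() and enable_event_for_module() are hypothetical stand-ins
      for the module refcount and for event_enable_func(), not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct module_model {
    bool going_away;  /* the module is in the middle of being removed */
    int refcount;
};

/* Same contract as described above: false (0) means the pin failed. */
bool try_pin_module(struct module_model *mod)
{
    if (mod->going_away)
        return false;
    mod->refcount++;
    return true;
}

/*
 * Returning the boolean result directly would turn "failed" (0) into
 * "success" (0) for callers that expect 0 or a negative errno.  Map the
 * failure to an explicit error code instead.
 */
int enable_event_for_module(struct module_model *mod)
{
    assert(mod != NULL);
    if (!try_pin_module(mod))
        return -EBUSY;
    return 0;
}
```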
    •
      workqueue: don't perform NUMA-aware allocations on offline nodes in wq_numa_init() · 1be0c25d
      Tejun Heo committed
      wq_numa_init() builds per-node cpumasks which are later used to make
      unbound workqueues NUMA-aware.  The cpumasks are allocated using
      alloc_cpumask_var_node() for all possible nodes.  Unfortunately, on
      machines with offline nodes, this leads to NUMA-aware allocations on
      existing but offline nodes, which in turn triggers a BUG in the memory
      allocation code.
      
      Fix it by using NUMA_NO_NODE for cpumask allocations for offline
      nodes.
      
        kernel BUG at include/linux/gfp.h:323!
        invalid opcode: 0000 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0+ #1
        Hardware name: ProLiant BL465c G7, BIOS A19 12/10/2011
        task: ffff880234608000 ti: ffff880234602000 task.ti: ffff880234602000
        RIP: 0010:[<ffffffff8117495d>]  [<ffffffff8117495d>] new_slab+0x2ad/0x340
        RSP: 0000:ffff880234603bf8  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff880237404b40 RCX: 00000000000000d0
        RDX: 0000000000000001 RSI: 0000000000000003 RDI: 00000000002052d0
        RBP: ffff880234603c28 R08: 0000000000000000 R09: 0000000000000001
        R10: 0000000000000001 R11: ffffffff812e3aa8 R12: 0000000000000001
        R13: ffff8802378161c0 R14: 0000000000030027 R15: 00000000000040d0
        FS:  0000000000000000(0000) GS:ffff880237800000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: ffff88043fdff000 CR3: 00000000018d5000 CR4: 00000000000007f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Stack:
         ffff880234603c28 0000000000000001 00000000000000d0 ffff8802378161c0
         ffff880237404b40 ffff880237404b40 ffff880234603d28 ffffffff815edba1
         ffff880237816140 0000000000000000 ffff88023740e1c0
        Call Trace:
         [<ffffffff815edba1>] __slab_alloc+0x330/0x4f2
         [<ffffffff81174b25>] kmem_cache_alloc_node_trace+0xa5/0x200
         [<ffffffff812e3aa8>] alloc_cpumask_var_node+0x28/0x90
         [<ffffffff81a0bdb3>] wq_numa_init+0x10d/0x1be
         [<ffffffff81a0bec8>] init_workqueues+0x64/0x341
         [<ffffffff810002ea>] do_one_initcall+0xea/0x1a0
         [<ffffffff819f1f31>] kernel_init_freeable+0xb7/0x1ec
         [<ffffffff815d50de>] kernel_init+0xe/0xf0
         [<ffffffff815ff89c>] ret_from_fork+0x7c/0xb0
        Code: 45  84 ac 00 00 00 f0 41 80 4d 00 40 e9 f6 fe ff ff 66 0f 1f 84 00 00 00 00 00 e8 eb 4b ff ff 49 89 c5 e9 05 fe ff ff <0f> 0b 4c 8b 73 38 44 89 ff 81 cf 00 00 20 00 4c 89 f6 48 c1 ee
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-and-Tested-by: Lingzhu Xiang <lxiang@redhat.com>
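      The fix amounts to choosing the node hint before allocating. A minimal
      userspace sketch of that choice follows; NO_NODE stands in for
      NUMA_NO_NODE, and alloc_node_hint() plus the node_online array are
      hypothetical helpers for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE (-1)   /* stand-in for NUMA_NO_NODE */
#define MAX_NODES 8

/*
 * Pick the node hint to hand to a node-aware allocator: use the node
 * itself only when it is online, otherwise express "no preference".
 */
int alloc_node_hint(const bool node_online[MAX_NODES], int node)
{
    assert(node >= 0 && node < MAX_NODES);
    return node_online[node] ? node : NO_NODE;
}
```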
    •
      tracing/kprobes: Make print_*probe_event static · b62fdd97
      Masami Hiramatsu committed
      Following a sparse warning, make print_*probe_event static because
      those functions are not called directly from outside.
      
      Link: http://lkml.kernel.org/r/20130513115839.6545.83067.stgit@mhiramat-M0-7522
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tom Zanussi <tom.zanussi@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing/kprobes: Fix a sparse warning for incorrect type in assignment · 3d1fc7b0
      Masami Hiramatsu committed
      Fix a sparse warning caused by an RCU-managed pointer being
      defined without the __rcu address space annotation.
      
      Link: http://lkml.kernel.org/r/20130513115837.6545.23322.stgit@mhiramat-M0-7522
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tom Zanussi <tom.zanussi@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing/kprobes: Use rcu_dereference_raw for tp->files · c02c7e65
      Masami Hiramatsu committed
      Use rcu_dereference_raw() for accessing tp->files. Because the
      write side uses rcu_assign_pointer(), which implies a memory barrier,
      the read side also has to use rcu_dereference_raw(), which provides
      the matching read memory barrier.
      
      Link: http://lkml.kernel.org/r/20130513115834.6545.17022.stgit@mhiramat-M0-7522
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tom Zanussi <tom.zanussi@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
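      The pairing of a publishing store with an ordered load can be
      approximated in portable C11. This is an analogy, not the kernel API:
      a release store stands in for rcu_assign_pointer() and an acquire load
      (stronger than strictly needed) stands in for rcu_dereference_raw().
      struct files_model, publish_files() and read_files() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct files_model {
    int nr_files;
};

/* Shared pointer; stands in for a pointer such as tp->files. */
_Atomic(struct files_model *) published_files;

/* Writer: initialize the object first, then publish with release ordering. */
void publish_files(struct files_model *f, int nr)
{
    assert(f != NULL);
    f->nr_files = nr;
    atomic_store_explicit(&published_files, f, memory_order_release);
}

/* Reader: ordered load, so the initialized fields are visible afterwards. */
const struct files_model *read_files(void)
{
    return atomic_load_explicit(&published_files, memory_order_acquire);
}
```

      A plain load on the read side would leave the writer's barrier unpaired,
      which is the situation the commit message describes.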
    •
      tracing: Fix leaks of filter preds · 60705c89
      Steven Rostedt (Red Hat) committed
      Special preds are created when folding a series of preds that
      can be done in serial. These are allocated in an ops field of
      the pred structure. But they were never freed, causing memory
      leaks.
      
      This was discovered using the kmemleak checker:
      
      unreferenced object 0xffff8800797fd5e0 (size 32):
        comm "swapper/0", pid 1, jiffies 4294690605 (age 104.608s)
        hex dump (first 32 bytes):
          00 00 01 00 03 00 05 00 07 00 09 00 0b 00 0d 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff814b52af>] kmemleak_alloc+0x73/0x98
          [<ffffffff8111ff84>] kmemleak_alloc_recursive.constprop.42+0x16/0x18
          [<ffffffff81120e68>] __kmalloc+0xd7/0x125
          [<ffffffff810d47eb>] kcalloc.constprop.24+0x2d/0x2f
          [<ffffffff810d4896>] fold_pred_tree_cb+0xa9/0xf4
          [<ffffffff810d3781>] walk_pred_tree+0x47/0xcc
          [<ffffffff810d5030>] replace_preds.isra.20+0x6f8/0x72f
          [<ffffffff810d50b5>] create_filter+0x4e/0x8b
          [<ffffffff81b1c30d>] ftrace_test_event_filter+0x5a/0x155
          [<ffffffff8100028d>] do_one_initcall+0xa0/0x137
          [<ffffffff81afbedf>] kernel_init_freeable+0x14d/0x1dc
          [<ffffffff814b24b7>] kernel_init+0xe/0xdb
          [<ffffffff814d539c>] ret_from_fork+0x7c/0xb0
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: stable@vger.kernel.org # 2.6.39+
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
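      The shape of this kind of leak, an auxiliary array hanging off a
      structure that the free path forgets, can be sketched as follows.
      struct pred_model, fold_pred() and free_pred() are hypothetical
      userspace stand-ins; only the "ops field allocated while folding"
      detail comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

struct pred_model {
    unsigned short *ops;  /* allocated only for folded preds */
    int nr_ops;
};

/* Folding allocates the ops array; returns 0 on success, -1 on failure. */
int fold_pred(struct pred_model *p, int children)
{
    assert(children > 0);
    p->ops = calloc((size_t)children, sizeof(*p->ops));
    if (!p->ops)
        return -1;
    p->nr_ops = children;
    return 0;
}

/* The release path must free the ops array too, or every fold leaks. */
void free_pred(struct pred_model *p)
{
    free(p->ops);   /* free(NULL) is a no-op, so unfolded preds are fine */
    p->ops = NULL;
    p->nr_ops = 0;
}
```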
    •
      rcu: Don't allocate bootmem from rcu_init() · 615ee544
      Sasha Levin committed
      When rcu_init() is called we already have slab working, so allocating
      bootmem at that point results in warnings and an allocation from
      slab.  This commit therefore changes alloc_bootmem_cpumask_var() to
      alloc_cpumask_var() in rcu_bootup_announce_oddness(), which is called
      from rcu_init().
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Tested-by: Robin Holt <holt@sgi.com>
      
      [paulmck: convert to zalloc_cpumask_var(), as suggested by Yinghai Lu.]
  5. 15 May, 2013 4 commits
  6. 14 May, 2013 3 commits
    •
      timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE · 42a5cf46
      Tirupathi Reddy committed
      An inactive timer's base can refer to an offline CPU's base.

      In the current code, cpu_base's lock is blindly reinitialized each
      time a CPU is brought up. If a CPU is brought online while another
      thread is trying to modify an inactive timer on that CPU and is
      holding its timer base lock, then the lock will be reinitialized
      under its feet. This leads to the following SPIN_BUG().
      
      <0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466
      <0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466, .owner_cpu: 1
      <4> [<c0013dc4>] (unwind_backtrace+0x0/0x11c) from [<c026e794>] (do_raw_spin_unlock+0x40/0xcc)
      <4> [<c026e794>] (do_raw_spin_unlock+0x40/0xcc) from [<c076c160>] (_raw_spin_unlock+0x8/0x30)
      <4> [<c076c160>] (_raw_spin_unlock+0x8/0x30) from [<c009b858>] (mod_timer+0x294/0x310)
      <4> [<c009b858>] (mod_timer+0x294/0x310) from [<c00a5e04>] (queue_delayed_work_on+0x104/0x120)
      <4> [<c00a5e04>] (queue_delayed_work_on+0x104/0x120) from [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c)
      <4> [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c) from [<c04d8780>] (sdhci_disable+0x40/0x48)
      <4> [<c04d8780>] (sdhci_disable+0x40/0x48) from [<c04bf300>] (mmc_release_host+0x4c/0xb0)
      <4> [<c04bf300>] (mmc_release_host+0x4c/0xb0) from [<c04c7aac>] (mmc_sd_detect+0x90/0xfc)
      <4> [<c04c7aac>] (mmc_sd_detect+0x90/0xfc) from [<c04c2504>] (mmc_rescan+0x7c/0x2c4)
      <4> [<c04c2504>] (mmc_rescan+0x7c/0x2c4) from [<c00a6a7c>] (process_one_work+0x27c/0x484)
      <4> [<c00a6a7c>] (process_one_work+0x27c/0x484) from [<c00a6e94>] (worker_thread+0x210/0x3b0)
      <4> [<c00a6e94>] (worker_thread+0x210/0x3b0) from [<c00aad9c>] (kthread+0x80/0x8c)
      <4> [<c00aad9c>] (kthread+0x80/0x8c) from [<c000ea80>] (kernel_thread_exit+0x0/0x8)
      
      As an example, this particular crash occurred when CPU #3 was executing
      mod_timer() on an inactive timer whose base referred to the offlined CPU
      #2.  The code locked the timer_base corresponding to CPU #2. Before it
      could proceed, CPU #2 came online and reinitialized the spinlock
      corresponding to its base. Thus CPU #3 now held a lock which had been
      reinitialized. When CPU #3 finally ended up unlocking the old cpu_base
      corresponding to CPU #2, we hit the above SPIN_BUG().
      
      CPU #0		CPU #3				       CPU #2
      ------		-------				       -------
      .....		 ......				      <Offline>
      		mod_timer()
      		 lock_timer_base
      		   spin_lock_irqsave(&base->lock)
      
      cpu_up(2)	 .....				        ......
      							init_timers_cpu()
      ....		 .....				    	spin_lock_init(&base->lock)
      .....		   spin_unlock_irqrestore(&base->lock)  ......
      		   <spin_bug>
      
      Allocation of per_cpu timer vector bases is done only once under the
      "tvec_base_done[]" check. In the current code, spinlock initialization
      of base->lock isn't under this check. Each time a CPU is brought up,
      the base lock is reinitialized. Move base spinlock initialization under
      the check.
      Signed-off-by: Tirupathi Reddy <tirupath@codeaurora.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1368520142-4136-1-git-send-email-tirupath@codeaurora.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
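      The "initialize once, under the done check" pattern can be modelled in
      userspace. This sketch makes several assumptions: struct base_model,
      bases[], base_done[] and init_timers_cpu_model() are hypothetical, and
      the lock is reduced to a flag plus a counter of initializations.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4

struct base_model {
    int lock_inits;  /* how often the lock was (re)initialized */
    bool locked;     /* simplified lock state */
};

struct base_model bases[NR_CPUS_MODEL];
bool base_done[NR_CPUS_MODEL];   /* stand-in for tvec_base_done[] */

/* Called on every CPU bring-up; one-time setup stays under the check. */
void init_timers_cpu_model(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    if (!base_done[cpu]) {
        bases[cpu].locked = false;   /* models spin_lock_init() */
        bases[cpu].lock_inits++;
        base_done[cpu] = true;
    }
    /* State that must be reset on every bring-up would be handled here. */
}
```

      With the initialization inside the check, a second bring-up of the same
      CPU no longer clobbers a lock that another CPU may currently hold.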
    •
      rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit · b47430d3
      Srivatsa S. Bhat committed
      Bjørn Mork reported the following warning when running powertop.
      
      [   49.289034] ------------[ cut here ]------------
      [   49.289055] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common.isra.48+0x3d/0x125()
      [   49.289244] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-bisect-rcu-warn+ #107
      [   49.289251]  ffffffff8157d8c8 ffffffff81801e28 ffffffff8137e4e3 ffffffff81801e68
      [   49.289260]  ffffffff8103094f ffffffff81801e68 0000000000000000 ffff88023afcd9b0
      [   49.289268]  0000000000000000 0140000000000000 ffff88023bee7700 ffffffff81801e78
      [   49.289276] Call Trace:
      [   49.289285]  [<ffffffff8137e4e3>] dump_stack+0x19/0x1b
      [   49.289293]  [<ffffffff8103094f>] warn_slowpath_common+0x62/0x7b
      [   49.289300]  [<ffffffff8103097d>] warn_slowpath_null+0x15/0x17
      [   49.289306]  [<ffffffff810a9006>] rcu_eqs_exit_common.isra.48+0x3d/0x125
      [   49.289314]  [<ffffffff81079b49>] ? trace_hardirqs_off_caller+0x37/0xa6
      [   49.289320]  [<ffffffff810a9692>] rcu_idle_exit+0x85/0xa8
      [   49.289327]  [<ffffffff8107076e>] trace_cpu_idle_rcuidle+0xae/0xff
      [   49.289334]  [<ffffffff810708b1>] cpu_startup_entry+0x72/0x115
      [   49.289341]  [<ffffffff813689e5>] rest_init+0x149/0x150
      [   49.289347]  [<ffffffff8136889c>] ? csum_partial_copy_generic+0x16c/0x16c
      [   49.289355]  [<ffffffff81a82d34>] start_kernel+0x3f0/0x3fd
      [   49.289362]  [<ffffffff81a8274c>] ? repair_env_string+0x5a/0x5a
      [   49.289368]  [<ffffffff81a82481>] x86_64_start_reservations+0x2a/0x2c
      [   49.289375]  [<ffffffff81a82550>] x86_64_start_kernel+0xcd/0xd1
      [   49.289379] ---[ end trace 07a1cc95e29e9036 ]---
      
      The warning is that 'rdtp->dynticks' has an unexpected value, which roughly
      means that the calls to rcu_idle_enter() and rcu_idle_exit() were not
      made in the correct order, or were otherwise mismatched.
      
      And Bjørn's painstaking debugging indicated that this happens when the idle
      loop enters the poll mode. Looking at the poll function cpu_idle_poll(), and
      the implementation of trace_cpu_idle_rcuidle(), the problem becomes very clear:
      cpu_idle_poll() lacks calls to rcu_idle_enter/exit(), and trace_cpu_idle_rcuidle()
      calls them in the reverse order - first rcu_idle_exit(), and then rcu_idle_enter().
      Hence the even/odd alternating sequence of rdtp->dynticks is broken.
      
      And powertop readily triggers this because powertop uses the idle-tracing
      infrastructure extensively.
      
      So, to fix this, wrap the code in cpu_idle_poll() within rcu_idle_enter/exit(),
      so that it blends properly with the calls inside trace_cpu_idle_rcuidle() and
      thus gets the function ordering right.
      Reported-and-tested-by: Bjørn Mork <bjorn@mork.no>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/519169BF.4080208@linux.vnet.ibm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
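      The even/odd bookkeeping behind the warning can be illustrated with a
      toy counter. This is a simplified model under stated assumptions: an odd
      value means "not idle", an even value means "idle", and all names
      (struct dyn_model, model_idle_enter() and so on) are hypothetical rather
      than the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

struct dyn_model {
    unsigned long dynticks;  /* odd: CPU is watched by RCU, even: idle */
    int warnings;            /* counts parity violations */
};

/* Entering idle must leave the counter even. */
void model_idle_enter(struct dyn_model *d)
{
    assert(d != NULL);
    d->dynticks++;
    if (d->dynticks & 1UL)
        d->warnings++;
}

/* Leaving idle must leave the counter odd. */
void model_idle_exit(struct dyn_model *d)
{
    assert(d != NULL);
    d->dynticks++;
    if (!(d->dynticks & 1UL))
        d->warnings++;
}

/* The tracing wrapper leaves idle briefly, traces, then re-enters idle. */
void model_trace_rcuidle(struct dyn_model *d)
{
    model_idle_exit(d);
    model_idle_enter(d);
}

/* Poll loop without the surrounding enter/exit pair. */
void model_poll_unfixed(struct dyn_model *d)
{
    model_trace_rcuidle(d);
}

/* Poll loop wrapped in enter/exit, as the fix describes. */
void model_poll_fixed(struct dyn_model *d)
{
    model_idle_enter(d);
    model_trace_rcuidle(d);
    model_idle_exit(d);
}
```

      Starting from an odd counter, the unwrapped variant trips the parity
      check while the wrapped variant keeps every transition consistent.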
    •
      tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline · f7ea0fd6
      Thomas Gleixner committed
      commit 5b39939a (nohz: Move ts->idle_calls incrementation into strict
      idle logic) moved code out of tick_nohz_stop_sched_tick() and failed
      to bail out when the cpu is offline. That causes subsequent
      failures as an offline CPU is supposed to die and not to fiddle with
      nohz magic.

      Return false in can_stop_idle_tick() if the cpu is offline.
      Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
      Reported-and-tested-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305132138160.2863@ionos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
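      The early bail-out can be sketched as a guard at the top of the
      predicate. This is a userspace illustration only: struct cpu_model and
      can_stop_idle_tick_model() are hypothetical, and the need_resched
      condition is an assumed example of a later check, included just to show
      that the offline test comes first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cpu_model {
    bool online;
    bool need_resched;   /* illustrative second condition, an assumption */
};

/* An offline CPU must never be told it may stop its tick. */
bool can_stop_idle_tick_model(const struct cpu_model *c)
{
    assert(c != NULL);
    if (!c->online)
        return false;    /* bail out before any nohz handling */
    if (c->need_resched)
        return false;
    return true;
}
```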
  7. 12 May, 2013 1 commit
  8. 11 May, 2013 1 commit
    •
      workqueue: workqueue_congested() shouldn't translate WORK_CPU_UNBOUND into node number · d3251859
      Tejun Heo committed
      df2d5ae4 ("workqueue: map an unbound workqueues to multiple per-node
      pool_workqueues") made unbound workqueues map to multiple per-node
      pool_workqueues and accordingly updated workqueue_congested() so that,
      for unbound workqueues, it maps the specified @cpu to the NUMA node
      number to obtain the matching pool_workqueue to query the congested
      state.
      
      Before this change, workqueue_congested() ignored @cpu for unbound
      workqueues as there was only one pool_workqueue and some users
      (fscache) called it with WORK_CPU_UNBOUND.  After the commit, this
      causes the following oops as WORK_CPU_UNBOUND gets translated to
      garbage by cpu_to_node().
      
        BUG: unable to handle kernel paging request at ffff8803598d98b8
        IP: [<ffffffff81043b7e>] unbound_pwq_by_node+0xa1/0xfa
        PGD 2421067 PUD 0
        Oops: 0000 [#1] SMP
        CPU: 1 PID: 2689 Comm: cat Tainted: GF            3.9.0-fsdevel+ #4
        task: ffff88003d801040 ti: ffff880025806000 task.ti: ffff880025806000
        RIP: 0010:[<ffffffff81043b7e>]  [<ffffffff81043b7e>] unbound_pwq_by_node+0xa1/0xfa
        RSP: 0018:ffff880025807ad8  EFLAGS: 00010202
        RAX: 0000000000000001 RBX: ffff8800388a2400 RCX: 0000000000000003
        RDX: ffff880025807fd8 RSI: ffffffff81a31420 RDI: ffff88003d8016e0
        RBP: ffff880025807ae8 R08: ffff88003d801730 R09: ffffffffa00b4898
        R10: ffffffff81044217 R11: ffff88003d801040 R12: 0000000064206e97
        R13: ffff880036059d98 R14: ffff880038cc8080 R15: ffff880038cc82d0
        FS:  00007f21afd9c740(0000) GS:ffff88003d100000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: ffff8803598d98b8 CR3: 000000003df49000 CR4: 00000000000007e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Stack:
         ffff8800388a2400 0000000000000002 ffff880025807b18 ffffffff810442ce
         ffffffff81044217 ffff880000000002 ffff8800371b4080 ffff88003d112ec0
         ffff880025807b38 ffffffffa00810b0 ffff880036059d88 ffff880036059be8
        Call Trace:
         [<ffffffff810442ce>] workqueue_congested+0xb7/0x12c
         [<ffffffffa00810b0>] fscache_enqueue_object+0xb2/0xe8 [fscache]
         [<ffffffffa007facd>] __fscache_acquire_cookie+0x3b9/0x56c [fscache]
         [<ffffffffa00ad8fe>] nfs_fscache_set_inode_cookie+0xee/0x132 [nfs]
         [<ffffffffa009e112>] do_open+0x9/0xd [nfs]
         [<ffffffff810e804a>] do_dentry_open+0x175/0x24b
         [<ffffffff810e8298>] finish_open+0x41/0x51
      
      Fix it by using smp_processor_id() if @cpu is WORK_CPU_UNBOUND.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: David Howells <dhowells@redhat.com>
      Tested-and-Acked-by: David Howells <dhowells@redhat.com>
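      The sentinel handling can be sketched in userspace as follows. The
      names are assumptions for illustration: CPU_UNBOUND_MODEL stands in for
      WORK_CPU_UNBOUND, the cpu_node[] table stands in for cpu_to_node(), and
      the current_cpu parameter stands in for smp_processor_id().

```c
#include <assert.h>

#define NR_CPUS_MODEL 4
#define CPU_UNBOUND_MODEL NR_CPUS_MODEL  /* sentinel, not a valid CPU index */

/* Hypothetical topology: CPUs 0-1 on node 0, CPUs 2-3 on node 1. */
static const int cpu_node[NR_CPUS_MODEL] = { 0, 0, 1, 1 };

/*
 * Resolve the sentinel to a real CPU before the table lookup; indexing
 * the table with the sentinel itself would read past its end.
 */
int node_for_query(int cpu, int current_cpu)
{
    if (cpu == CPU_UNBOUND_MODEL)
        cpu = current_cpu;
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    return cpu_node[cpu];
}
```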
  9. 10 May, 2013 15 commits