1. 16 Feb, 2012 2 commits
  2. 15 Feb, 2012 2 commits
    • irq_domain/powerpc: eliminate irq_map; use irq_alloc_desc() instead · 4bbdd45a
      Grant Likely committed
      This patch drops the powerpc-specific irq_map table and replaces it with
      directly using the irq_alloc_desc()/irq_free_desc() interfaces for allocating
      and freeing irq_desc structures.
      
      This patch is a preparation step for generalizing the powerpc-specific virq
      infrastructure to become irq_domains.
      
      As part of this change, the irq_big_lock is changed to a mutex from a raw
      spinlock.  There is no longer any need to use a spin lock since the irq_desc
      allocation code is now responsible for the critical section of finding
      an unused range of irq numbers.
      
      The radix lookup table is also changed to store the irq_data pointer instead
      of the irq_map entry since the irq_map is removed.  This should end up being
      functionally equivalent since only allocated irq_descs are ever added to the
      radix tree.
      
      v5: - Really don't ever allocate virq 0.  The previous version could still
            do it if hint == 0
          - Respect irq_virq_count setting for NOMAP.  Some NOMAP domains cannot
            use virq values above irq_virq_count.
          - Use numa_node_id() when allocating irq_descs.  Ideally the API should
            obtain that value from the caller, but that touches a lot of call sites
            so will be deferred to a follow-on patch.
          - Fix irq_find_mapping() to include irq numbers lower than
            NUM_ISA_INTERRUPTS.  With the switch to irq_alloc_desc*(), the lowest
            possible allocated irq is now returned by arch_probe_nr_irqs().
      v4: - Fix incorrect access to irq_data structure in debugfs code
          - Don't ever allocate virq 0
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Milton Miller <miltonm@bga.com>
      Tested-by: Olof Johansson <olof@lixom.net>
      4bbdd45a
    • irq_domain/powerpc: Use common irq_domain structure instead of irq_host · bae1d8f1
      Grant Likely committed
      This patch drops the powerpc-specific irq_host structures and uses the common
      irq_domain structures defined in linux/irqdomain.h.  It also fixes all
      the users to use the new structure names.
      
      Renaming irq_host to irq_domain has been discussed for a long time, and this
      patch is a step in the process of generalizing the powerpc virq code to be
      usable by all architectures.
      
      An astute reader will notice that this patch actually removes the irq_host
      structure instead of renaming it.  This is because the irq_domain structure
      already exists in include/linux/irqdomain.h and has the needed data members.
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Milton Miller <miltonm@bga.com>
      Tested-by: Olof Johansson <olof@lixom.net>
      bae1d8f1
  3. 14 Feb, 2012 1 commit
  4. 08 Dec, 2011 1 commit
  5. 25 Nov, 2011 2 commits
    • powerpc/time: Optimise decrementer_check_overflow · 7df10275
      Anton Blanchard committed
      decrementer_check_overflow is called from arch_local_irq_restore so
      we want to make it as lightweight as possible. As such, turn
      decrementer_check_overflow into an inline function.
      
      To avoid a circular mess of includes, separate out the two components
      of struct decrementer_clock and keep the struct clock_event_device
      part local to time.c.
      
      The fast path improves from:
      
      arch_local_irq_restore
           0:       mflr    r0
           4:       std     r0,16(r1)
           8:       stdu    r1,-112(r1)
           c:       stb     r3,578(r13)
          10:       cmpdi   cr7,r3,0
          14:       beq-    cr7,24 <.arch_local_irq_restore+0x24>
      ...
          24:       addi    r1,r1,112
          28:       ld      r0,16(r1)
          2c:       mtlr    r0
          30:       blr
      
      to:
      
      arch_local_irq_restore
          0:       std     r30,-16(r1)
          4:       ld      r30,0(r2)
          8:       stb     r3,578(r13)
          c:       cmpdi   cr7,r3,0
         10:       beq-    cr7,6c <.arch_local_irq_restore+0x6c>
      ...
         6c:       ld      r30,-16(r1)
         70:       blr
      
      Unfortunately we still set up a local TOC (due to -mminimal-toc). Yet
      another sign we should be moving to -mcmodel=medium.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      7df10275
    • powerpc/time: Handle wrapping of decrementer · 37fb9a02
      Anton Blanchard committed
      When re-enabling interrupts we have code to handle edge sensitive
      decrementers by resetting the decrementer to 1 whenever it is negative.
      If interrupts were disabled long enough that the decrementer wrapped to
      positive we do nothing. This means interrupts can be delayed for a long
      time until it finally goes negative again.
      
      While we hope interrupts are never disabled long enough for the
      decrementer to go positive, we have a very good test team that can
      drive any kernel into the ground. The softlockup data we get back
      from these fails could be seconds in the future, completely missing
      the cause of the lockup.
      
      We already keep track of the timebase of the next event so use that
      to work out if we should trigger a decrementer exception.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@kernel.org
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      37fb9a02
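The wrap problem described in 37fb9a02 can be illustrated with a small userspace model. This is only a sketch, not the kernel code: it assumes a 32-bit decrementer that counts down once per timebase tick, and the helper names (`model_dec`, `dec_looks_expired`, `event_is_due`, `decrementer_wrap_demo`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model: value of a 32-bit decrementer `elapsed` ticks after it was loaded. */
static uint32_t model_dec(uint32_t loaded, uint64_t elapsed)
{
    return (uint32_t)(loaded - elapsed);   /* keeps the low 32 bits, so it wraps */
}

/* Sign-only check: misses the event once the counter wraps back to positive. */
static bool dec_looks_expired(uint32_t dec)
{
    return (int32_t)dec < 0;
}

/* Timebase check: compare the current timebase with the recorded deadline. */
static bool event_is_due(uint64_t now_tb, uint64_t next_event_tb)
{
    return now_tb >= next_event_tb;
}

/* Usage: an event programmed 1000 ticks ahead, observed at two later times. */
static void decrementer_wrap_demo(void)
{
    const uint64_t loaded_tb = 5000;
    const uint64_t deadline = loaded_tb + 1000;
    const uint64_t soon = 2000;                         /* just past the deadline */
    const uint64_t late = 1000 + 0x80000000ull + 5;     /* past the 2^31 wrap */

    /* Shortly after the deadline both checks agree. */
    assert(dec_looks_expired(model_dec(1000, soon)));
    assert(event_is_due(loaded_tb + soon, deadline));

    /* After the wrap only the timebase comparison still reports the event. */
    assert(!dec_looks_expired(model_dec(1000, late)));
    assert(event_is_due(loaded_tb + late, deadline));
}
```

The design point is that the recorded deadline is a 64-bit timebase value, so the comparison does not depend on the sign of a narrower counter.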
  6. 01 Nov, 2011 1 commit
  7. 22 Jul, 2011 1 commit
  8. 19 Jul, 2011 1 commit
  9. 29 Jun, 2011 1 commit
  10. 23 Jun, 2011 1 commit
  11. 26 May, 2011 5 commits
    • powerpc: Fix irq_free_virt by adjusting bounds before loop · 4dd60290
      Milton Miller committed
      Instead of looping over each irq and checking against the irq array
      bounds, adjust the bounds before looping.
      
      The old code will not free any irq if the irq + count is above
      irq_virq_count because the test in the loop is testing irq + count
      instead of irq + i.
      
      This code checks the limits to avoid unsigned integer overflows.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      4dd60290
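The idea in 4dd60290 of adjusting the bounds once, before the loop, can be sketched in userspace as follows. This is an illustration under stated assumptions rather than the kernel function: `VIRQ_COUNT_SKETCH`, `virq_in_use`, and `free_virq_range` are hypothetical stand-ins, and the handling of the legacy ISA range is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define VIRQ_COUNT_SKETCH 64u             /* stand-in for irq_virq_count */

static bool virq_in_use[VIRQ_COUNT_SKETCH];

/* Free [virq, virq + count) and return how many entries were released. */
static unsigned int free_virq_range(unsigned int virq, unsigned int count)
{
    unsigned int i, freed = 0;

    /* Nothing to do if the range starts beyond the table. */
    if (virq >= VIRQ_COUNT_SKETCH)
        return 0;

    /*
     * Clamp count against the remaining room. Comparing with
     * (limit - virq) avoids computing virq + count before it is known
     * to fit, so a huge count cannot cause unsigned wraparound.
     */
    if (count > VIRQ_COUNT_SKETCH - virq)
        count = VIRQ_COUNT_SKETCH - virq;
    assert(virq + count <= VIRQ_COUNT_SKETCH);

    /* The loop no longer needs a per-iteration bounds test. */
    for (i = virq; i < virq + count; i++) {
        if (virq_in_use[i]) {
            virq_in_use[i] = false;
            freed++;
        }
    }
    return freed;
}
```

With this shape, a request that runs past the end of the table still frees the in-range part, which is the behaviour the commit message says the old in-loop test failed to provide.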
    • powerpc/irq: Protect irq_radix_revmap_lookup against irq_free_virt · 9b788251
      Milton Miller committed
      The radix-tree code uses call_rcu when freeing internal elements.
      We must protect against the elements being freed while we traverse
      the tree, even if the returned pointer will still be valid.
      
      While preparing a patch to expand the context in which
      irq_radix_revmap_lookup will be called, I realized that the
      radix tree was not locked.
      
      When asked
      
          For a normal call_rcu usage, is it allowed to read the structure in
          irq_enter / irq_exit, without additional rcu_read_lock?  Could an
          element freed with call_rcu advance with the cpu still between
          irq_enter/irq_exit (and irq_disabled())?
      
      Paul McKenney replied:
      
          Absolutely illegal to do so. OK for call_rcu_sched(), but a
          flaming bug for call_rcu().
      
          And thank you very much for finding this!!!
      
      Further analysis:
      
      In the current CONFIG_TREE_RCU implementation. CONFIG_TREE_PREEMPT_RCU
      (and CONFIG_TINY_PREEMPT_RCU) uses explicit counters.
      
      These counters are reflected from per-CPU to global in the
      scheduling-clock-interrupt handler, so disabling irq does prevent the
      grace period from completing. But there are real-time implementations
      (such as the one used by the Concurrent guys) where disabling irq
      does -not- prevent the grace period from completing.
      
      While an alternative fix would be to switch radix-tree to rcu_sched, I
      don't want to audit the other users of radix trees (nor put alternative
      freeing in the library).  The normal overhead for rcu_read_lock and
      unlock are a local counter increment and decrement.
      
      This does not show up in the rcu lockdep because in 2.6.34 commit
      2676a58c (radix-tree: Disable RCU lockdep checking in radix tree)
      deemed it too hard to pass the condition of the protecting lock
      to the library.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      9b788251
    • powerpc/irq: Check desc in handle_one_irq and expand generic_handle_irq · 2e455257
      Milton Miller committed
      Look up the descriptor and check that it is found in handle_one_irq
      before checking if we are on the irq stack, and call the handler
      directly using the descriptor if we are on the stack.
      
      We need to check that irq_to_desc finds the descriptor to avoid a NULL
      pointer dereference.  It could have failed because the number from
      ppc_md.get_irq was above NR_IRQS, or various exceptional conditions
      with sparse irqs (e.g. race conditions while freeing an irq if it was
      not shut down in the controller).
      
      fe12bc2c (genirq: Uninline and sanity check generic_handle_irq())
      moved generic_handle_irq out of line to allow its use by interrupt
      controllers in modules.  However, handle_one_irq is core arch code.
      It already knows the details of struct irq_desc and handling irqs in
      the nested irq case.  This will avoid the extra stack frame to return
      the value we don't check.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2e455257
    • powerpc/irq: Always free duplicate IRQ_LEGACY hosts · 3d1b5e20
      Milton Miller committed
      Since kmem caches are allocated before init_IRQ as noted in 3af259d1
      (powerpc: Radix trees are available before init_IRQ), we now call
      kmalloc in all cases and can always call kfree if we are asked
      to allocate a duplicate or conflicting IRQ_HOST_MAP_LEGACY host.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      3d1b5e20
    • powerpc/irq: Remove stale and misleading comment · 8142f032
      Milton Miller committed
      The comment claims we will call host->ops->map() to update the flags if
      we find a previously established mapping, but we never did.  We used
      to call remap, but that call was removed in da051980 (powerpc: Remove
      irq_host_ops->remap hook).
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      8142f032
  12. 19 May, 2011 7 commits
    • powerpc: Make IRQ_NOREQUEST last to clear, first to set · 41fb5e62
      Milton Miller committed
      When creating an irq, don't allow a concurrent driver request until
      we have called map, which will likely call set_chip_and_handler to
      change the irq_chip and its operations.
      
      Similarly, when tearing down an IRQ, make sure no new uses come
      along while we change the irq back to the nop chip and then reset
      the descriptor to freed status.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      41fb5e62
    • powerpc: Remove virq_to_host · 1e8c2301
      Milton Miller committed
      The only references to the irq_map[].host field are internal to
      arch/powerpc/kernel/irq.c
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      1e8c2301
    • powerpc: Add virq_is_host to reduce virq_to_host usage · 3ee62d36
      Milton Miller committed
      Some irq_host implementations are using virq_to_host to check if
      they are the irq_host for a virtual irq.  To allow us to make space
      versus time tradeoffs, replace this usage with an assertive
      virq_is_host that confirms or denies the irq is associated with the
      given irq_host.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      3ee62d36
    • powerpc: Remove irq_host_ops->remap hook · da051980
      Milton Miller committed
      It was called from irq_create_mapping if that was called for a host
      and hwirq that was previously mapped, "to update the flags".  But the
      only implementation was in beat_interrupt and all it did was repeat a
      hypervisor call without error checking that was performed with error
      checking at the beginning of the map hook.  In addition, the comment on
      the beat remap hook says it will only be called once for a given mapping,
      which would apply to map not remap.
      
      All flags should be known by the time the match hook is called, before
      we call the map hook.  Removing this mostly unused hook will simplify
      the requirements of the irq_domain concept.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      da051980
    • powerpc: Return early if irq_host lookup type is wrong · 2d441681
      Milton Miller committed
      If for some reason the code incorrectly calls the wrong function to
      manage the revmap, not only should we warn, we should take action.
      However, in the paths we expect to be taken on every delivered
      interrupt, change to WARN_ON_ONCE.  Use the if (WARN_ON(x)) format to
      get the unlikely for free.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2d441681
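The "warn, then return early" shape that 2d441681 describes can be sketched outside the kernel like this. Everything here is a hypothetical stand-in: `WARN_ON_SKETCH` only imitates the kernel macro's habit of returning the tested condition, and the `host_sketch` type and `linear_revmap_sketch` function are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum revmap_type_sketch { REVMAP_LINEAR_SKETCH, REVMAP_TREE_SKETCH };

struct host_sketch {
    enum revmap_type_sketch revmap_type;
    unsigned int linear[8];               /* hwirq -> virq, 0 means unmapped */
};

static unsigned int warnings_printed;

/* Report the failed expectation and hand the condition back to the caller. */
static bool warn_on_sketch(bool cond, const char *expr)
{
    if (cond) {
        warnings_printed++;
        fprintf(stderr, "WARNING: %s\n", expr);
    }
    return cond;
}
#define WARN_ON_SKETCH(x) warn_on_sketch((x), #x)

/* Look up a virq, refusing to touch the table if the host type is wrong. */
static unsigned int linear_revmap_sketch(const struct host_sketch *host,
                                         unsigned int hwirq)
{
    assert(host != NULL);

    /* Warn and act in a single statement: bail out instead of continuing. */
    if (WARN_ON_SKETCH(host->revmap_type != REVMAP_LINEAR_SKETCH))
        return 0;
    if (hwirq >= 8)
        return 0;
    return host->linear[hwirq];
}
```

Because the warning helper returns the condition, the check and the early return share one `if`, which is the format the commit message recommends.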
    • powerpc: Radix trees are available before init_IRQ · 3af259d1
      Milton Miller committed
      Since the generic irq code uses a radix tree for sparse interrupts,
      the initcall ordering has been changed to initialize radix trees before
      irqs.   We no longer need to defer creating revmap radix trees to the
      arch_initcall irq_late_init.
      
      Also, the kmem caches are allocated so we don't need to use
      zalloc_maybe_bootmem.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      3af259d1
    • powerpc: Consolidate ipi message mux and demux · 23d72bfd
      Milton Miller committed
      Consolidate the mux and demux of ipi messages into smp.c and call
      a new smp_ops callback to actually trigger the ipi.
      
      The powerpc architecture code is optimised for having 4 distinct
      ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
      single, scheduler ipi, and enter debugger).  However, several interrupt
      controllers only provide a single software triggered interrupt that
      can be delivered to each cpu.  To resolve this limitation, each smp_ops
      implementation created a per-cpu variable that is manipulated with atomic
      bitops.  Since these lines will be contended they are optimally marked as
      shared_aligned and take a full cache line for each cpu.  Distro kernels
      may have 2 or 3 of these in their config, each taking per-cpu space
      even though at most one will be in use.
      
      This consolidation removes smp_message_recv and replaces the single call
      actions cases with direct calls from the common message recognition loop.
      The complicated debugger ipi case with its muxed crash handling code is
      moved to debug_ipi_action which is now called from the demux code (instead
      of the multi-message action calling smp_message_recv).
      
      I put a call to reschedule_action to increase the likelihood of correctly
      merging the anticipated scheduler_ipi() hook coming from the scheduler
      tree; that single required call can be inlined later.
      
      The actual message decode is a copy of the old pseries xics code with its
      memory barriers and cache line spacing, augmented with a per-cpu unsigned
      long based on the book-e doorbell code.  The optional data is set via a
      callback from the implementation and is passed to the new cause-ipi hook
      along with the logical cpu number.  While currently only the doorbell
      implementation uses this data it should be almost zero cost to retrieve and
      pass it -- it adds a single register load for the argument from the same
      cache line to which we just completed a store and the register is dead
      on return from the call.  I extended the data element from unsigned int
      to unsigned long in case some other code wanted to associate a pointer.
      
      The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
      conditioned on the CPU_DBELL feature.  The ifdef guard could be relaxed
      to CONFIG_SMP but I left it with BOOKE for now.
      
      Also, the doorbell interrupt vector for book-e was not calling irq_enter
      and irq_exit, which throws off cpu accounting and causes code to not
      realize it is running in interrupt context.  Add the missing calls.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      23d72bfd
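The mux/demux scheme in 23d72bfd, where several logical messages share one hardware IPI through a per-cpu word updated with atomic bit operations, can be approximated with C11 atomics. This is a userspace sketch under assumptions: the message names, `NR_CPUS_SKETCH`, `cause_ipi_sketch`, and the counters are hypothetical, and the real code's cache-line alignment and explicit memory barriers are not modelled.

```c
#include <assert.h>
#include <stdatomic.h>

/* Four logical messages multiplexed onto one IPI (names are illustrative). */
enum ipi_msg_sketch {
    MSG_CALL_FUNCTION,
    MSG_RESCHEDULE,
    MSG_CALL_FUNC_SINGLE,
    MSG_DEBUGGER_BREAK,
    MSG_COUNT
};

#define NR_CPUS_SKETCH 4

static atomic_ulong pending[NR_CPUS_SKETCH];           /* one message word per cpu */
static unsigned int handled[NR_CPUS_SKETCH][MSG_COUNT];
static unsigned int ipis_raised[NR_CPUS_SKETCH];

/* Stand-in for the platform callback that triggers the single hardware IPI. */
static void cause_ipi_sketch(int cpu)
{
    ipis_raised[cpu]++;
}

/* Mux: record the message bit, then raise the shared IPI on the target. */
static void muxed_send(int cpu, enum ipi_msg_sketch msg)
{
    assert(msg < MSG_COUNT);
    atomic_fetch_or(&pending[cpu], 1UL << msg);
    cause_ipi_sketch(cpu);
}

/* Demux: atomically take all pending bits and dispatch each one. */
static void muxed_demux(int cpu)
{
    unsigned long all;

    while ((all = atomic_exchange(&pending[cpu], 0UL)) != 0) {
        for (int msg = 0; msg < MSG_COUNT; msg++) {
            if (all & (1UL << msg))
                handled[cpu][msg]++;       /* a real handler would run here */
        }
    }
}
```

Taking the whole word with one exchange means messages posted while the handler runs are picked up by the next loop iteration rather than lost.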
  13. 04 May, 2011 1 commit
  14. 27 Apr, 2011 2 commits
  15. 01 Apr, 2011 1 commit
  16. 29 Mar, 2011 4 commits
  17. 10 Mar, 2011 1 commit
  18. 02 Mar, 2011 2 commits
  19. 13 Oct, 2010 2 commits
  20. 07 Oct, 2010 1 commit
    • Fix IRQ flag handling naming · df9ee292
      David Howells committed
      Fix the IRQ flag handling naming.  In linux/irqflags.h under one configuration,
      it maps:
      
      	local_irq_enable() -> raw_local_irq_enable()
      	local_irq_disable() -> raw_local_irq_disable()
      	local_irq_save() -> raw_local_irq_save()
      	...
      
      and under the other configuration, it maps:
      
      	raw_local_irq_enable() -> local_irq_enable()
      	raw_local_irq_disable() -> local_irq_disable()
      	raw_local_irq_save() -> local_irq_save()
      	...
      
      This is quite confusing.  There should be one set of names expected of the
      arch, and this should be wrapped to give another set of names that are expected
      by users of this facility.
      
      Change this to have the arch provide:
      
      	flags = arch_local_save_flags()
      	flags = arch_local_irq_save()
      	arch_local_irq_restore(flags)
      	arch_local_irq_disable()
      	arch_local_irq_enable()
      	arch_irqs_disabled_flags(flags)
      	arch_irqs_disabled()
      	arch_safe_halt()
      
      Then linux/irqflags.h wraps these to provide:
      
      	raw_local_save_flags(flags)
      	raw_local_irq_save(flags)
      	raw_local_irq_restore(flags)
      	raw_local_irq_disable()
      	raw_local_irq_enable()
      	raw_irqs_disabled_flags(flags)
      	raw_irqs_disabled()
      	raw_safe_halt()
      
      with type checking on the flags 'arguments', and then wraps those to provide:
      
      	local_save_flags(flags)
      	local_irq_save(flags)
      	local_irq_restore(flags)
      	local_irq_disable()
      	local_irq_enable()
      	irqs_disabled_flags(flags)
      	irqs_disabled()
      	safe_halt()
      
      with tracing included if enabled.
      
      The arch functions can now all be inline functions rather than some of them
      having to be macros.
      
      Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300]
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile]
      Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze]
      Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM]
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR]
      Acked-by: Tony Luck <tony.luck@intel.com> [IA-64]
      Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R]
      Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS]
      Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC]
      Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC]
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390]
      Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score]
      Acked-by: Matt Fleming <matt@console-pimps.org> [SH]
      Acked-by: David S. Miller <davem@davemloft.net> [Sparc]
      Acked-by: Chris Zankel <chris@zankel.net> [Xtensa]
      Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha]
      Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300]
      Cc: starvik@axis.com [CRIS]
      Cc: jesper.nilsson@axis.com [CRIS]
      Cc: linux-cris-kernel@axis.com
      df9ee292
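The three layers named in df9ee292 (arch_* inline functions, raw_* wrappers with type checking, and unprefixed wrappers with tracing) can be mocked up in userspace to show how they stack. The function and macro names follow the commit message, but every body below is an assumption made for illustration: `fake_irq_state`, `trace_events`, and `typecheck_ulong` are hypothetical, and only the save/restore pair is shown.

```c
#include <assert.h>

static unsigned long fake_irq_state = 1;   /* 1 = enabled; stand-in for hardware */
static unsigned int trace_events;          /* stand-in for irq tracing hooks */

/* Layer 1: what the architecture provides, as plain inline functions. */
static inline unsigned long arch_local_save_flags(void)
{
    return fake_irq_state;
}

static inline void arch_local_irq_disable(void)
{
    fake_irq_state = 0;
}

static inline unsigned long arch_local_irq_save(void)
{
    unsigned long flags = arch_local_save_flags();

    arch_local_irq_disable();
    return flags;
}

static inline void arch_local_irq_restore(unsigned long flags)
{
    assert(flags == 0 || flags == 1);      /* this model only has two states */
    fake_irq_state = flags;
}

/* Compile-time check that the caller's variable is an unsigned long. */
#define typecheck_ulong(x) ((void)sizeof(&(x) == (unsigned long *)0))

/* Layer 2: raw_* wrappers add type checking on the flags argument. */
#define raw_local_irq_save(flags) \
    do { typecheck_ulong(flags); (flags) = arch_local_irq_save(); } while (0)
#define raw_local_irq_restore(flags) \
    do { typecheck_ulong(flags); arch_local_irq_restore(flags); } while (0)

/* Layer 3: the names used by callers, with tracing added on top. */
#define local_irq_save(flags) \
    do { raw_local_irq_save(flags); trace_events++; } while (0)
#define local_irq_restore(flags) \
    do { trace_events++; raw_local_irq_restore(flags); } while (0)
```

Passing an `int` instead of an `unsigned long` to `local_irq_save()` in this sketch draws a compiler diagnostic about mismatched pointer types, which is the kind of check the raw_* layer is described as adding.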
  21. 23 Aug, 2010 1 commit