1. 30 4月, 2007 1 次提交
  2. 05 3月, 2007 1 次提交
    • E
      [PATCH] msi: sanely support hardware level msi disabling · f5f2b131
      Eric W. Biederman 提交于
      In some cases when we are not using msi we need a way to ensure that the
      hardware does not have an msi capability enabled.  Currently the code has been
      calling disable_msi_mode to try and achieve that.  However disable_msi_mode
      has several other side effects and is only available when msi support is
      compiled in so it isn't really appropriate.
      
      Instead this patch implements pci_msi_off which disables all msi and msix
      capabilities unconditionally with no additional side effects.
      
      pci_disable_device was redundantly clearing the bus master enable flag and
      clearing the msi enable bit.  A device that is not allowed to perform bus
      mastering operations cannot generate intx or msi interrupt messages as those
      are essentially a special case of dma, and require bus mastering.  So the call
      in pci_disable_device to disable msi capabilities was redundant.
      
      quirk_pcie_pxh also called disable_msi_mode and is updated to use pci_msi_off.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f5f2b131
  3. 08 2月, 2007 1 次提交
  4. 24 1月, 2007 1 次提交
  5. 04 12月, 2006 2 次提交
  6. 25 10月, 2006 1 次提交
  7. 16 10月, 2006 1 次提交
    • P
      [POWERPC] Lazy interrupt disabling for 64-bit machines · d04c56f7
      Paul Mackerras 提交于
      This implements a lazy strategy for disabling interrupts.  This means
      that local_irq_disable() et al. just clear the 'interrupts are
      enabled' flag in the paca.  If an interrupt comes along, the interrupt
      entry code notices that interrupts are supposed to be disabled, and
      clears the EE bit in SRR1, clears the 'interrupts are hard-enabled'
      flag in the paca, and returns.  This means that interrupts only
      actually get disabled in the processor when an interrupt comes along.
      
      When interrupts are enabled by local_irq_enable() et al., the code
      sets the interrupts-enabled flag in the paca, and then checks whether
      interrupts got hard-disabled.  If so, it also sets the EE bit in the
      MSR to hard-enable the interrupts.
      
      This has the potential to improve performance, and also makes it
      easier to make a kernel that can boot on iSeries and on other 64-bit
      machines, since this lazy-disable strategy is very similar to the
      soft-disable strategy that iSeries already uses.
      
      This version renames paca->proc_enabled to paca->soft_enabled, and
      changes a couple of soft-disables in the kexec code to hard-disables,
      which should fix the crash that Michael Ellerman saw.  This doesn't
      yet use a reserved CR field for the soft_enabled and hard_enabled
      flags.  This applies on top of Stephen Rothwell's patches to make it
      possible to build a combined iSeries/other kernel.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      d04c56f7
  8. 10 10月, 2006 1 次提交
  9. 07 10月, 2006 1 次提交
  10. 05 10月, 2006 1 次提交
    • D
      IRQ: Maintain regs pointer globally rather than passing to IRQ handlers · 7d12e780
      David Howells 提交于
      Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
      of passing regs around manually through all ~1800 interrupt handlers in the
      Linux kernel.
      
      The regs pointer is used in few places, but it potentially costs both stack
      space and code to pass it around.  On the FRV arch, removing the regs parameter
      from all the genirq function results in a 20% speed up of the IRQ exit path
      (ie: from leaving timer_interrupt() to leaving do_IRQ()).
      
      Where appropriate, an arch may override the generic storage facility and do
      something different with the variable.  On FRV, for instance, the address is
      maintained in GR28 at all times inside the kernel as part of general exception
      handling.
      
      Having looked over the code, it appears that the parameter may be handed down
      through up to twenty or so layers of functions.  Consider a USB character
      device attached to a USB hub, attached to a USB controller that posts its
      interrupts through a cascaded auxiliary interrupt controller.  A character
      device driver may want to pass regs to the sysrq handler through the input
      layer which adds another few layers of parameter passing.
      
      I've build this code with allyesconfig for x86_64 and i386.  I've runtested the
      main part of the code on FRV and i386, though I can't test most of the drivers.
      I've also done partial conversion for powerpc and MIPS - these at least compile
      with minimal configurations.
      
      This will affect all archs.  Mostly the changes should be relatively easy.
      Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
      
      	struct pt_regs *old_regs = set_irq_regs(regs);
      
      And put the old one back at the end:
      
      	set_irq_regs(old_regs);
      
      Don't pass regs through to generic_handle_irq() or __do_IRQ().
      
      In timer_interrupt(), this sort of change will be necessary:
      
      	-	update_process_times(user_mode(regs));
      	-	profile_tick(CPU_PROFILING, regs);
      	+	update_process_times(user_mode(get_irq_regs()));
      	+	profile_tick(CPU_PROFILING);
      
      I'd like to move update_process_times()'s use of get_irq_regs() into itself,
      except that i386, alone of the archs, uses something other than user_mode().
      
      Some notes on the interrupt handling in the drivers:
      
       (*) input_dev() is now gone entirely.  The regs pointer is no longer stored in
           the input_dev struct.
      
       (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking.  It does
           something different depending on whether it's been supplied with a regs
           pointer or not.
      
       (*) Various IRQ handler function pointers have been moved to type
           irq_handler_t.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
      7d12e780
  11. 26 9月, 2006 1 次提交
  12. 30 8月, 2006 1 次提交
  13. 17 8月, 2006 1 次提交
  14. 08 8月, 2006 1 次提交
  15. 11 7月, 2006 1 次提交
    • B
      [PATCH] powerpc: fix trigger handling in the new irq code · 6e99e458
      Benjamin Herrenschmidt 提交于
      This patch slightly reworks the new irq code to fix a small design error.  I
      removed the passing of the trigger to the map() calls entirely, it was not a
      good idea to have one call do two different things.  It also fixes a couple of
      corner cases.
      
      Mapping a linux virtual irq to a physical irq now does only that.  Setting the
      trigger is a different action which has a different call.
      
      The main changes are:
      
      - I no longer call host->ops->map() for an already mapped irq, I just return
        the virtual number that was already mapped.  It was called before to give an
        opportunity to change the trigger, but that was causing issues as that could
        happen while the interrupt was in use by a device, and because of the
        trigger change, map would potentially muck around with things in a racy way.
         That was causing much burden on a given's controller implementation of
        map() to get it right.  This is much simpler now.  map() is only called on
        the initial mapping of an irq, meaning that you know that this irq is _not_
        being used.  You can initialize the hardware if you want (though you don't
        have to).
      
      - Controllers that can handle different type of triggers (level/edge/etc...)
        now implement the standard irq_chip->set_type() call as defined by the
        generic code.  That means that you can use the standard set_irq_type() to
        configure an irq line manually if you wish or (though I don't like that
        interface), pass explicit trigger flags to request_irq() as defined by the
        generic kernel interfaces.  Also, using those interfaces guarantees that
        your controller set_type callback is called with the descriptor lock held,
        thus providing locking against activity on the same interrupt (including
        mask/unmask/etc...) automatically.  A result is that, for example, MPIC's
        own map() implementation calls irq_set_type(NONE) to configure the hardware
        to the default triggers.
      
      - To allow the above, the irq_map array entry for the new mapped interrupt
        is now set before map() callback is called for the controller.
      
      - The irq_create_of_mapping() (also used by irq_of_parse_and_map()) function
        for mapping interrupts from the device-tree now also call the separate
        set_irq_type(), and only does so if there is a change in the trigger type.
      
      - While I was at it, I changed pci_read_irq_line() (which is the helper I
        would expect most archs to use in their pcibios_fixup() to get the PCI
        interrupt routing from the device tree) to also handle a fallback when the
        DT mapping fails consisting of reading the PCI_INTERRUPT_PIN to know wether
        the device has an interrupt at all, and the the PCI_INTERRUPT_LINE to get an
        interrupt number from the device.  That number is then mapped using the
        default controller, and the trigger is set to level low.  That default
        behaviour works for several platforms that don't have a proper interrupt
        tree like Pegasos.  If it doesn't work for your platform, then either
        provide a proper interrupt tree from the firmware so that fallback isn't
        needed, or don't call pci_read_irq_line()
      
      - Add back a bit that got dropped by my main rework patch for properly
        clearing pending IPIs on pSeries when using a kexec
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6e99e458
  16. 04 7月, 2006 2 次提交
  17. 03 7月, 2006 2 次提交
    • B
      [POWERPC] Add new interrupt mapping core and change platforms to use it · 0ebfff14
      Benjamin Herrenschmidt 提交于
      This adds the new irq remapper core and removes the old one.  Because
      there are some fundamental conflicts with the old code, like the value
      of NO_IRQ which I'm now setting to 0 (as per discussions with Linus),
      etc..., this commit also changes the relevant platform and driver code
      over to use the new remapper (so as not to cause difficulties later
      in bisecting).
      
      This patch removes the old pre-parsing of the open firmware interrupt
      tree along with all the bogus assumptions it made to try to renumber
      interrupts according to the platform. This is all to be handled by the
      new code now.
      
      For the pSeries XICS interrupt controller, a single remapper host is
      created for the whole machine regardless of how many interrupt
      presentation and source controllers are found, and it's set to match
      any device node that isn't a 8259.  That works fine on pSeries and
      avoids having to deal with some of the complexities of split source
      controllers vs. presentation controllers in the pSeries device trees.
      
      The powerpc i8259 PIC driver now always requests the legacy interrupt
      range. It also has the feature of being able to match any device node
      (including NULL) if passed no device node as an input. That will help
      porting over platforms with broken device-trees like Pegasos who don't
      have a proper interrupt tree.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      0ebfff14
    • B
      [POWERPC] Use the genirq framework · b9e5b4e6
      Benjamin Herrenschmidt 提交于
      This adapts the generic powerpc interrupt handling code, and all of
      the platforms except for the embedded 6xx machines, to use the new
      genirq framework.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      b9e5b4e6
  18. 01 7月, 2006 1 次提交
  19. 30 6月, 2006 2 次提交
    • I
      [PATCH] genirq: cleanup: merge irq_affinity[] into irq_desc[] · a53da52f
      Ingo Molnar 提交于
      Consolidation: remove the irq_affinity[NR_IRQS] array and move it into the
      irq_desc[NR_IRQS].affinity field.
      
      [akpm@osdl.org: sparc64 build fix]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a53da52f
    • I
      [PATCH] genirq: rename desc->handler to desc->chip · d1bef4ed
      Ingo Molnar 提交于
      This patch-queue improves the generic IRQ layer to be truly generic, by adding
      various abstractions and features to it, without impacting existing
      functionality.
      
      While the queue can be best described as "fix and improve everything in the
      generic IRQ layer that we could think of", and thus it consists of many
      smaller features and lots of cleanups, the one feature that stands out most is
      the new 'irq chip' abstraction.
      
      The irq-chip abstraction is about describing and coding and IRQ controller
      driver by mapping its raw hardware capabilities [and quirks, if needed] in a
      straightforward way, without having to think about "IRQ flow"
      (level/edge/etc.) type of details.
      
      This stands in contrast with the current 'irq-type' model of genirq
      architectures, which 'mixes' raw hardware capabilities with 'flow' details.
      The patchset supports both types of irq controller designs at once, and
      converts i386 and x86_64 to the new irq-chip design.
      
      As a bonus side-effect of the irq-chip approach, chained interrupt controllers
      (master/slave PIC constructs, etc.) are now supported by design as well.
      
      The end result of this patchset intends to be simpler architecture-level code
      and more consolidation between architectures.
      
      We reused many bits of code and many concepts from Russell King's ARM IRQ
      layer, the merging of which was one of the motivations for this patchset.
      
      This patch:
      
      rename desc->handler to desc->chip.
      
      Originally i did not want to do this, because it's a big patch.  But having
      both "desc->handler", "desc->handle_irq" and "action->handler" caused a
      large degree of confusion and made the code appear alot less clean than it
      truly is.
      
      I have also attempted a dual approach as well by introducing a
      desc->chip alias - but that just wasnt robust enough and broke
      frequently.
      
      So lets get over with this quickly.  The conversion was done automatically
      via scripts and converts all the code in the kernel.
      
      This renaming patch is the first one amongst the patches, so that the
      remaining patches can stay flexible and can be merged and split up
      without having some big monolithic patch act as a merge barrier.
      
      [akpm@osdl.org: build fix]
      [akpm@osdl.org: another build fix]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d1bef4ed
  20. 23 6月, 2006 1 次提交
  21. 15 6月, 2006 1 次提交
    • J
      [POWERPC] MSI abstraction · 204face4
      Jake Moilanen 提交于
      Instead of trying to make PPC64 MSI fit in a Intel-centric MSI layer, a
      simple short-term solution is to hook the pci_{en/dis}able_msi() calls
      and make a machdep call.
      
      The rest of the MSI functions are superfluous for what is needed at this
      time.  Many of which can have machdep calls added as needed.
      
      Ben and Michael Ellerman are looking into rewrite the MSI layer to be
      more generic.  However, in the meantime this works as a interim
      solution.
      Signed-off-by: NJake Moilanen <moilanen@austin.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      204face4
  22. 04 4月, 2006 1 次提交
  23. 29 3月, 2006 1 次提交
  24. 23 3月, 2006 1 次提交
    • A
      [PATCH] more for_each_cpu() conversions · 394e3902
      Andrew Morton 提交于
      When we stop allocating percpu memory for not-possible CPUs we must not touch
      the percpu data for not-possible CPUs at all.  The correct way of doing this
      is to test cpu_possible() or to use for_each_cpu().
      
      This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
      few instances of this bug, if any.  But the patch converts lots of open-coded
      test to use the preferred helper macros.
      
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Acked-by: NKyle McMartin <kyle@parisc-linux.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Christian Zankel <chris@zankel.net>
      Cc: Philippe Elie <phil.el@wanadoo.fr>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Jens Axboe <axboe@suse.de>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      394e3902
  25. 24 2月, 2006 1 次提交
    • P
      powerpc: Implement accurate task and CPU time accounting · c6622f63
      Paul Mackerras 提交于
      This implements accurate task and cpu time accounting for 64-bit
      powerpc kernels.  Instead of accounting a whole jiffy of time to a
      task on a timer interrupt because that task happened to be running at
      the time, we now account time in units of timebase ticks according to
      the actual time spent by the task in user mode and kernel mode.  We
      also count the time spent processing hardware and software interrupts
      accurately.  This is conditional on CONFIG_VIRT_CPU_ACCOUNTING.  If
      that is not set, we do tick-based approximate accounting as before.
      
      To get this accurate information, we read either the PURR (processor
      utilization of resources register) on POWER5 machines, or the timebase
      on other machines on
      
      * each entry to the kernel from usermode
      * each exit to usermode
      * transitions between process context, hard irq context and soft irq
        context in kernel mode
      * context switches.
      
      On POWER5 systems with shared-processor logical partitioning we also
      read both the PURR and the timebase at each timer interrupt and
      context switch in order to determine how much time has been taken by
      the hypervisor to run other partitions ("steal" time).  Unfortunately,
      since we need values of the PURR on both threads at the same time to
      accurately calculate the steal time, and since we can only calculate
      steal time on a per-core basis, the apportioning of the steal time
      between idle time (time which we ceded to the hypervisor in the idle
      loop) and actual stolen time is somewhat approximate at the moment.
      
      This is all based quite heavily on what s390 does, and it uses the
      generic interfaces that were added by the s390 developers,
      i.e. account_system_time(), account_user_time(), etc.
      
      This patch doesn't add any new interfaces between the kernel and
      userspace, and doesn't change the units in which time is reported to
      userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(),
      times(), etc.  Internally the various task and cpu times are stored in
      timebase units, but they are converted to USER_HZ units (1/100th of a
      second) when reported to userspace.  Some precision is therefore lost
      but there should not be any accumulating error, since the internal
      accumulation is at full precision.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      c6622f63
  26. 10 2月, 2006 1 次提交
  27. 13 1月, 2006 1 次提交
    • D
      [PATCH] powerpc: Remove lppaca structure from the PACA · 3356bb9f
      David Gibson 提交于
      At present the lppaca - the structure shared with the iSeries
      hypervisor and phyp - is contained within the PACA, our own low-level
      per-cpu structure.  This doesn't have to be so, the patch below
      removes it, making a separate array of lppaca structures.
      
      This saves approximately 500*NR_CPUS bytes of image size and kernel
      memory, because we don't need aligning gap between the Linux and
      hypervisor portions of every PACA.  On the other hand it means an
      extra level of dereference in many accesses to the lppaca.
      
      The patch also gets rid of several places where we assign the paca
      address to a local variable for no particular reason.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      3356bb9f
  28. 09 1月, 2006 2 次提交
  29. 14 11月, 2005 2 次提交
  30. 09 11月, 2005 3 次提交
  31. 02 11月, 2005 1 次提交
  32. 01 11月, 2005 1 次提交