1. 03 May, 2007 11 commits
  2. 04 Apr, 2007 1 commit
    • [PATCH] msi: synchronously mask and unmask msi-x irqs. · 348e3fd1
      Committed by Eric W. Biederman
      This is a simplified and actually more comprehensive form of a bug
      fix from Mitch Williams <mitch.a.williams@intel.com>.
      
      When we mask or unmask an msi-x irq the writes may be posted because
      we are writing to a memory mapped region.  This means the mask and
      unmask don't happen immediately but at some unspecified time in the
      future, which is out of sync with how the mask/unmask logic works
      for ioapic irqs.
      
      The practical result is that we get very subtle and hard to track down
      irq migration bugs.
      
      This patch performs a read flush after writes to the MSI-X table for mask
      and unmask operations.  Since the SMP affinity is set while the interrupt
      is masked, and since it's unmasked immediately after, no additional flushes
      are required in the various affinity setting routines.
      
      The testing by Mitch Williams on his especially problematic system should
      still be valid as I have only simplified the code, not changed the
      functionality.
      
      We currently have 7 drivers: cciss, mthca, cxgb3, forcedeth, s2io,
      pcie/portdrv_core, and qla2xxx in 2.6.21 that are affected by this
      problem when the hardware they drive is plugged into the right slot.
      
      Given the difficulty of reproducing this bug and tracing it down to
      anything that even remotely resembles a cause, even if people are
      being affected we aren't likely to see many meaningful bug reports, and
      the people who see this bug aren't likely to be able to reproduce this
      bug in a timely fashion.  So it is best to get this problem fixed
      as soon as we can so people don't have problems.
      
      Then if people do have a kernel message stating "No irq for vector" we
      will know it is yet another novel cause that needs a complete new
      investigation.
      
      Cc: Greg KH <greg@kroah.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Mitch Williams <mitch.a.williams@intel.com>
      Acked-by: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
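The read flush described in this commit can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `fake_writel`, `fake_readl`, `fake_msix_set_mask` and the `fake_msix_entry` structure are hypothetical names that imitate a bridge which posts writes and drains them when a read arrives. The one detail taken from the PCI specification is that bit 0 of an MSI-X vector control word is the mask bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one MSI-X table entry behind a posting bridge:
 * a write is queued and only reaches the device when a read drains it. */
struct fake_msix_entry {
    uint32_t vector_ctrl;   /* value the device actually sees */
    uint32_t posted_value;  /* write still sitting in the bridge */
    int      has_posted;    /* nonzero while a write is in flight */
};

#define FAKE_MSIX_CTRL_MASKBIT 0x1u  /* bit 0 of vector control masks the entry */

/* Model of a posted MMIO write: returns before the device sees the value. */
static void fake_writel(struct fake_msix_entry *e, uint32_t val)
{
    e->posted_value = val;
    e->has_posted = 1;
}

/* Model of an MMIO read: a read may not pass earlier writes, so any
 * pending write is delivered to the device first. */
static uint32_t fake_readl(struct fake_msix_entry *e)
{
    if (e->has_posted) {
        e->vector_ctrl = e->posted_value;
        e->has_posted = 0;
    }
    return e->vector_ctrl;
}

/* Mask or unmask, and do not return until the device has seen the change. */
static void fake_msix_set_mask(struct fake_msix_entry *e, int mask)
{
    assert(e != 0);
    fake_writel(e, mask ? FAKE_MSIX_CTRL_MASKBIT : 0u);
    (void)fake_readl(e);    /* read back to flush the posted write */
}

/* Inspect what the device currently believes, without touching the bus. */
static int fake_device_is_masked(const struct fake_msix_entry *e)
{
    return (e->vector_ctrl & FAKE_MSIX_CTRL_MASKBIT) != 0;
}
```

In this model a bare `fake_writel` leaves the device unmasked until something reads from it, which is the window the commit message describes; `fake_msix_set_mask` closes that window by reading back immediately.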
  3. 13 Mar, 2007 1 commit
    • [PATCH] msi: Safer state caching. · 392ee1e6
      Committed by Eric W. Biederman
      There are two ways pci_save_state and pci_restore_state are used: as
      helper functions during suspend/resume, and as helper functions around
      a hardware reset event.  When used as helper functions around a hardware
      reset event there is no reason to believe the calls will be paired, nor
      is there a good reason to believe that if we restore the msi state from
      before the reset that it will match the current msi state.  Since arch
      code may change the msi message without going through the driver, drivers
      currently do not have enough information to even know when to call
      pci_save_state to ensure they will have msi state in sync with the other
      kernel irq reception data structures.
      
      It turns out the solution is straightforward: cache the state in the
      existing msi data structures (not the magic pci saved things) and
      have the msi code update the cached state each time we write to the hardware.
      This means we never need to read the hardware to figure out what the hardware
      state should be.
      
      By modifying the caching in this manner we get to remove our save_state
      routines and only need to provide restore_state routines.
      
      The only fields that were at all tricky to regenerate were the msi and msi-x
      control registers, and the way we regenerate them currently is a bit dependent
      upon assumptions about how we allow the msi registers to be configured and used,
      making the code a little bit brittle.  If we ever change what cases we allow
      or how we configure the msi bits we can address the fragility then.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Acked-by: Auke Kok <auke-jan.h.kok@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
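The caching scheme described in this commit might look roughly like the following user-space sketch. All names here (`sketch_msg`, `sketch_hw`, `sketch_desc`, `sketch_write_msg`, `sketch_restore`) are made up for illustration and are not the kernel's; the point is only that every hardware write also updates the cached copy, so a restore never has to read the device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative message: two address words and one data word. */
struct sketch_msg {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/* Stand-in for the device registers that hold the message. */
struct sketch_hw {
    struct sketch_msg regs;
};

/* Per-interrupt descriptor that keeps the last message written. */
struct sketch_desc {
    struct sketch_hw *hw;
    struct sketch_msg cached;
};

/* Every write goes through here, so the cache can never go stale. */
static void sketch_write_msg(struct sketch_desc *d, const struct sketch_msg *m)
{
    assert(d != NULL && d->hw != NULL);
    d->hw->regs = *m;
    d->cached = *m;
}

/* Model of a hardware reset wiping the device's message registers. */
static void sketch_hw_reset(struct sketch_hw *hw)
{
    memset(hw, 0, sizeof(*hw));
}

/* Restore needs no prior save call: it replays the cached message. */
static void sketch_restore(struct sketch_desc *d)
{
    assert(d != NULL && d->hw != NULL);
    d->hw->regs = d->cached;
}
```

Because the cache is updated on the write path, code that retargets the interrupt without involving the driver still leaves the descriptor in sync, which is the case the commit message says a paired save/restore could not cover.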
  4. 05 Mar, 2007 3 commits
    • [PATCH] msi: support masking msi irqs without a mask bit · 58e0543e
      Committed by Eric W. Biederman
      For devices that do not support msi-x we only support 1 interrupt.  Therefore
      we can disable that one interrupt by disabling the msi capability itself.  If
      we leave the intx interrupts disabled while we have the msi capability
      disabled no interrupts should be delivered from that device.
      
      Devices with just the minimal msi support (and thus hitting this code path)
      include things like the intel e1000 nic, so it looks like this is going to be a
      fairly common case and thus important to get right.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
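A minimal sketch of the fallback described here, written as a user-space model rather than kernel code. The structure and function names are hypothetical. The two flag values follow the MSI message control layout (enable in bit 0, per-vector masking capability in bit 8); the mask-bit branch is an assumption added for contrast, since the commit message itself only describes the path without a mask bit.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MSI_FLAGS_ENABLE   0x0001u  /* MSI enable bit in message control */
#define SK_MSI_FLAGS_MASKBIT  0x0100u  /* device implements per-vector masking */

/* Hypothetical model of the MSI registers of a single-vector device. */
struct sk_msi_dev {
    uint16_t control;    /* MSI message control word */
    uint32_t mask_bits;  /* mask register, meaningful only with MASKBIT */
};

/* Mask or unmask the device's one MSI interrupt. */
static void sk_msi_set_mask(struct sk_msi_dev *dev, int mask)
{
    assert(dev != 0);
    if (dev->control & SK_MSI_FLAGS_MASKBIT) {
        /* Assumed path: the device has a mask register, use bit 0 of it. */
        if (mask)
            dev->mask_bits |= 1u;
        else
            dev->mask_bits &= ~1u;
    } else {
        /* No mask bit: disable the whole MSI capability instead. */
        if (mask)
            dev->control &= (uint16_t)~SK_MSI_FLAGS_ENABLE;
        else
            dev->control |= (uint16_t)SK_MSI_FLAGS_ENABLE;
    }
}
```

As the commit message notes, this only silences the device if intx stays disabled while the MSI capability is off; the sketch does not model intx.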
    • [PATCH] msi: fix up the msi enable/disable logic · b1cbf4e4
      Committed by Eric W. Biederman
      enable/disable_msi_mode have several side effects which keep them from being
      generally useful.  So this patch replaces them with two much more
      targeted functions: msi_set_enable and msix_set_enable.
      
      This patch makes pci_dev->msi_enabled and pci_dev->msix_enabled the definitive
      way to test if linux has enabled the msi capability, and has the appropriate
      msi data structures set up.
      
      This patch ensures that while writing the msi messages in save/restore and
      during device initialization we have the msi capability disabled so we don't
      get into races.  The pci spec requires that we do not have the msi capability
      enabled and the msi messages unmasked while we write the messages.  Completely
      disabling the capability is overkill but it is easy :)
      
      Care has been taken so we never have both a msi capability and intx enabled
      simultaneously.  We haven't run into a problem yet but better safe than sorry.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [PATCH] msi: sanely support hardware level msi disabling · f5f2b131
      Committed by Eric W. Biederman
      In some cases when we are not using msi we need a way to ensure that the
      hardware does not have an msi capability enabled.  Currently the code has been
      calling disable_msi_mode to try and achieve that.  However disable_msi_mode
      has several other side effects and is only available when msi support is
      compiled in so it isn't really appropriate.
      
      Instead this patch implements pci_msi_off which disables all msi and msix
      capabilities unconditionally with no additional side effects.
      
      pci_disable_device was redundantly clearing the bus master enable flag and
      clearing the msi enable bit.  A device that is not allowed to perform bus
      mastering operations cannot generate intx or msi interrupt messages as those
      are essentially a special case of dma, and require bus mastering.  So the call
      in pci_disable_device to disable msi capabilities was redundant.
      
      quirk_pcie_pxh also called disable_msi_mode and is updated to use pci_msi_off.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
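The behaviour attributed to pci_msi_off above (clear the enable bit of every msi and msix capability, nothing else) can be sketched as a small user-space model. The `sk_pci_dev` structure and `sk_pci_msi_off` function are illustrative stand-ins, not the kernel's definitions. The bit positions are the ones the PCI specification assigns: bit 0 of the MSI message control word and bit 15 of the MSI-X message control word.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MSI_FLAGS_ENABLE   0x0001u  /* MSI enable, bit 0 of MSI control */
#define SK_MSIX_FLAGS_ENABLE  0x8000u  /* MSI-X enable, bit 15 of MSI-X control */

/* Hypothetical view of the two capabilities a device may expose. */
struct sk_pci_dev {
    int      has_msi_cap;    /* nonzero if an MSI capability exists */
    int      has_msix_cap;   /* nonzero if an MSI-X capability exists */
    uint16_t msi_control;    /* MSI message control word */
    uint16_t msix_control;   /* MSI-X message control word */
};

/* Clear only the enable bits; every other bit and all software state
 * is left alone, so the call has no further side effects. */
static void sk_pci_msi_off(struct sk_pci_dev *dev)
{
    assert(dev != 0);
    if (dev->has_msi_cap)
        dev->msi_control &= (uint16_t)~SK_MSI_FLAGS_ENABLE;
    if (dev->has_msix_cap)
        dev->msix_control &= (uint16_t)~SK_MSIX_FLAGS_ENABLE;
}
```

Keeping the helper free of bookkeeping is what lets it be used from quirks and from builds without msi support, which is the motivation given in the commit message.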
  5. 08 Feb, 2007 10 commits
  6. 08 Dec, 2006 2 commits
  7. 19 Oct, 2006 1 commit
  8. 04 Oct, 2006 8 commits
    • [PATCH] msi: refactor and move the msi irq_chip into the arch code · 3b7d1921
      Committed by Eric W. Biederman
      It turns out msi_ops was simply not enough to abstract the architecture
      specific details of msi.  So I have moved the responsibility of constructing
      the struct irq_chip to the architectures, and have two architecture specific
      functions arch_setup_msi_irq, and arch_teardown_msi_irq.
      
      For simple architectures those functions can do all of the work.  For
      architectures with platform dependencies they can call into the appropriate
      platform code.
      
      With this msi.c is finally free of assuming you have an apic, and this
      actually takes less code.
      
      The helpers for the architecture specific code are declared in the linux/msi.h
      to keep them separate from the msi functions used by drivers in linux/pci.h.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] msi: only use a single irq_chip for msi interrupts · 277bc33b
      Committed by Eric W. Biederman
      The logic works like this.
      
      Since we no longer track the state logic by hand in msi.c startup and shutdown
      are no longer needed.
      
      By updating msi_set_mask_bit to work on msi devices that do not implement a
      mask bit we can always call the mask/unmask functions.
      
      What we really have are mask and unmask so we use them to implement the .mask
      and .unmask functions instead of .enable and .disable.
      
      By switching to the handle_edge_irq handler we only need an ack function that
      moves the irq if necessary, which removes the old end and ack functions and
      their peculiar logic of sometimes disabling an irq.
      
      This removes the reliance on pre-genirq irq handling methods.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] msi: simplify msi sanity checks by adding with generic irq code · 1f80025e
      Committed by Eric W. Biederman
      Currently msi.c is doing sanity checks that make certain before an irq is
      destroyed it has no more users.
      
      By adding irq_has_action I can perform the test in a generic way, instead of
      relying on a msi specific data structure.
      
      By performing the core check in dynamic_irq_cleanup I ensure every user of
      dynamic irqs has a test present and we don't free resources that are in use.
      
      In msi.c this allows me to kill the attrib.state member of msi_desc and all of
      the associated code to maintain it.
      
      To keep from freeing data structures when irq cleanup code is called too soon,
      changing dynamic_irq_cleanup is insufficient because there are msi specific
      data structures that are also not safe to free.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
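The sanity check described in this commit can be sketched in user space as follows. The types and the `sk_` prefixed functions are hypothetical simplifications: in particular, this sketch reports failure through a return value, which is an assumption made so the behaviour is easy to observe and is not a statement about how the kernel's dynamic_irq_cleanup reports the condition.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for a registered handler and an irq descriptor. */
struct sk_irqaction {
    const char *name;
    struct sk_irqaction *next;
};

struct sk_irq_desc {
    struct sk_irqaction *action;  /* list of registered handlers */
    void *chip_data;              /* per-irq resources owned by the chip */
};

/* Generic test: does anyone still have a handler on this irq? */
static int sk_irq_has_action(const struct sk_irq_desc *desc)
{
    assert(desc != NULL);
    return desc->action != NULL;
}

/* Refuse to release resources while the irq is still in use.
 * Returns 0 on success and -1 if handlers are still registered. */
static int sk_dynamic_irq_cleanup(struct sk_irq_desc *desc)
{
    if (sk_irq_has_action(desc))
        return -1;
    desc->chip_data = NULL;
    return 0;
}
```

Putting the check in the generic cleanup path means every user of dynamic irqs gets it, so a subsystem such as msi no longer needs its own state field to answer the same question.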
    • [PATCH] genirq: msi: make the msi code irq based and not vector based · 1ce03373
      Committed by Eric W. Biederman
      The msi currently allocates irqs backwards.  First it allocates a platform
      dependent routing value for an interrupt the ``vector'' and then it figures
      out from the vector which irq you are on.
      
      For ia64 this is fine.  For x86 and x86_64 this is complete nonsense and makes
      an enormous mess of the irq handling code and prevents some pretty
      significant cleanups in the code for handling large numbers of irqs.
      
      This patch refactors msi.c to work in terms of irqs and create_irq/destroy_irq
      for dynamically managing irqs.
      
      Hopefully this is finally a version of msi.c that is useful on more than just
      x86 derivatives.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] genirq: msi: simplify the msi irq limit policy · 92db6d10
      Committed by Eric W. Biederman
      Currently we attempt to predict how many irqs we will be able to allocate with
      msi using pci_vector_resources and some complicated accounting, and then we
      only allow each device as many irqs as we think are available on average.
      
      Only the s2io driver even takes advantage of this feature; all other drivers
      have a fixed number of irqs they need and bail if they can't get them.
      
      pci_vector_resources is inaccurate if anyone ever frees an irq.  The whole
      implementation is racy.  The current irq limit policy does not appear to make
      sense with current drivers.  So I have simplified things.  We can revisit this
      if we need a more sophisticated policy.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] genirq: msi: refactor the msi_ops · 38bc0361
      Committed by Eric W. Biederman
      The current msi_ops are short sighted in a number of ways; this patch attempts
      to fix the glaring deficiencies.
      
      - Report in msi_ops if a 64bit address is needed in the msi message, so we
        can fail 32bit only msi structures.
      
      - Send and receive a full struct msi_msg in both setup and target.  This is
        a little cleaner and allows for architectures that need to modify the data
        to retarget the msi interrupt to a different cpu.
      
      - In target pass in the full cpu mask instead of just the first cpu in case
        we can make use of the full cpu mask.
      
      - Operate in terms of irqs and not vectors; currently there is still a 1-1
        relationship but on architectures other than ia64 I expect this will change.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] genirq: msi: implement helper functions read_msi_msg and write_msi_msg · 0366f8f7
      Committed by Eric W. Biederman
      In support of this I also add a struct msi_msg that captures the two
      address fields and one data field in a typical msi message, and I remember the pos and
      whether the address is 64bit in struct msi_desc.
      
      This makes the code a little more readable and easier to maintain, and paves
      the way to further simplifications.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
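To make the shape of these helpers concrete, here is a user-space sketch that reads and writes a message through a simulated configuration space. Everything prefixed `sketch_` or `cfg_` is hypothetical and differs from the kernel's real signatures. The register offsets relative to the capability (address low at +4, address high at +8, data at +8 for 32-bit devices or +12 for 64-bit devices) follow the PCI MSI capability layout.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative message: two address words and one data word. */
struct sketch_msi_msg {
    uint32_t address_lo;
    uint32_t address_hi;
    uint16_t data;
};

/* What the descriptor remembers so callers need not rediscover it. */
struct sketch_msi_desc {
    uint8_t pos;    /* config-space offset of the MSI capability */
    int     is_64;  /* nonzero if the device takes a 64-bit address */
};

/* Stand-in for one device's 256-byte PCI configuration space. */
static uint8_t sketch_cfg[256];

static void cfg_write(unsigned off, uint32_t val, unsigned bytes)
{
    assert(off + bytes <= sizeof(sketch_cfg));
    for (unsigned i = 0; i < bytes; i++)
        sketch_cfg[off + i] = (uint8_t)(val >> (8 * i));  /* little endian */
}

static uint32_t cfg_read(unsigned off, unsigned bytes)
{
    uint32_t val = 0;
    assert(off + bytes <= sizeof(sketch_cfg));
    for (unsigned i = 0; i < bytes; i++)
        val |= (uint32_t)sketch_cfg[off + i] << (8 * i);
    return val;
}

/* The data register moves depending on whether address_hi exists. */
static void sketch_write_msi_msg(const struct sketch_msi_desc *d,
                                 const struct sketch_msi_msg *m)
{
    cfg_write(d->pos + 4u, m->address_lo, 4);
    if (d->is_64) {
        cfg_write(d->pos + 8u, m->address_hi, 4);
        cfg_write(d->pos + 12u, m->data, 2);
    } else {
        cfg_write(d->pos + 8u, m->data, 2);
    }
}

static void sketch_read_msi_msg(const struct sketch_msi_desc *d,
                                struct sketch_msi_msg *m)
{
    m->address_lo = cfg_read(d->pos + 4u, 4);
    if (d->is_64) {
        m->address_hi = cfg_read(d->pos + 8u, 4);
        m->data = (uint16_t)cfg_read(d->pos + 12u, 2);
    } else {
        m->address_hi = 0;  /* 32-bit devices have no upper address */
        m->data = (uint16_t)cfg_read(d->pos + 8u, 2);
    }
}
```

Storing `pos` and the 64-bit flag in the descriptor keeps the offset arithmetic in one place, which is the readability gain the commit message points to.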
    • [PATCH] genirq: msi: simplify msi enable and disable · 7bd007e4
      Committed by Eric W. Biederman
      The problem.  Because the disable routines leave the msi interrupts in all
      sorts of half enabled states the enable routines become impossible to
      implement correctly, and almost impossible to understand.
      
      Simplifying this allows me to simply kill the buggy reroute_msix_table, and
      generally makes the code more maintainable.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 27 Sep, 2006 2 commits
  10. 13 Jul, 2006 1 commit