1. 28 4月, 2009 4 次提交
    • Y
      irq: change ACPI GSI APIs to also take a device argument · a2f809b0
      Yinghai Lu 提交于
      We want to use dev_to_node() later on, to be aware of the 'home node'
      of the GSI in question.
      
      [ Impact: cleanup, prepare the IRQ code to be more NUMA aware ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NLen Brown <lenb@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-acpi@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      LKML-Reference: <49F65560.20904@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a2f809b0
    • Y
      x86/irq: change irq_desc_alloc() to take node instead of cpu · 85ac16d0
      Yinghai Lu 提交于
      This simplifies the node awareness of the code. All our allocators
      only deal with a NUMA node ID locality not with CPU ids anyway - so
      there's no need to maintain (and transform) a CPU id all across the
      IRq layer.
      
      v2: keep move_irq_desc related
      
      [ Impact: cleanup, prepare IRQ code to be NUMA-aware ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      LKML-Reference: <49F65536.2020300@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      85ac16d0
    • Y
      irq: change ->set_affinity() to return status · d5dedd45
      Yinghai Lu 提交于
      according to Ingo, change set_affinity() in irq_chip should return int,
      because that way we can handle failure cases in a much cleaner way, in
      the genirq layer.
      
      v2: fix two typos
      
      [ Impact: extend API ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: linux-arch@vger.kernel.org
      LKML-Reference: <49F654E9.4070809@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d5dedd45
    • Y
      x86/irq: remove leftover code from NUMA_MIGRATE_IRQ_DESC · fcef5911
      Yinghai Lu 提交于
      The original feature of migrating irq_desc dynamic was too fragile
      and was causing problems: it caused crashes on systems with lots of
      cards with MSI-X when user-space irq-balancer was enabled.
      
      We now have new patches that create irq_desc according to device
      numa node. This patch removes the leftover bits of the dynamic balancer.
      
      [ Impact: remove dead code ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <49F654AF.8000808@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fcef5911
  2. 14 4月, 2009 1 次提交
    • P
      x86, irq: Remove IRQ_DISABLED check in process context IRQ move · 6ec3cfec
      Pallipadi, Venkatesh 提交于
      As discussed in the thread here:
      
        http://marc.info/?l=linux-kernel&m=123964468521142&w=2
      
      Eric W. Biederman observed:
      
      > It looks like some additional bugs have slipped in since last I looked.
      >
      > set_irq_affinity does this:
      > ifdef CONFIG_GENERIC_PENDING_IRQ
      >        if (desc->status & IRQ_MOVE_PCNTXT || desc->status & IRQ_DISABLED) {
      >                cpumask_copy(desc->affinity, cpumask);
      >                desc->chip->set_affinity(irq, cpumask);
      >        } else {
      >                desc->status |= IRQ_MOVE_PENDING;
      >                cpumask_copy(desc->pending_mask, cpumask);
      >        }
      > #else
      >
      > That IRQ_DISABLED case is a software state and as such it has nothing to
      > do with how safe it is to move an irq in process context.
      
      [...]
      
      >
      > The only reason we migrate MSIs in interrupt context today is that there
      > wasn't infrastructure for support migration both in interrupt context
      > and outside of it.
      
      Yes. The idea here was to force the MSI migration to happen in process
      context. One of the patches in the series did
      
              disable_irq(dev->irq);
              irq_set_affinity(dev->irq, cpumask_of(dev->cpu));
              enable_irq(dev->irq);
      
      with the above patch adding irq/manage code check for interrupt disabled
      and moving the interrupt in process context.
      
      IIRC, there was no IRQ_MOVE_PCNTXT when we were developing this HPET
      code and we ended up having this ugly hack. IRQ_MOVE_PCNTXT was there
      when we eventually submitted the patch upstream. But, looks like I did a
      blind rebasing instead of using IRQ_MOVE_PCNTXT in hpet MSI code.
      
      Below patch fixes this. i.e., revert commit 932775a4
      and add PCNTXT to HPET MSI setup. Also removes copying of desc->affinity
      in generic code as set_affinity routines are doing it internally.
      Reported-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Li Shaohua" <shaohua.li@intel.com>
      Cc: Gary Hade <garyhade@us.ibm.com>
      Cc: "lcm@us.ibm.com" <lcm@us.ibm.com>
      Cc: suresh.b.siddha@intel.com
      LKML-Reference: <20090413222058.GB8211@linux-os.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6ec3cfec
  3. 10 4月, 2009 1 次提交
    • W
      x86, intr-remap: fix eoi for interrupt remapping without x2apic · 746cddd3
      Weidong Han 提交于
      To simplify level irq migration in the presence of interrupt-remapping,
      Suresh used a virtual vector (io-apic pin number) to eliminate io-apic
      RTE modification. Level triggered interrupt will appear as an edge to
      the local apic cpu but still as level to the IO-APIC. So in addition to
      do the local apic EOI, it still needs to do IO-APIC directed EOI to clear
      the remote IRR bit in the IO-APIC RTE. Pls refer to Suresh's patch for
      more details (commit 0280f7c4).
      
      Now interrupt remapping is decoupled from x2apic, it also needs to do the
      directed EOI for apic. Otherwise, apic interrupts won't work correctly.
      Signed-off-by: NWeidong Han <weidong.han@intel.com>
      Cc: iommu@lists.linux-foundation.org
      Cc: Weidong Han <weidong.han@intel.com>
      Cc: suresh.b.siddha@intel.com
      Cc: dwmw2@infradead.org
      Cc: allen.m.kay@intel.com
      LKML-Reference: <1239355037-22856-1-git-send-email-weidong.han@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      746cddd3
  4. 04 4月, 2009 2 次提交
  5. 26 3月, 2009 1 次提交
    • R
      x86: Correct behaviour of irq affinity · e06b1b56
      Rusty Russell 提交于
      Impact: get correct smp_affinity as user requested
      
      The effect of setting desc->affinity (ie. from userspace via sysfs) has
      varied over time.  In 2.6.27, the 32-bit code anded the value with
      cpu_online_map, and both 32 and 64-bit did that anding whenever a cpu
      was unplugged.
      
      2.6.29 consolidated this into one routine (and fixed hotplug) but
      introduced another variation: anding the affinity with cfg->domain.
      
      We should just set it to what the user said - if possible.
      
      (cpu_mask_to_apicid_and already takes cpu_online_mask into account)
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <49C94DDF.2010703@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e06b1b56
  6. 25 3月, 2009 1 次提交
    • Y
      x86: fix set_extra_move_desc calling · fa74c907
      Yinghai Lu 提交于
      Impact: fix bug with irq-descriptor moving when logical flat
      
      Rusty observed:
      
      > The effect of setting desc->affinity (ie. from userspace via sysfs) has varied
      > over time.  In 2.6.27, the 32-bit code anded the value with cpu_online_map,
      > and both 32 and 64-bit did that anding whenever a cpu was unplugged.
      >
      > 2.6.29 consolidated this into one routine (and fixed hotplug) but introduced
      > another variation: anding the affinity with cfg->domain.  Is this right, or
      > should we just set it to what the user said?  Or as now, indicate that we're
      > restricting it.
      
      Eric pointed out that desc->affinity should be what the user requested,
      if it is at all possible to honor the user space request.
      
      This bug got introduced by commit 22f65d31 "x86: Update io_apic.c to use
      new cpumask API".
      
      Fix it by moving the masking to before the descriptor moving ...
      Reported-by: NRusty Russell <rusty@rustcorp.com.au>
      Reported-by: NEric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <49C94134.4000408@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fa74c907
  7. 23 3月, 2009 2 次提交
  8. 21 3月, 2009 1 次提交
  9. 18 3月, 2009 6 次提交
    • S
      x86: fix broken irq migration logic while cleaning up multiple vectors · 68a8ca59
      Suresh Siddha 提交于
      Impact: fix spurious IRQs
      
      During irq migration, we send a low priority interrupt to the previous
      irq destination. This happens in non interrupt-remapping case after interrupt
      starts arriving at new destination and in interrupt-remapping case after
      modifying and flushing the interrupt-remapping table entry caches.
      
      This low priority irq cleanup handler can cleanup multiple vectors, as
      multiple irq's can be migrated at almost the same time. While
      there will be multiple invocations of irq cleanup handler (one cleanup
      IPI for each irq migration), first invocation of the cleanup handler
      can potentially cleanup more than one vector (as the first invocation can
      see the requests for more than vector cleanup). When we cleanup multiple
      vectors during the first invocation of the smp_irq_move_cleanup_interrupt(),
      other vectors that are to be cleanedup can still be pending in the local
      cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled).
      
      When we are ready to unhook a vector corresponding to an irq, check if that
      vector is registered in the local cpu's IRR. If so skip that cleanup and
      do a self IPI with the cleanup vector, so that we give a chance to
      service the pending vector interrupt and then cleanup that vector
      allocation once we execute the lowest priority handler.
      
      This fixes spurious interrupts seen when migrating multiple vectors
      at the same time.
      
      [ This is apparently possible even on conventional xapic, although to
        the best of our knowledge it has never been seen.  The stable
        maintainers may wish to consider this one for -stable. ]
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: stable@kernel.org
      68a8ca59
    • S
      x86, ioapic: Fix non atomic allocation with interrupts disabled · 05c3dc2c
      Suresh Siddha 提交于
      Impact: fix possible race
      
      save_mask_IO_APIC_setup() was using non atomic memory allocation while getting
      called with interrupts disabled. Fix this by splitting this into two different
      function. Allocation part save_IO_APIC_setup() now happens before
      disabling interrupts.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      05c3dc2c
    • S
      x86, x2apic: cleanup ifdef CONFIG_INTR_REMAP in io_apic code · 29b61be6
      Suresh Siddha 提交于
      Impact: cleanup
      
      Clean up #ifdefs and replace them with helper functions.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      29b61be6
    • S
      x86, x2apic: cleanup the IO-APIC level migration with interrupt-remapping · 0280f7c4
      Suresh Siddha 提交于
      Impact: simplification
      
      In the current code, for level triggered migration, we need to modify the
      io-apic RTE with the update vector information, along with modifying interrupt
      remapping table entry(IRTE) with vector and destination. This is to ensure that
      remote IRR bit inthe IOAPIC RTE gets cleared when the cpu does EOI.
      
      With this patch, for level triggered, we eliminate the io-apic RTE modification
      (with the updated vector information), by using a virtual vector (io-apic pin
      number).  Real vector that is used for interrupting cpu will be coming from
      the interrupt-remapping table entry. Trigger mode in the IRTE will always be
      edge, and the actual level or edge trigger will be setup in the IO-APIC RTE.
      So a level triggered interrupt will appear as an edge to the local apic
      cpu but still as level to the IO-APIC.
      
      With this change, level irq migration can be done by simply modifying
      the interrupt-remapping table entry with out changing the io-apic RTE.
      And as the interrupt appears as edge at the cpu, in addition to do the
      local apic EOI, we need to do IO-APIC directed EOI to clear the remote
      IRR bit in  the IO-APIC RTE.
      
      This simplies the irq migration in the presence of interrupt-remapping.
      Idea-by: NRajesh Sankaran <rajesh.sankaran@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      0280f7c4
    • S
      x86, x2apic: use virtual wire A mode in disable_IO_APIC() with interrupt-remapping · 7c6d9f97
      Suresh Siddha 提交于
      Impact: make kexec work with x2apic
      
      disable_IO_APIC() gets called during crashdump aswell, which configures the
      IO-APIC/LAPIC so that legacy interrupts can be delivered for the kexec'd kernel.
      
      In the presence of interrupt-remapping, we need to change the
      interrupt-remapping configuration aswell as modifying IO-APIC for virtual wire
      B mode.
      
      To keep things simple during the crash, use virtual wire A mode
      (for which we don't need to touch io-apic and interrupt-remapping tables).
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      7c6d9f97
    • S
      x86, x2apic: enable fault handling for intr-remapping · 9d783ba0
      Suresh Siddha 提交于
      Impact: interface augmentation (not yet used)
      
      Enable fault handling flow for intr-remapping aswell. Fault handling
      code now shared by both dma-remapping and intr-remapping.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      9d783ba0
  10. 18 2月, 2009 2 次提交
  11. 16 2月, 2009 1 次提交
  12. 15 2月, 2009 1 次提交
  13. 10 2月, 2009 1 次提交
  14. 09 2月, 2009 5 次提交
    • Y
      x86: find nr_irqs_gsi with mp_ioapic_routing · 3f4a739c
      Yinghai Lu 提交于
      Impact: find right nr_irqs_gsi on some systems.
      
      One test-system has gap between gsi's:
      
      [    0.000000] ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
      [    0.000000] IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-23
      [    0.000000] ACPI: IOAPIC (id[0x05] address[0xfeafd000] gsi_base[48])
      [    0.000000] IOAPIC[1]: apic_id 5, version 0, address 0xfeafd000, GSI 48-54
      [    0.000000] ACPI: IOAPIC (id[0x06] address[0xfeafc000] gsi_base[56])
      [    0.000000] IOAPIC[2]: apic_id 6, version 0, address 0xfeafc000, GSI 56-62
      ...
      [    0.000000] nr_irqs_gsi: 38
      
      So nr_irqs_gsi is not right. some irq for MSI will overwrite with io_apic.
      
      need to get that with acpi_probe_gsi when acpi io_apic is used
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3f4a739c
    • Y
      x86: find nr_irqs_gsi with mp_ioapic_routing · cc6c5006
      Yinghai Lu 提交于
      Impact: find right nr_irqs_gsi on some systems.
      
      One test-system has gap between gsi's:
      
      [    0.000000] ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
      [    0.000000] IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-23
      [    0.000000] ACPI: IOAPIC (id[0x05] address[0xfeafd000] gsi_base[48])
      [    0.000000] IOAPIC[1]: apic_id 5, version 0, address 0xfeafd000, GSI 48-54
      [    0.000000] ACPI: IOAPIC (id[0x06] address[0xfeafc000] gsi_base[56])
      [    0.000000] IOAPIC[2]: apic_id 6, version 0, address 0xfeafc000, GSI 56-62
      ...
      [    0.000000] nr_irqs_gsi: 38
      
      So nr_irqs_gsi is not right. some irq for MSI will overwrite with io_apic.
      
      need to get that with acpi_probe_gsi when acpi io_apic is used
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cc6c5006
    • Y
      x86: check_timer cleanup · f72dccac
      Yinghai Lu 提交于
      Impact: make check-timer more robust potentially solve boot fragility
      
      For edge trigger io-apic routing, we already unmasked the pin via
      setup_IO_APIC_irq(), so don't unmask it again.
      
      Also call local_irq_disable() between timer_irq_works(), because it
      calls local_irq_enable() inside.
      
      Also remove not needed apic version reading for 64-bit
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f72dccac
    • Y
      x86: use NR_IRQS_LEGACY to replace 16 · abcaa2b8
      Yinghai Lu 提交于
      Impact: cleanup
      
      also could kill platform_legacy_irq
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      abcaa2b8
    • Y
      x86/irq: optimize nr_irqs · f1ee5548
      Yinghai Lu 提交于
      Impact: make nr_irqs depend more on cards used in a system
      
      depend on nr_irq_gsi more, and have a ratio for MSI.
      
      v2: make nr_irqs less than NR_VECTORS * nr_cpu_ids
          aka if only one cpu, we only can support nr_irqs = NR_VECTORS
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f1ee5548
  15. 06 2月, 2009 1 次提交
  16. 01 2月, 2009 1 次提交
    • Y
      irq, x86: fix lock status with numa_migrate_irq_desc · 10b888d6
      Yinghai Lu 提交于
      Eric Paris reported:
      
      > I have an hp dl785g5 which is unable to successfully run
      > 2.6.29-0.66.rc3.fc11.x86_64 or 2.6.29-rc2-next-20090126.  During bootup
      > (early in userspace daemons starting) I get the below BUG, which quickly
      > renders the machine dead.  I assume it is because sparse_irq_lock never
      > gets released when the BUG kills that task.
      
      Adjust lock sequence when migrating a descriptor with
      CONFIG_NUMA_MIGRATE_IRQ_DESC enabled.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      10b888d6
  17. 31 1月, 2009 1 次提交
  18. 29 1月, 2009 8 次提交