1. 12 Jun, 2015 3 commits
  2. 19 May, 2015 1 commit
    • genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU · 0a4377de
      Committed by Jiang Liu
      With Posted-Interrupts support in Intel CPU and IOMMU, an external
      interrupt from assigned-devices could be directly delivered to a
      virtual CPU in a virtual machine. Instead of hacking KVM and Intel
      IOMMU drivers, we propose a platform independent interface to target
      an interrupt to a specific virtual CPU in a virtual machine, or set
      virtual CPU affinity for an interrupt.
      
      By adopting this new interface and the hierarchy irqdomain, we could
      easily support posted-interrupts on Intel platforms, and also provide
      flexible enough interfaces for other platforms to support similar
      features.
      
      Here is the usage scenario for this interface:
      Guest update MSI/MSI-X interrupt configuration
              -->QEMU and KVM handle this
              -->KVM call this interface (passing posted interrupts descriptor
                 and guest vector)
              -->irq core will transfer the control to IOMMU
              -->IOMMU will do the real work of updating IRTE (IRTE has new
                 format for VT-d Posted-Interrupts)
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Feng Wu <feng.wu@intel.com>
      Link: http://lkml.kernel.org/r/1432026437-16560-2-git-send-email-feng.wu@intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
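The last two steps of the usage scenario above (the irq core handing control to the IOMMU driver) can be pictured with a small user-space model. This is only a sketch under stated assumptions: every `toy_` type and function is hypothetical and stands in for kernel structures that are far richer, and the payload layout is invented for illustration.

```c
#include <errno.h>
#include <stddef.h>

struct toy_irq_data;

/* Toy stand-in for an interrupt chip; the callback is optional. */
struct toy_irq_chip {
	const char *name;
	int (*irq_set_vcpu_affinity)(struct toy_irq_data *data, void *vcpu_info);
};

/* Toy stand-in for per-interrupt chip data. */
struct toy_irq_data {
	unsigned int irq;
	struct toy_irq_chip *chip;
	void *vcpu_info;	/* what the toy "IOMMU" recorded */
};

/* Hypothetical payload: posted-interrupt descriptor address and guest vector. */
struct toy_vcpu_info {
	unsigned long long pi_desc_addr;
	unsigned int vector;
};

/* Stand-in for the IOMMU driver updating its remapping entry. */
int toy_iommu_set_vcpu_affinity(struct toy_irq_data *data, void *vcpu_info)
{
	data->vcpu_info = vcpu_info;
	return 0;
}

/* Core entry point: forward to the chip only if it implements the callback. */
int toy_irq_set_vcpu_affinity(struct toy_irq_data *data, void *vcpu_info)
{
	if (!data)
		return -EINVAL;
	if (!data->chip || !data->chip->irq_set_vcpu_affinity)
		return -ENOSYS;
	return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
}
```

The point of the indirection is that the caller never needs to know which driver sits behind the interrupt; a chip without the callback simply reports that the operation is unsupported.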
  3. 09 Apr, 2015 2 commits
    • genirq: Allow the irqchip state of an IRQ to be save/restored · 1b7047ed
      Committed by Marc Zyngier
      There are a number of cases where a kernel subsystem may want to
      introspect the state of an interrupt at the irqchip level:
      
      - When a peripheral is shared between virtual machines,
        its interrupt state becomes part of the guest's state,
        and must be switched accordingly. KVM on arm/arm64 requires
        this for its guest-visible timer
      - Some GPIO controllers seem to require peeking into the
        interrupt controller they are connected to to report
        their internal state
      
      This seems to be a pattern that is common enough for the core code
      to try and support this without too many horrible hacks. Introduce
      a pair of accessors (irq_get_irqchip_state/irq_set_irqchip_state)
      to retrieve the bits that can be of interest to another subsystem:
      pending, active, and masked.
      
      - irq_get_irqchip_state returns the state of the interrupt according
        to a parameter set to IRQCHIP_STATE_PENDING, IRQCHIP_STATE_ACTIVE,
        IRQCHIP_STATE_MASKED or IRQCHIP_STATE_LINE_LEVEL.
      - irq_set_irqchip_state similarly sets the state of the interrupt.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
      Tested-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Phong Vo <pvo@apm.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Tin Huynh <tnhuynh@apm.com>
      Cc: Y Vo <yvo@apm.com>
      Cc: Toan Le <toanle@apm.com>
      Cc: Bjorn Andersson <bjorn@kryo.se>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Link: http://lkml.kernel.org/r/1426676484-21812-2-git-send-email-marc.zyngier@arm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
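The accessor pair described in this commit can be modeled in user space to show the save/restore pattern a caller would follow. This is a sketch, not the kernel code: the `toy_` names are hypothetical, and the model keeps one boolean per selector instead of talking to an interrupt controller.

```c
#include <errno.h>
#include <stdbool.h>

/* The four selectors named in the commit message, in toy form. */
enum toy_irqchip_irq_state {
	TOY_IRQCHIP_STATE_PENDING,
	TOY_IRQCHIP_STATE_ACTIVE,
	TOY_IRQCHIP_STATE_MASKED,
	TOY_IRQCHIP_STATE_LINE_LEVEL,
	TOY_IRQCHIP_STATE_COUNT
};

/* Hypothetical interrupt line: one flag per selector. */
struct toy_irq_line {
	bool state[TOY_IRQCHIP_STATE_COUNT];
};

static bool toy_state_valid(enum toy_irqchip_irq_state which)
{
	return (int)which >= 0 && which < TOY_IRQCHIP_STATE_COUNT;
}

/* Read one piece of state into *val; returns 0 or a negative errno. */
int toy_irq_get_irqchip_state(const struct toy_irq_line *line,
			      enum toy_irqchip_irq_state which, bool *val)
{
	if (!line || !val || !toy_state_valid(which))
		return -EINVAL;
	*val = line->state[which];
	return 0;
}

/* Write one piece of state; returns 0 or a negative errno. */
int toy_irq_set_irqchip_state(struct toy_irq_line *line,
			      enum toy_irqchip_irq_state which, bool val)
{
	if (!line || !toy_state_valid(which))
		return -EINVAL;
	line->state[which] = val;
	return 0;
}
```

A guest switch would read the bits it cares about with the getter, clear them on the hardware, and later write the saved values back with the setter.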
    • genirq: MSI: Fix freeing of unallocated MSI · fe0c52fc
      Committed by Marc Zyngier
      While debugging an unrelated issue with the GICv3 ITS driver, the
      following trace triggered:
      
      WARNING: CPU: 1 PID: 1 at kernel/irq/irqdomain.c:1121 irq_domain_free_irqs+0x160/0x17c()
      NULL pointer, cannot free irq
      Modules linked in:
      CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W      3.19.0-rc6+ #3690
      Hardware name: FVP Base (DT)
      Call trace:
      [<ffffffc000089398>] dump_backtrace+0x0/0x13c
      [<ffffffc0000894e4>] show_stack+0x10/0x1c
      [<ffffffc00066d134>] dump_stack+0x74/0x94
      [<ffffffc0000a92f8>] warn_slowpath_common+0x9c/0xd4
      [<ffffffc0000a938c>] warn_slowpath_fmt+0x5c/0x80
      [<ffffffc0000ee04c>] irq_domain_free_irqs+0x15c/0x17c
      [<ffffffc0000ef918>] msi_domain_free_irqs+0x58/0x74
      [<ffffffc000386f58>] free_msi_irqs+0xb4/0x1c0
      
          // The msi_prepare callback fails here
      
      [<ffffffc0003872c0>] pci_enable_msix+0x25c/0x3d4
      [<ffffffc00038746c>] pci_enable_msix_range+0x34/0x80
      [<ffffffc0003924ac>] vp_try_to_find_vqs+0xec/0x528
      [<ffffffc000392954>] vp_find_vqs+0x6c/0xa8
      [<ffffffc0003ee2a8>] init_vq+0x120/0x248
      [<ffffffc0003eefb0>] virtblk_probe+0xb0/0x6bc
      [<ffffffc00038fc34>] virtio_dev_probe+0x17c/0x214
      [<ffffffc0003d4a04>] driver_probe_device+0x7c/0x23c
      [<ffffffc0003d4cb0>] __driver_attach+0x98/0xa0
      [<ffffffc0003d2c60>] bus_for_each_dev+0x60/0xb4
      [<ffffffc0003d455c>] driver_attach+0x1c/0x28
      [<ffffffc0003d41b0>] bus_add_driver+0x150/0x208
      [<ffffffc0003d54c0>] driver_register+0x64/0x130
      [<ffffffc00038f9e8>] register_virtio_driver+0x24/0x68
      [<ffffffc00091320c>] init+0x70/0xac
      [<ffffffc0000828f0>] do_one_initcall+0x94/0x1d0
      [<ffffffc0008e9b00>] kernel_init_freeable+0x144/0x1e4
      [<ffffffc00066a434>] kernel_init+0xc/0xd8
      ---[ end trace f9ee562a77cc7bae ]---
      
      The ITS msi_prepare callback having failed, we end up trying to
      free MSIs that have never been allocated. Oddly enough, the kernel
      is pretty upset about it.
      
      It turns out that this behaviour was expected before the MSI domain
      was introduced (and dealt with in arch_teardown_msi_irqs).
      
      The obvious fix is to detect this early enough and bail out.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
      Link: http://lkml.kernel.org/r/1422299419-6051-1-git-send-email-marc.zyngier@arm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
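The "detect this early and bail out" idea can be illustrated with a user-space model. The sketch below assumes that an interrupt number of 0 marks a descriptor that was never allocated; all `toy_` names are hypothetical and the real patch may differ in detail.

```c
#include <stddef.h>

/* Toy MSI descriptor; irq == 0 is assumed to mean "never allocated". */
struct toy_msi_desc {
	unsigned int irq;
	unsigned int nvec_used;
};

/* Counts how many interrupts the stand-in domain was asked to free. */
unsigned int toy_freed_irqs;

/* Stand-in for the domain-level free routine. */
void toy_irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
{
	(void)virq;
	toy_freed_irqs += nr_irqs;
}

/* Free only descriptors that actually own an interrupt. */
void toy_msi_domain_free_irqs(struct toy_msi_desc *descs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (descs[i].irq == 0)
			continue;	/* allocation failed early: nothing to free */
		toy_irq_domain_free_irqs(descs[i].irq, descs[i].nvec_used);
		descs[i].irq = 0;	/* makes a second call harmless */
	}
}
```

Clearing the interrupt number after freeing also makes the teardown path idempotent in this model, so an error path that runs it twice does no extra work.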
  4. 15 Mar, 2015 1 commit
  5. 05 Mar, 2015 1 commit
    • genirq / PM: Add flag for shared NO_SUSPEND interrupt lines · 17f48034
      Committed by Rafael J. Wysocki
      It currently is required that all users of NO_SUSPEND interrupt
      lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the
      WARN_ON_ONCE() in irq_pm_install_action() will trigger.  That is
      done to warn about situations in which unprepared interrupt handlers
      may be run unnecessarily for suspended devices and may attempt to
      access those devices by mistake.  However, it may cause drivers
      that have no technical reasons for using IRQF_NO_SUSPEND to set
      that flag just because they happen to share the interrupt line
      with something like a timer.
      
      Moreover, the generic handling of wakeup interrupts introduced by
      commit 9ce7a258 (genirq: Simplify wakeup mechanism) only works
      for IRQs without any NO_SUSPEND users, so the drivers of wakeup
      devices needing to use shared NO_SUSPEND interrupt lines for
      signaling system wakeup generally have to detect wakeup in their
      interrupt handlers.  Thus if they happen to share an interrupt line
      with a NO_SUSPEND user, they also need to request that their
      interrupt handlers be run after suspend_device_irqs().
      
      In both cases the reason for using IRQF_NO_SUSPEND is not because
      the driver in question has a genuine need to run its interrupt
      handler after suspend_device_irqs(), but because it happens to
      share the line with some other NO_SUSPEND user.  Otherwise, the
      driver would do without IRQF_NO_SUSPEND just fine.
      
      To make it possible to specify that condition explicitly, introduce
      a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND,
      that, when set, will indicate to the IRQ core that the interrupt
      user is generally fine with suspending the IRQ, but it also can
      tolerate handler invocations after suspend_device_irqs() and, in
      particular, it is capable of detecting system wakeup and triggering
      it as appropriate from its interrupt handler.
      
      That will allow us to work around a problem with a shared timer
      interrupt line on at91 platforms.
      
      Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2
      Link: http://marc.info/?t=142252775300011&r=1&w=2
      Link: https://lkml.org/lkml/2014/12/15/552
      Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
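The rule this commit describes (a shared line with a NO_SUSPEND user is acceptable only if every other user is NO_SUSPEND or COND_SUSPEND) can be written down as a small user-space model. The flag values and all `toy_` names below are illustrative assumptions, not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative flag bits for this sketch only; not the kernel's values. */
#define TOY_IRQF_NO_SUSPEND	0x1u
#define TOY_IRQF_COND_SUSPEND	0x2u

/* Per-line bookkeeping of how each registered user classified itself. */
struct toy_irq_desc {
	unsigned int nr_actions;
	unsigned int no_suspend_depth;
	unsigned int cond_suspend_depth;
};

/* Record one more handler on the line, classified by its flags. */
void toy_irq_pm_install_action(struct toy_irq_desc *desc, unsigned int flags)
{
	desc->nr_actions++;
	if (flags & TOY_IRQF_NO_SUSPEND)
		desc->no_suspend_depth++;
	else if (flags & TOY_IRQF_COND_SUSPEND)
		desc->cond_suspend_depth++;
}

/*
 * True when a NO_SUSPEND user shares the line with a handler that
 * declared neither flag, i.e. one that may run unprepared after suspend.
 */
bool toy_irq_pm_mix_is_suspicious(const struct toy_irq_desc *desc)
{
	return desc->no_suspend_depth &&
	       desc->no_suspend_depth + desc->cond_suspend_depth !=
	       desc->nr_actions;
}
```

In this model a timer registering with the NO_SUSPEND bit next to a driver registering with the COND_SUSPEND bit raises no warning, while the same timer next to a driver with neither bit does.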
  6. 18 Feb, 2015 1 commit
    • genirq: Provide disable_hardirq() · 02cea395
      Committed by Peter Zijlstra
      For things like netpoll there is a need to disable an interrupt from
      atomic context. Currently netpoll uses disable_irq() which will
      sleep-wait on threaded handlers and thus forced_irqthreads breaks
      things.
      
      Provide disable_hardirq(), which uses synchronize_hardirq() to only wait
      for active hardirq handlers; also change synchronize_hardirq() to
      return the status of threaded handlers.
      
      This will allow one to try-disable an interrupt from atomic context, or
      in case of request_threaded_irq() to only wait for the hardirq part.
      Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Miller <davem@davemloft.net>
      Cc: Eyal Perry <eyalpe@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Quentin Lambert <lambert.quentin@gmail.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Link: http://lkml.kernel.org/r/20150205130623.GH5029@twins.programming.kicks-ass.net
      [ Fixed typos and such. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
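The "try-disable" semantics can be sketched in user space as follows. This is a simplified model with hypothetical `toy_` names: it assumes the disable always succeeds and that the hardirq part has already finished, so the only thing left to report is whether threaded handlers are still running.

```c
#include <stdbool.h>

/* Toy descriptor: just the two counters this sketch needs. */
struct toy_irq_desc {
	unsigned int depth;		/* disable nesting count */
	unsigned int threads_active;	/* threaded handlers still running */
};

/*
 * Real code would first wait for a running hardirq handler; the toy
 * only reports whether any threaded handler is still active.
 */
bool toy_synchronize_hardirq(const struct toy_irq_desc *desc)
{
	return desc->threads_active == 0;
}

/* Disable without sleeping; false means a threaded handler is in flight. */
bool toy_disable_hardirq(struct toy_irq_desc *desc)
{
	desc->depth++;
	return toy_synchronize_hardirq(desc);
}

/* Undo one level of disabling. */
void toy_enable_irq(struct toy_irq_desc *desc)
{
	if (desc->depth)
		desc->depth--;
}
```

A caller in atomic context would run its polling work only when the call returns true, and re-enable the interrupt in either case.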
  7. 14 Feb, 2015 1 commit
  8. 10 Feb, 2015 1 commit
  9. 23 Jan, 2015 1 commit
    • genirq: Set initial affinity in irq_set_affinity_hint() · e2e64a93
      Committed by Jesse Brandeburg
      Problem:
      The default behavior of the kernel is somewhat undesirable, as all
      requested interrupts end up on CPU0 after registration.  A user can
      run the irqbalance daemon, or can manually configure smp_affinity via
      the proc filesystem, but the default affinity of the interrupts for
      all devices is always CPU zero. This can cause performance problems
      or very heavy CPU use on a single core if not noticed and fixed by
      the user.
      
      Solution:
      Enable the setting of the initial affinity directly when the driver
      sets a hint.
      
      This enabling means that kernel drivers can include an initial
      affinity setting for the interrupt, instead of all interrupts starting
      out life on CPU0. Of course if irqbalance is still running then the
      interrupts will get moved as before.
      
      This function is currently called by drivers in block, crypto,
      infiniband, ethernet and scsi trees, but only a handful, so these will
      be the devices affected by this change.
      
      Tested on i40e, and default interrupts were spread across the CPUs
      according to the hint.
      
      drivers/block/mtip32xx/mtip32xx.c:3
      drivers/block/nvme-core.c:2
      drivers/crypto/qat/qat_dh895xcc/adf_isr.c:3
      drivers/infiniband/hw/qib/qib_iba7322.c:2
      drivers/net/ethernet/intel/i40e/i40e_main.c:3
      drivers/net/ethernet/intel/i40evf/i40evf_main.c:3
      drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3
      drivers/net/ethernet/mellanox/mlx4/en_cq.c:2
      drivers/scsi/hpsa.c:3
      drivers/scsi/lpfc/lpfc_init.c:3
      drivers/scsi/megaraid/megaraid_sas_base.c:8
      drivers/soc/ti/knav_qmss_acc.c:1
      drivers/soc/ti/knav_qmss_queue.c:2
      drivers/virtio/virtio_pci_common.c:2
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/20141219012206.4220.27491.stgit@jbrandeb-cp2.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
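The behavioral change (setting a hint now also sets the initial affinity) can be shown with a user-space model. The sketch uses a plain bitmask where the kernel uses a cpumask, treats an empty mask as "clear the hint", and all `toy_` names are hypothetical.

```c
/* One bit per CPU; a stand-in for the kernel's cpumask. */
#define TOY_CPU(n)	(1UL << (n))

struct toy_irq_desc {
	unsigned long affinity;		/* where the interrupt is delivered */
	unsigned long affinity_hint;	/* what the driver suggested */
};

/*
 * Record the driver's hint and, when the hint is non-empty, also apply
 * it as the interrupt's affinity. An empty mask only clears the hint.
 */
int toy_irq_set_affinity_hint(struct toy_irq_desc *desc, unsigned long hint)
{
	desc->affinity_hint = hint;
	if (hint)
		desc->affinity = hint;
	return 0;
}
```

Clearing the hint leaves the affinity wherever it currently is, which matches the commit's note that a balancing daemon remains free to move the interrupt afterwards.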
  10. 13 Dec, 2014 1 commit
    • genirq: Prevent proc race against freeing of irq descriptors · c291ee62
      Committed by Thomas Gleixner
      Since the rework of the sparse interrupt code to actually free the
      unused interrupt descriptors there exists a race between the /proc
      interfaces to the irq subsystem and the code which frees the interrupt
      descriptor.
      
      CPU0				CPU1
      				show_interrupts()
      				  desc = irq_to_desc(X);
      free_desc(desc)
        remove_from_radix_tree();
        kfree(desc);
      				  raw_spinlock_irq(&desc->lock);
      
      /proc/interrupts is the only interface which can actively corrupt
      kernel memory via the lock access. /proc/stat can only read from freed
      memory. Extremely hard to trigger, but possible.
      
      The interfaces in /proc/irq/N/ are not affected by this because the
      removal of the proc file is serialized in procfs against concurrent
      readers/writers. The removal happens before the descriptor is freed.
      
      For architectures which have CONFIG_SPARSE_IRQ=n this is a non-issue
      as the descriptor is never freed. It's merely cleared out with the irq
      descriptor lock held. So any concurrent proc access will either see
      the old correct value or the cleared out ones.
      
      Protect the lookup and access to the irq descriptor in
      show_interrupts() with the sparse_irq_lock.
      
      Provide kstat_irqs_usr() which is protecting the lookup and access
      with sparse_irq_lock and switch /proc/stat to use it.
      
      Document the existing kstat_irqs interfaces so it's clear that the
      caller needs to take care about protection. The users of these
      interfaces are either not affected due to SPARSE_IRQ=n or already
      protected against removal.
      
      Fixes: 1f5a5b87 "genirq: Implement a sane sparse_irq allocator"
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
  11. 08 Dec, 2014 1 commit
  12. 23 Nov, 2014 14 commits
  13. 09 Nov, 2014 2 commits
  14. 25 Sep, 2014 1 commit
  15. 03 Sep, 2014 1 commit
    • genirq: Add irq_domain-aware core IRQ handler · 76ba59f8
      Committed by Marc Zyngier
      Calling irq_find_mapping from outside an irq_{enter,exit} section is
      unsafe and produces ugly messages if CONFIG_PROVE_RCU is enabled.
      If coming from the idle state, the rcu_read_lock call in irq_find_mapping
      will generate an unpleasant warning:
      
      <quote>
      ===============================
      [ INFO: suspicious RCU usage. ]
      3.16.0-rc1+ #135 Not tainted
      -------------------------------
      include/linux/rcupdate.h:871 rcu_read_lock() used illegally while idle!
      
      other info that might help us debug this:
      
      RCU used illegally from idle CPU!
      rcu_scheduler_active = 1, debug_locks = 0
      RCU used illegally from extended quiescent state!
      1 lock held by swapper/0/0:
       #0:  (rcu_read_lock){......}, at: [<ffffffc00010206c>]
      irq_find_mapping+0x4c/0x198
      </quote>
      
      As this issue is fairly widespread and involves at least three
      different architectures, a possible solution is to add a new
      handle_domain_irq entry point into the generic IRQ code that
      the interrupt controller code can call.
      
      This new function takes an irq_domain, and calls into irq_find_mapping
      inside the irq_{enter,exit} block. An additional "lookup" parameter is
      used to allow non-domain architecture code to be replaced by this as well.
      
      Interrupt controllers can then be updated to use the new mechanism.
      
      This code is sitting behind a new CONFIG_HANDLE_DOMAIN_IRQ, as not all
      architectures implement set_irq_regs (yes, mn10300, I'm looking at you...).
      Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Link: https://lkml.kernel.org/r/1409047421-27649-2-git-send-email-marc.zyngier@arm.com
      Signed-off-by: Jason Cooper <jason@lakedaemon.net>
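The ordering this commit is about (do the domain lookup only after entering interrupt context) can be checked with a user-space model. Everything prefixed `toy_` is hypothetical: the domain is a fixed array, entering and leaving interrupt context is a counter, and register-state handling is left out.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_DOMAIN_SIZE 8

/* Toy domain: hwirq -> Linux irq number, 0 meaning "unmapped". */
struct toy_irq_domain {
	unsigned int map[TOY_DOMAIN_SIZE];
};

int toy_irq_depth;		/* nesting of toy_irq_enter/toy_irq_exit */
unsigned int toy_last_handled;	/* last irq number that was dispatched */
bool toy_lookup_was_inside;	/* was the lookup done in irq context? */

void toy_irq_enter(void) { toy_irq_depth++; }
void toy_irq_exit(void)  { toy_irq_depth--; }

/* Translate a hardware irq number, recording where the call came from. */
unsigned int toy_irq_find_mapping(const struct toy_irq_domain *domain,
				  unsigned int hwirq)
{
	toy_lookup_was_inside = toy_irq_depth > 0;
	return hwirq < TOY_DOMAIN_SIZE ? domain->map[hwirq] : 0;
}

/* Entry point for controller code: the lookup sits inside enter/exit. */
int toy_handle_domain_irq(const struct toy_irq_domain *domain,
			  unsigned int hwirq, bool lookup)
{
	unsigned int irq = hwirq;
	int ret = 0;

	toy_irq_enter();
	if (lookup)
		irq = toy_irq_find_mapping(domain, hwirq);
	if (irq == 0)
		ret = -EINVAL;		/* unmapped: nothing to dispatch */
	else
		toy_last_handled = irq;	/* stand-in for running the handler */
	toy_irq_exit();
	return ret;
}
```

With `lookup` set to false the hardware number is used directly, which corresponds to the commit's remark that code without a domain can use the same entry point.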
  16. 01 Sep, 2014 8 commits