  1. 18 Mar, 2014 1 commit
    • xen: add support for MSI message groups · 4892c9b4
      Committed by Roger Pau Monne
      Add support for MSI message groups for Xen Dom0 using the
      MAP_PIRQ_TYPE_MULTI_MSI pirq map type.
      
      In order to keep track of which pirq is the first one in the group, all
      pirqs in the MSI group except for the first one have the newly
      introduced PIRQ_MSI_GROUP flag set. This prevents calling
      PHYSDEVOP_unmap_pirq on them, since the unmap must be done with the
      first pirq in the group.
      Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
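The flag rule in the commit above can be sketched as a small model. This is an illustration only, not the kernel code: `struct pirq_info`, `setup_msi_group`, `needs_unmap`, `count_unmaps`, the flag's bit value, and the use of consecutive pirq numbers are all assumptions made for the example. Only the name `PIRQ_MSI_GROUP` and the rule "unmap only through the first pirq" come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit value; only the flag name comes from the commit message. */
#define PIRQ_MSI_GROUP (1u << 2)

/* Hypothetical per-pirq bookkeeping record. */
struct pirq_info {
    int pirq;
    unsigned int flags;
};

/*
 * Fill `group` with `nvec` pirqs (numbered consecutively here, as an
 * assumption) and flag every entry except the first one.
 */
static void setup_msi_group(struct pirq_info *group, size_t nvec, int first_pirq)
{
    assert(group != NULL && nvec > 0);
    for (size_t i = 0; i < nvec; i++) {
        group[i].pirq = first_pirq + (int)i;
        group[i].flags = (i == 0) ? 0 : PIRQ_MSI_GROUP;
    }
}

/* Only the unflagged (first) pirq of a group may be unmapped. */
static bool needs_unmap(const struct pirq_info *info)
{
    return !(info->flags & PIRQ_MSI_GROUP);
}

/* Count how many unmap calls tearing down the whole group would issue. */
static size_t count_unmaps(const struct pirq_info *group, size_t nvec)
{
    size_t n = 0;
    for (size_t i = 0; i < nvec; i++)
        if (needs_unmap(&group[i]))
            n++;
    return n;
}
```

In this model, tearing down a group of any size issues exactly one unmap, through the first pirq.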
  2. 01 Mar, 2014 1 commit
  3. 06 Jan, 2014 3 commits
  4. 21 Jun, 2013 1 commit
    • x86, trace: Add irq vector tracepoints · cf910e83
      Committed by Seiji Aguchi
      [Purpose of this patch]
      
      As Vaibhav explained in the thread below, tracepoints for irq vectors
      are useful.
      
      http://www.spinics.net/lists/mm-commits/msg85707.html
      
      <snip>
      The current interrupt traces from irq_handler_entry and irq_handler_exit
      provide when an interrupt is handled.  They provide good data about when
      the system has switched to kernel space and how it affects the currently
      running processes.
      
      There are some IRQ vectors which trigger the system into kernel space,
      which are not handled in generic IRQ handlers.  Tracing such events gives
      us the information about IRQ interaction with other system events.
      
      The trace also tells where the system is spending its time.  We want to
      know which cores are handling interrupts and how they are affecting other
      processes in the system.  Also, the trace provides information about when
      the cores are idle and which interrupts are changing that state.
      <snip>
      
      My use case, on the other hand, is tracing only the local timer event
      and getting the value of the instruction pointer.

      I previously suggested adding an argument to the local timer event to
      get the instruction pointer. But it can also be obtained with an
      external module such as systemtap, so I no longer need to add any
      argument to the irq vector tracepoints.
      
      [Patch Description]
      
      Vaibhav's patch shared one pair of tracepoints, irq_vector_entry/irq_vector_exit,
      across all events. But as described above, there is a use case for tracing a
      specific irq vector rather than all events, and in that case we are concerned
      about the overhead of unwanted events.

      So, instead of introducing irq_vector_entry/exit, add the following tracepoints
      so that they can be enabled independently:
         - local_timer_vector
         - reschedule_vector
         - call_function_vector
         - call_function_single_vector
         - irq_work_entry_vector
         - error_apic_vector
         - thermal_apic_vector
         - threshold_apic_vector
         - spurious_apic_vector
         - x86_platform_ipi_vector
      
      Also, introduce logic that switches the IDT when tracepoints are enabled or
      disabled, so that there is zero time penalty while tracepoints are disabled.
      Detailed explanations are as follows.
       - Create trace irq handlers with entering_irq()/exiting_irq().
       - Create a new IDT, trace_idt_table, at boot time by adding logic to
         _set_gate(). It is just a copy of the original IDT.
       - Register the new tracepoint handlers in the new IDT by introducing
         macros into alloc_intr_gate(), which is called when irq_vector handlers
         are registered.
       - Add a check for whether irq vector tracing is on or off to
         load_current_idt(). It has to come after the debug check for these reasons.
         - Switching to the debug IDT may be kicked while tracing is enabled.
         - On the other hand, switching to the trace IDT is kicked only when
           debugging is disabled.

      In addition, the new IDT is created only when CONFIG_TRACING is enabled, to
      avoid it being used for other purposes.
      Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
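The ordering constraint described for load_current_idt() in the commit above (debug check first, trace check second) can be illustrated with a minimal selection function. This is a sketch under assumptions, not the kernel implementation: `enum idt_kind`, `select_idt`, and `check_idt_priority` are hypothetical names, and the real function loads a descriptor table rather than returning a label.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three tables a CPU could load. */
enum idt_kind { IDT_DEFAULT, IDT_DEBUG, IDT_TRACE };

/*
 * Pick the table to load. The debug check comes first, so the trace IDT
 * is selected only while debugging is disabled.
 */
static enum idt_kind select_idt(bool debug_enabled, bool tracing_enabled)
{
    if (debug_enabled)
        return IDT_DEBUG;
    if (tracing_enabled)
        return IDT_TRACE;
    return IDT_DEFAULT;
}

/* Self-check of the priority order; aborts if the order is wrong. */
static void check_idt_priority(void)
{
    assert(select_idt(true, true) == IDT_DEBUG);
    assert(select_idt(false, true) == IDT_TRACE);
    assert(select_idt(false, false) == IDT_DEFAULT);
}
```

With both switches off, the default table is selected, which matches the commit's stated goal of paying nothing while tracepoints are disabled.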
  5. 17 Apr, 2013 1 commit
  6. 17 Aug, 2012 1 commit
  7. 14 Sep, 2012 1 commit
  8. 20 Jul, 2012 1 commit
    • xen PVonHVM: move shared_info to MMIO before kexec · 00e37bdb
      Committed by Olaf Hering
      Currently kexec in a PVonHVM guest fails with a triple fault because the
      new kernel overwrites the shared info page. The exact failure depends on
      the size of the kernel image. This patch moves the pfn from RAM into
      MMIO space before the kexec boot.
      
      The pfn containing the shared_info is located somewhere in RAM. This
      will cause trouble if the current kernel is doing a kexec boot into a
      new kernel. The new kernel (and its startup code) cannot know where the
      pfn is, so it cannot reserve the page. The hypervisor will continue to
      update the pfn, and as a result memory corruption occurs in the new
      kernel.
      
      One way to work around this issue is to allocate a page in the
      xen-platform pci device's BAR memory range. But pci init is done very
      late and the shared_info page is already in use very early to read the
      pvclock. So moving the pfn from RAM to MMIO is racy because some code
      paths on other vcpus could access the pfn during the small window when
      the old pfn is moved to the new pfn. There is even a small window where
      the old pfn is not backed by a mfn, and during that time all reads
      return -1.
      
      Because it is not known upfront where the MMIO region is located, it
      cannot be used right from the start in xen_hvm_init_shared_info.
      
      To minimise trouble the move of the pfn is done shortly before kexec.
      This does not eliminate the race because all vcpus are still online when
      the syscore_ops will be called. But hopefully there is no work pending
      at this point in time. Also the syscore_op is run last which reduces the
      risk further.
      Signed-off-by: Olaf Hering <olaf@aepfle.de>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  9. 22 May, 2012 1 commit
  10. 22 Nov, 2011 1 commit
    • xen/event: Add reference counting to event channels · 420eb554
      Committed by Daniel De Graaf
      Event channels exposed to userspace by the evtchn module may be used by
      other modules in an asynchronous manner, which requires that reference
      counting be used to prevent the event channel from being closed before
      the signals are delivered.
      
      The reference count on new event channels defaults to -1 which indicates
      the event channel is not referenced outside the kernel; evtchn_get fails
      if called on such an event channel. The event channels made visible to
      userspace by evtchn have a normal reference count.
      Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
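The reference-count convention in the commit above (a count of -1 means "not referenced outside the kernel", and taking a reference on such a channel fails) can be modelled roughly as follows. This is a simplified, single-threaded sketch with no locking. `struct chan` and the `chan_*` helpers are hypothetical names for illustration and are not the kernel's functions; closing the channel when the count reaches zero is also an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one event channel. */
struct chan {
    int refcnt;  /* -1: kernel-internal, not reference counted */
    bool open;
};

/* New channels start out with the "not referenced outside" marker. */
static void chan_init(struct chan *c)
{
    c->refcnt = -1;
    c->open = true;
}

/* Channels made visible to userspace get a normal count, starting at 1. */
static void chan_make_refcounted(struct chan *c)
{
    assert(c->refcnt == -1);
    c->refcnt = 1;
}

/* Take a reference; fails (returns -1) on a channel that is not refcounted. */
static int chan_get(struct chan *c)
{
    if (c->refcnt <= 0)
        return -1;
    c->refcnt++;
    return 0;
}

/* Drop a reference; in this model the channel closes when the last one goes. */
static void chan_put(struct chan *c)
{
    assert(c->refcnt > 0);
    if (--c->refcnt == 0)
        c->open = false;
}
```

A module that uses the channel asynchronously would hold a reference for as long as a signal may still be delivered, so the channel cannot close underneath it.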
  11. 12 Jul, 2011 1 commit
    • xen/pci: Remove 'xen_allocate_pirq_gsi'. · 78316ada
      Committed by Konrad Rzeszutek Wilk
      In the past (2.6.38) the 'xen_allocate_pirq_gsi' would allocate
      an entry in a Linux IRQ -> {XEN_IRQ, type, event, ..} array. All
      of that has been removed in 2.6.39 and the Xen IRQ subsystem uses
      a linked list that is populated when the call to
      'xen_allocate_irq_gsi' (universally done from any of the xen_bind_*
      calls) is done. The 'xen_allocate_pirq_gsi' is a NOP and there is
      no need for it anymore, so let's remove it.
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  12. 14 Apr, 2011 3 commits
  13. 11 Mar, 2011 8 commits
  14. 28 Feb, 2011 1 commit
  15. 02 Dec, 2010 1 commit
    • xen: fix MSI setup and teardown for PV on HVM guests · af42b8d1
      Committed by Stefano Stabellini
      When remapping MSIs into pirqs for PV on HVM guests, qemu is responsible
      for doing the actual mapping and unmapping.
      We only give qemu the desired pirq number when we ask to do the mapping
      the first time; after that we should read the pirq number back from
      qemu every time we want to re-enable the MSI.
      
      This fixes a bug in xen_hvm_setup_msi_irqs that manifests itself when
      trying to enable the same MSI for the second time: the old MSI to pirq
      mapping is still valid at this point but xen_hvm_setup_msi_irqs would
      try to assign a new pirq anyway.
      A simple way to reproduce this bug is to assign an MSI capable network
      card to a PV on HVM guest. If the user brings the corresponding
      ethernet interface down and up again, Linux fails to enable MSIs on
      the device.
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  16. 23 Oct, 2010 4 commits
  17. 18 Oct, 2010 5 commits
  18. 23 Jul, 2010 1 commit
  19. 31 Mar, 2009 1 commit
  20. 21 Aug, 2008 1 commit
    • xen: save previous spinlock when blocking · 168d2f46
      Committed by Jeremy Fitzhardinge
      A spinlock can be interrupted while spinning, so make sure we preserve
      the previous lock of interest if we're taking a lock from within an
      interrupt handler.
      
      We also need to deal with the case where the blocking path gets
      interrupted between testing to see if the lock is free and actually
      blocking.  If we get interrupted there and end up in the state where
      the lock is free but the irq isn't pending, then we'll block
      indefinitely in the hypervisor.  The fix is to make sure that any
      nested lock-takers will always leave the irq pending if there's any
      chance the outer lock became free.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  21. 16 Jul, 2008 1 commit
    • xen: implement Xen-specific spinlocks · 2d9e1e2f
      Committed by Jeremy Fitzhardinge
      The standard ticket spinlocks are very expensive in a virtual
      environment, because their performance depends on Xen's scheduler
      giving vcpus time in the order that they're supposed to take the
      spinlock.
      
      This implements a Xen-specific spinlock, which should be much more
      efficient.
      
      The fast-path is essentially the old Linux-x86 locks, using a single
      lock byte.  The locker decrements the byte; if the result is 0, then
      they have the lock.  If the lock is negative, then the locker must spin
      until the lock is positive again.
      
      When there's contention, the locker spins for 2^16[*] iterations waiting
      to get the lock.  If it fails to get the lock in that time, it adds
      itself to the contention count in the lock and blocks on a per-cpu
      event channel.
      
      When unlocking the spinlock, the locker looks to see if there's anyone
      blocked waiting for the lock by checking for a non-zero waiter count.
      If there's a waiter, it traverses the per-cpu "lock_spinners"
      variable, which contains which lock each CPU is waiting on.  It picks
      one CPU waiting on the lock and sends it an event to wake it up.
      
      This allows efficient fast-path spinlock operation, while allowing
      spinning vcpus to give up their processor time while waiting for a
      contended lock.
      
      [*] 2^16 iterations is the threshold at which 98% of locks have been
      taken, according to Thomas Friebel's Xen Summit talk "Preventing Guests
      from Spinning Around".  Therefore, we'd expect the lock and unlock slow
      paths to be entered only 2% of the time.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Lameter <clameter@linux-foundation.org>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Virtualization <virtualization@lists.linux-foundation.org>
      Cc: Xen devel <xen-devel@lists.xensource.com>
      Cc: Thomas Friebel <thomas.friebel@amd.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  22. 27 May, 2008 1 commit
    • xen: implement save/restore · 0e91398f
      Committed by Jeremy Fitzhardinge
      This patch implements Xen save/restore and migration.
      
      Saving is triggered via xenbus, which is polled in
      drivers/xen/manage.c.  When a suspend request comes in, the kernel
      prepares itself for saving by:
      
      1 - Freeze all processes.  This is primarily to prevent any
          partially-completed pagetable updates from confusing the suspend
          process.  If CONFIG_PREEMPT isn't defined, then this isn't necessary.
      
      2 - Suspend xenbus and other devices
      
      3 - Stop_machine, to make sure all the other vcpus are quiescent.  The
          Xen tools require the domain to run its save off vcpu0.
      
      4 - Within the stop_machine state, pin any unpinned pgds (under
          construction or destruction) and canonicalize various other
          pieces of state (mostly converting mfns to pfns).
      
      5 - Suspend the domain
      
      Restore reverses the steps used to save the domain, ending when all
      the frozen processes are thawed.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>