1. 12 Jan 2009, 1 commit
    •
      cpumask: update irq_desc to use cpumask_var_t · 7f7ace0c
      Mike Travis authored
      Impact: reduce memory usage, use new cpumask API.
      
      Replace the affinity and pending_masks with cpumask_var_t's.  This adds
      to the significant size reduction done with the SPARSE_IRQS changes.
      
      The added functions (init_alloc_desc_masks & init_copy_desc_masks) are
      in the include file so they can be inlined (and optimized out for the
      !CONFIG_CPUMASK_OFFSTACK case).  [Naming chosen to be consistent with
      the other init*irq functions, as well as the backwards arg order
      of "from, to" instead of the more common "to, from" standard.]
      
      Includes a slight change to the declaration of struct irq_desc to embed
      the pending_mask within #ifdef CONFIG_SMP, to be consistent with other
      references, and some small changes to Xen.
      
      Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: virtualization@lists.osdl.org
      Cc: xen-devel@lists.xensource.com
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      7f7ace0c
  2. 26 Dec 2008, 1 commit
  3. 18 Dec 2008, 1 commit
  4. 13 Dec 2008, 1 commit
  5. 08 Dec 2008, 1 commit
    •
      sparse irq_desc[] array: core kernel and x86 changes · 0b8f1efa
      Yinghai Lu authored
      Impact: new feature
      
      Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with
      NR_CPUS set to large values. The goal is to be able to scale up to much
      larger NR_IRQS value without impacting the (important) common case.
      
      To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of
      irq_desc pointers.
      
      When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to allocate each
      irq_desc, which also makes the IRQ descriptors NUMA-local (to the site
      that calls request_irq()).
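
      A minimal sketch of the allocate-on-demand lookup this implies (the
      pointer array and init helper named here are illustrative, not the
      exact core/x86 code):

      static struct irq_desc *sparse_irq_to_desc_alloc(unsigned int irq, int node)
      {
              struct irq_desc *desc = irq_desc_ptrs[irq];   /* hypothetical pointer array */

              if (desc)
                      return desc;

              /* allocate the descriptor NUMA-local to the requesting node */
              desc = kzalloc_node(sizeof(*desc), GFP_ATOMIC, node);
              if (!desc)
                      return NULL;

              init_one_irq_desc(irq, desc);                 /* hypothetical init helper */
              irq_desc_ptrs[irq] = desc;
              return desc;
      }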
      
      This gets rid of the static irq_cfg[] array on x86 as well: x86 now
      stores irq_cfg in desc->chip_data.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0b8f1efa
  6. 24 Oct 2008, 1 commit
  7. 22 Oct 2008, 1 commit
  8. 16 Oct 2008, 4 commits
  9. 25 Aug 2008, 1 commit
    •
      xen: implement CPU hotplugging · d68d82af
      Alex Nixon authored
      Note the changes from 2.6.18-xen CPU hotplugging:
      
      A vcpu_down request from the remote admin via Xenbus hotunplugs the
      CPU and disables it, removing it from the cpu_present map and removing
      its entry in /sys.
      
      A vcpu_up request from the remote admin only re-enables the CPU, and does
      not immediately bring the CPU up.  A udev event is emitted, which can be
      caught by the user if they wish to automatically bring CPUs back up when
      available, or to implement a more complex policy.
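
      A rough sketch of that policy (helper names are the later, generic
      kernel names and are approximate; the real code reacts to a xenstore
      watch on the vcpu's availability node):

      static void sketch_vcpu_hotplug(unsigned int cpu, bool make_available)
      {
              if (!make_available) {
                      /* vcpu_down: actually offline the CPU ... */
                      cpu_down(cpu);
                      /* ... and drop it from cpu_present (and its /sys entry) */
                      set_cpu_present(cpu, false);
              } else {
                      /* vcpu_up: only mark it present again; a udev event lets
                       * the admin or a policy daemon decide when to online it */
                      set_cpu_present(cpu, true);
              }
      }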
      Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
      Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d68d82af
  10. 21 Aug 2008, 1 commit
    •
      xen: save previous spinlock when blocking · 168d2f46
      Jeremy Fitzhardinge authored
      A spinlock can be interrupted while spinning, so make sure we preserve
      the previous lock of interest if we're taking a lock from within an
      interrupt handler.
      
      We also need to deal with the case where the blocking path gets
      interrupted between testing to see if the lock is free and actually
      blocking.  If we get interrupted there and end up in the state where
      the lock is free but the irq isn't pending, then we'll block
      indefinitely in the hypervisor.  The fix is to make sure that any
      nested lock-takers will always leave the irq pending if there's any
      chance the outer lock became free.
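
      A sketch of the save/restore of the per-cpu "lock of interest" (the
      per-cpu accessors and names are approximate; the irq-pending handling
      is omitted here):

      static struct xen_spinlock *sketch_spinning_lock(struct xen_spinlock *xl)
      {
              struct xen_spinlock *prev;

              prev = __this_cpu_read(lock_spinners);   /* outer lock, if we were interrupted */
              __this_cpu_write(lock_spinners, xl);     /* advertise the lock we block on now */
              return prev;
      }

      static void sketch_unspinning_lock(struct xen_spinlock *prev)
      {
              __this_cpu_write(lock_spinners, prev);   /* restore the outer lock of interest */
      }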
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      168d2f46
  11. 31 Jul 2008, 1 commit
  12. 16 Jul 2008, 1 commit
    •
      xen: implement Xen-specific spinlocks · 2d9e1e2f
      Jeremy Fitzhardinge authored
      The standard ticket spinlocks are very expensive in a virtual
      environment, because their performance depends on Xen's scheduler
      giving vcpus time in the order that they're supposed to take the
      spinlock.
      
      This implements a Xen-specific spinlock, which should be much more
      efficient.
      
      The fast path is essentially the old Linux x86 lock, using a single
      lock byte.  The locker decrements the byte; if the result is 0, it
      has the lock.  If the result is negative, the locker must spin until
      the lock is positive again.
      
      When there's contention, the locker spins for 2^16[*] iterations waiting
      to get the lock.  If it fails to get the lock in that time, it adds
      itself to the contention count in the lock and blocks on a per-cpu
      event channel.
      
      When unlocking the spinlock, the holder checks whether anyone is
      blocked waiting for the lock by looking for a non-zero waiter count.
      If there's a waiter, it traverses the per-cpu "lock_spinners"
      variable, which records which lock each CPU is waiting on.  It picks
      one CPU waiting on the lock and sends it an event to wake it up.
      
      This allows efficient fast-path spinlock operation, while allowing
      spinning vcpus to give up their processor time while waiting for a
      contended lock.
      
      [*] 2^16 iterations is the threshold at which 98% of locks have been
      taken, according to Thomas Friebel's Xen Summit talk "Preventing Guests
      from Spinning Around".  Therefore, we'd expect the lock and unlock slow
      paths to be entered only about 2% of the time.
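
      A condensed sketch of the lock/unlock flow described above (all
      sketch_* helpers are placeholders; the real patch handles the byte
      lock fast path in asm):

      #define SPIN_THRESHOLD (1 << 16)

      static void sketch_xen_spin_lock(struct xen_spinlock *xl)
      {
              unsigned i;

              for (;;) {
                      /* spin on the single lock byte for a bounded time */
                      for (i = 0; i < SPIN_THRESHOLD; i++) {
                              if (sketch_try_lock_byte(xl))    /* hypothetical decrement/test */
                                      return;
                              cpu_relax();
                      }

                      /* contended: count ourselves as a waiter, record the lock
                       * in this cpu's lock_spinners slot, and block on our
                       * per-cpu event channel until the holder kicks us */
                      sketch_add_waiter(xl);
                      sketch_block_on_percpu_evtchn();
                      sketch_remove_waiter(xl);
              }
      }

      static void sketch_xen_spin_unlock(struct xen_spinlock *xl)
      {
              sketch_release_lock_byte(xl);            /* fast path: just clear the byte */

              if (sketch_waiter_count(xl))             /* slow path: someone is blocked */
                      sketch_kick_one_waiter(xl);      /* scan lock_spinners, send one event */
      }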
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Lameter <clameter@linux-foundation.org>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Virtualization <virtualization@lists.linux-foundation.org>
      Cc: Xen devel <xen-devel@lists.xensource.com>
      Cc: Thomas Friebel <thomas.friebel@amd.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2d9e1e2f
  13. 20 Jun 2008, 2 commits
  14. 27 May 2008, 3 commits
    •
      xen: implement save/restore · 0e91398f
      Jeremy Fitzhardinge authored
      This patch implements Xen save/restore and migration.
      
      Saving is triggered via xenbus, which is polled in
      drivers/xen/manage.c.  When a suspend request comes in, the kernel
      prepares itself for saving by:
      
      1 - Freeze all processes.  This is primarily to prevent any
          partially-completed pagetable updates from confusing the suspend
          process.  If CONFIG_PREEMPT isn't defined, then this isn't necessary.
      
      2 - Suspend xenbus and other devices
      
      3 - stop_machine(), to make sure all the other vcpus are quiescent.  The
          Xen tools require the domain to run its save off vcpu0.
      
      4 - Within the stop_machine state, it pins any unpinned pgds (under
          construction or destruction), canonicalizes various other pieces
          of state (mostly converting mfns to pfns), and finally
      
      5 - Suspend the domain
      
      Restore reverses the steps used to save the domain, ending when all
      the frozen processes are thawed.
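
      A very rough sketch of that suspend sequence (the sketch_* helpers are
      placeholders and error handling is omitted):

      static int sketch_do_suspend(void)
      {
              int err;

              err = freeze_processes();               /* step 1 */
              if (err)
                      return err;

              sketch_suspend_xenbus_and_devices();    /* step 2 (placeholder) */

              /* steps 3-5: quiesce the other vcpus, then run the pinning,
               * canonicalization and the actual suspend hypercall on vcpu0 */
              err = stop_machine(sketch_xen_suspend, NULL, cpumask_of(0));

              sketch_resume_xenbus_and_devices();     /* unwound on restore/cancel */
              thaw_processes();
              return err;
      }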
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      0e91398f
    •
      xen: fix unbind_from_irq() · 0f2287ad
      Jeremy Fitzhardinge authored
      Rearrange the tests in unbind_from_irq() so that we can still unbind
      an irq even if the underlying event channel is bad.  This allows a
      device driver to shuffle its irqs on save/restore before the
      underlying event channels have been fixed up.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      0f2287ad
    •
      xen: add rebind_evtchn_irq · eb1e305f
      Jeremy Fitzhardinge authored
      Add rebind_evtchn_irq(), which will rebind a device driver's existing
      irq to a new event channel on restore.  Since the new event channel
      will be masked and bound to vcpu0, we update the state accordingly and
      unmask the irq once everything is set up.
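
      A sketch of the idea (the mapping table and per-irq bookkeeping shown
      here only approximate the events code):

      void sketch_rebind_evtchn_irq(int evtchn, int irq)
      {
              /* point the driver's existing irq at the freshly bound channel */
              evtchn_to_irq[evtchn] = irq;
              sketch_update_irq_info(irq, evtchn);   /* hypothetical per-irq state update */

              /* the new channel comes back masked and routed to vcpu0 ... */
              bind_evtchn_to_cpu(evtchn, 0);

              /* ... so once the state is consistent, let it fire again */
              unmask_evtchn(evtchn);
      }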
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      eb1e305f
  15. 25 Apr 2008, 6 commits
  16. 30 Jan 2008, 2 commits
  17. 24 Oct 2007, 1 commit
  18. 11 Oct 2007, 1 commit
  19. 20 Jul 2007, 1 commit
  20. 18 Jul 2007, 3 commits
    •
      xen: use the hvc console infrastructure for Xen console · b536b4b9
      Jeremy Fitzhardinge authored
      Implement a Xen back-end for hvc console.
      
      * * *
      Add early printk support via the hvc console, enabled by passing
      "earlyprintk=xen" on the kernel command line.
      
      From: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Olof Johansson <olof@lixom.net>
      b536b4b9
    •
      xen: SMP guest support · f87e4cac
      Jeremy Fitzhardinge authored
      This is a fairly straightforward Xen implementation of smp_ops.
      
      Xen has its own IPI mechanisms, and has no dependency on any
      APIC-based IPI.  The smp_ops hooks and the flush_tlb_others pv_op
      allow a Xen guest to avoid all APIC code in arch/i386 (the only apic
      operation is a single apic_read for the apic version number).
      
      One subtle point which needs to be addressed is unpinning pagetables
      when another cpu may have a lazy tlb reference to the pagetable. Xen
      will not allow an in-use pagetable to be unpinned, so we must find any
      other cpus with a reference to the pagetable and get them to shoot
      down their references.
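
      A sketch of the kind of smp_ops a Xen guest registers in place of the
      APIC-based defaults (field and function names are approximate for this
      era of the pv_ops interface):

      static struct smp_ops sketch_xen_smp_ops = {
              .smp_prepare_cpus       = sketch_xen_smp_prepare_cpus,
              .smp_cpus_done          = sketch_xen_smp_cpus_done,
              .cpu_up                 = sketch_xen_cpu_up,
              .smp_send_stop          = sketch_xen_smp_send_stop,
              .smp_send_reschedule    = sketch_xen_send_reschedule,   /* IPI via an event channel */
              .smp_call_function_mask = sketch_xen_call_function_mask,
      };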
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Andi Kleen <ak@suse.de>
      f87e4cac
    •
      xen: event channels · e46cdb66
      Jeremy Fitzhardinge authored
      Xen implements interrupts in terms of event channels.  Each guest
      domain gets 1024 event channels which can be used for a variety of
      purposes, such as Xen timer events, inter-domain events,
      inter-processor events (IPI) or for real hardware IRQs.
      
      Within the kernel, we map the event channels to IRQs, and implement
      the whole interrupt handling using a Xen irq_chip.
      
      Rather than setting NR_IRQS to 1024 under PARAVIRT in order to
      accommodate Xen, we create a dynamic mapping between event channels and
      IRQs.  Ideally, Linux will eventually move towards dynamically
      allocating per-irq structures, and we can use a 1:1 mapping between
      event channels and irqs.
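
      A minimal sketch of the dynamic mapping (the lookup table and irq
      allocator named here are illustrative):

      static int sketch_bind_evtchn_to_irq(unsigned int evtchn)
      {
              int irq = evtchn_to_irq[evtchn];          /* hypothetical mapping table */

              if (irq == -1) {
                      irq = sketch_find_unbound_irq();  /* hypothetical irq allocator */
                      set_irq_chip_and_handler(irq, &sketch_xen_dynamic_chip,
                                               handle_level_irq);
                      evtchn_to_irq[evtchn] = irq;
              }
              return irq;
      }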
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      e46cdb66