1. 20 9月, 2011 1 次提交
    • A
      powerpc/xics: Add __init to marker icp_native_init() · cf01a404
      Arnaud Lacombe 提交于
      This should fix the following warning:
      
       LD      arch/powerpc/sysdev/xics/built-in.o
      WARNING: arch/powerpc/sysdev/xics/built-in.o(.text+0x1310): Section mismatch in
      reference from the function .icp_native_init() to the function
      .init.text:.icp_native_init_one_node()
      The function .icp_native_init() references
      the function __init .icp_native_init_one_node().
      This is often because .icp_native_init lacks a __init
      annotation or the annotation of .icp_native_init_one_node is wrong.
      
      icp_native_init() is only referenced in `arch/powerpc/sysdev/xics/xics-common.c'
      by xics_init() which is itself marked with __init.
      
      = not built-tested =
      Reported-by: NTimur Tabi <timur@freescale.com>
      Signed-off-by: NArnaud Lacombe <lacombar@gmail.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      cf01a404
  2. 12 7月, 2011 1 次提交
    • P
      KVM: PPC: Allow book3s_hv guests to use SMT processor modes · 371fefd6
      Paul Mackerras 提交于
      This lifts the restriction that book3s_hv guests can only run one
      hardware thread per core, and allows them to use up to 4 threads
      per core on POWER7.  The host still has to run single-threaded.
      
      This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
      capability.  The return value of the ioctl querying this capability
      is the number of vcpus per virtual CPU core (vcore), currently 4.
      
      To use this, the host kernel should be booted with all threads
      active, and then all the secondary threads should be offlined.
      This will put the secondary threads into nap mode.  KVM will then
      wake them from nap mode and use them for running guest code (while
      they are still offline).  To wake the secondary threads, we send
      them an IPI using a new xics_wake_cpu() function, implemented in
      arch/powerpc/sysdev/xics/icp-native.c.  In other words, at this stage
      we assume that the platform has a XICS interrupt controller and
      we are using icp-native.c to drive it.  Since the woken thread will
      need to acknowledge and clear the IPI, we also export the base
      physical address of the XICS registers using kvmppc_set_xics_phys()
      for use in the low-level KVM book3s code.
      
      When a vcpu is created, it is assigned to a virtual CPU core.
      The vcore number is obtained by dividing the vcpu number by the
      number of threads per core in the host.  This number is exported
      to userspace via the KVM_CAP_PPC_SMT capability.  If qemu wishes
      to run the guest in single-threaded mode, it should make all vcpu
      numbers be multiples of the number of threads per core.
      
      We distinguish three states of a vcpu: runnable (i.e., ready to execute
      the guest), blocked (that is, idle), and busy in host.  We currently
      implement a policy that the vcore can run only when all its threads
      are runnable or blocked.  This way, if a vcpu needs to execute elsewhere
      in the kernel or in qemu, it can do so without being starved of CPU
      by the other vcpus.
      
      When a vcore starts to run, it executes in the context of one of the
      vcpu threads.  The other vcpu threads all go to sleep and stay asleep
      until something happens requiring the vcpu thread to return to qemu,
      or to wake up to run the vcore (this can happen when another vcpu
      thread goes from busy in host state to blocked).
      
      It can happen that a vcpu goes from blocked to runnable state (e.g.
      because of an interrupt), and the vcore it belongs to is already
      running.  In that case it can start to run immediately as long as
      the none of the vcpus in the vcore have started to exit the guest.
      We send the next free thread in the vcore an IPI to get it to start
      to execute the guest.  It synchronizes with the other threads via
      the vcore->entry_exit_count field to make sure that it doesn't go
      into the guest if the other vcpus are exiting by the time that it
      is ready to actually enter the guest.
      
      Note that there is no fixed relationship between the hardware thread
      number and the vcpu number.  Hardware threads are assigned to vcpus
      as they become runnable, so we will always use the lower-numbered
      hardware threads in preference to higher-numbered threads if not all
      the vcpus in the vcore are runnable, regardless of which vcpus are
      runnable.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      371fefd6
  3. 10 6月, 2011 1 次提交
  4. 19 5月, 2011 2 次提交
    • M
      powerpc: Consolidate ipi message mux and demux · 23d72bfd
      Milton Miller 提交于
      Consolidate the mux and demux of ipi messages into smp.c and call
      a new smp_ops callback to actually trigger the ipi.
      
      The powerpc architecture code is optimised for having 4 distinct
      ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
      single, scheduler ipi, and enter debugger).  However, several interrupt
      controllers only provide a single software triggered interrupt that
      can be delivered to each cpu.  To resolve this limitation, each smp_ops
      implementation created a per-cpu variable that is manipulated with atomic
      bitops.  Since these lines will be contended they are optimialy marked as
      shared_aligned and take a full cache line for each cpu.  Distro kernels
      may have 2 or 3 of these in their config, each taking per-cpu space
      even though at most one will be in use.
      
      This consolidation removes smp_message_recv and replaces the single call
      actions cases with direct calls from the common message recognition loop.
      The complicated debugger ipi case with its muxed crash handling code is
      moved to debug_ipi_action which is now called from the demux code (instead
      of the multi-message action calling smp_message_recv).
      
      I put a call to reschedule_action to increase the likelyhood of correctly
      merging the anticipated scheduler_ipi() hook coming from the scheduler
      tree; that single required call can be inlined later.
      
      The actual message decode is a copy of the old pseries xics code with its
      memory barriers and cache line spacing, augmented with a per-cpu unsigned
      long based on the book-e doorbell code.  The optional data is set via a
      callback from the implementation and is passed to the new cause-ipi hook
      along with the logical cpu number.  While currently only the doorbell
      implemntation uses this data it should be almost zero cost to retrieve and
      pass it -- it adds a single register load for the argument from the same
      cache line to which we just completed a store and the register is dead
      on return from the call.  I extended the data element from unsigned int
      to unsigned long in case some other code wanted to associate a pointer.
      
      The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
      conditioned on the CPU_DBELL feature.  The ifdef guard could be relaxed
      to CONFIG_SMP but I left it with BOOKE for now.
      
      Also, the doorbell interrupt vector for book-e was not calling irq_enter
      and irq_exit, which throws off cpu accounting and causes code to not
      realize it is running in interrupt context.  Add the missing calls.
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      23d72bfd
    • M
      powerpc: Remove checks for MSG_ALL and MSG_ALL_BUT_SELF · f1072939
      Milton Miller 提交于
      Now that smp_ops->smp_message_pass is always called with an (online) cpu
      number for the target remove the checks for MSG_ALL and MSG_ALL_BUT_SELF.
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f1072939
  5. 06 5月, 2011 1 次提交
  6. 04 5月, 2011 1 次提交
  7. 20 4月, 2011 1 次提交
    • B
      powerpc/xics: Rewrite XICS driver · 0b05ac6e
      Benjamin Herrenschmidt 提交于
      This is a significant rework of the XICS driver, too significant to
      conveniently break it up into a series of smaller patches to be honest.
      
      The driver is moved to a more generic location to allow new platforms
      to use it, and is broken up into separate ICP and ICS "backends". For
      now we have the native and "hypervisor" ICP backends and one common
      RTAS ICS backend.
      
      The driver supports one ICP backend instanciation, and many ICS ones,
      in order to accomodate future platforms with multiple possibly different
      interrupt "sources" mechanisms.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0b05ac6e