1. 06 12月, 2012 1 次提交
    • P
      KVM: PPC: Book3S HV: Restructure HPT entry creation code · 7ed661bf
      Paul Mackerras 提交于
      This restructures the code that creates HPT (hashed page table)
      entries so that it can be called in situations where we don't have a
      struct vcpu pointer, only a struct kvm pointer.  It also fixes a bug
      where kvmppc_map_vrma() would corrupt the guest R4 value.
      
      Most of the work of kvmppc_virtmode_h_enter is now done by a new
      function, kvmppc_virtmode_do_h_enter, which itself calls another new
      function, kvmppc_do_h_enter, which contains most of the old
      kvmppc_h_enter.  The new kvmppc_do_h_enter takes explicit arguments
      for the place to return the HPTE index, the Linux page tables to use,
      and whether it is being called in real mode, thus removing the need
      for it to have the vcpu as an argument.
      
      Currently kvmppc_map_vrma creates the VRMA (virtual real mode area)
      HPTEs by calling kvmppc_virtmode_h_enter, which is designed primarily
      to handle H_ENTER hcalls from the guest that need to pin a page of
      memory.  Since H_ENTER returns the index of the created HPTE in R4,
      kvmppc_virtmode_h_enter updates the guest R4, corrupting the guest R4
      in the case when it gets called from kvmppc_map_vrma on the first
      VCPU_RUN ioctl.  With this, kvmppc_map_vrma instead calls
      kvmppc_virtmode_do_h_enter with the address of a dummy word as the
      place to store the HPTE index, thus avoiding corrupting the guest R4.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      7ed661bf
  2. 31 10月, 2012 1 次提交
  3. 30 10月, 2012 7 次提交
    • P
      KVM: PPC: Book3S HV: Fix thinko in try_lock_hpte() · 8b5869ad
      Paul Mackerras 提交于
      This fixes an error in the inline asm in try_lock_hpte() where we
      were erroneously using a register number as an immediate operand.
      The bug only affects an error path, and in fact the code will still
      work as long as the compiler chooses some register other than r0
      for the "bits" variable.  Nevertheless it should still be fixed.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8b5869ad
    • P
      KVM: PPC: Book3S HV: Fix accounting of stolen time · c7b67670
      Paul Mackerras 提交于
      Currently the code that accounts stolen time tends to overestimate the
      stolen time, and will sometimes report more stolen time in a DTL
      (dispatch trace log) entry than has elapsed since the last DTL entry.
      This can cause guests to underflow the user or system time measured
      for some tasks, leading to ridiculous CPU percentages and total runtimes
      being reported by top and other utilities.
      
      In addition, the current code was designed for the previous policy where
      a vcore would only run when all the vcpus in it were runnable, and so
      only counted stolen time on a per-vcore basis.  Now that a vcore can
      run while some of the vcpus in it are doing other things in the kernel
      (e.g. handling a page fault), we need to count the time when a vcpu task
      is preempted while it is not running as part of a vcore as stolen also.
      
      To do this, we bring back the BUSY_IN_HOST vcpu state and extend the
      vcpu_load/put functions to count preemption time while the vcpu is
      in that state.  Handling the transitions between the RUNNING and
      BUSY_IN_HOST states requires checking and updating two variables
      (accumulated time stolen and time last preempted), so we add a new
      spinlock, vcpu->arch.tbacct_lock.  This protects both the per-vcpu
      stolen/preempt-time variables, and the per-vcore variables while this
      vcpu is running the vcore.
      
      Finally, we now don't count time spent in userspace as stolen time.
      The task could be executing in userspace on behalf of the vcpu, or
      it could be preempted, or the vcpu could be genuinely stopped.  Since
      we have no way of dividing up the time between these cases, we don't
      count any of it as stolen.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      c7b67670
    • P
      KVM: PPC: Book3S HV: Run virtual core whenever any vcpus in it can run · 8455d79e
      Paul Mackerras 提交于
      Currently the Book3S HV code implements a policy on multi-threaded
      processors (i.e. POWER7) that requires all of the active vcpus in a
      virtual core to be ready to run before we run the virtual core.
      However, that causes problems on reset, because reset stops all vcpus
      except vcpu 0, and can also reduce throughput since all four threads
      in a virtual core have to wait whenever any one of them hits a
      hypervisor page fault.
      
      This relaxes the policy, allowing the virtual core to run as soon as
      any vcpu in it is runnable.  With this, the KVMPPC_VCPU_STOPPED state
      and the KVMPPC_VCPU_BUSY_IN_HOST state have been combined into a single
      KVMPPC_VCPU_NOTREADY state, since we no longer need to distinguish
      between them.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8455d79e
    • P
      KVM: PPC: Book3S HV: Fixes for late-joining threads · 2f12f034
      Paul Mackerras 提交于
      If a thread in a virtual core becomes runnable while other threads
      in the same virtual core are already running in the guest, it is
      possible for the latecomer to join the others on the core without
      first pulling them all out of the guest.  Currently this only happens
      rarely, when a vcpu is first started.  This fixes some bugs and
      omissions in the code in this case.
      
      First, we need to check for VPA updates for the latecomer and make
      a DTL entry for it.  Secondly, if it comes along while the master
      vcpu is doing a VPA update, we don't need to do anything since the
      master will pick it up in kvmppc_run_core.  To handle this correctly
      we introduce a new vcore state, VCORE_STARTING.  Thirdly, there is
      a race because we currently clear the hardware thread's hwthread_req
      before waiting to see it get to nap.  A latecomer thread could have
      its hwthread_req cleared before it gets to test it, and therefore
      never increment the nap_count, leading to messages about wait_for_nap
      timeouts.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      2f12f034
    • P
      KVM: PPC: Book3s HV: Don't access runnable threads list without vcore lock · 913d3ff9
      Paul Mackerras 提交于
      There were a few places where we were traversing the list of runnable
      threads in a virtual core, i.e. vc->runnable_threads, without holding
      the vcore spinlock.  This extends the places where we hold the vcore
      spinlock to cover everywhere that we traverse that list.
      
      Since we possibly need to sleep inside kvmppc_book3s_hv_page_fault,
      this moves the call of it from kvmppc_handle_exit out to
      kvmppc_vcpu_run, where we don't hold the vcore lock.
      
      In kvmppc_vcore_blocked, we don't actually need to check whether
      all vcpus are ceded and don't have any pending exceptions, since the
      caller has already done that.  The caller (kvmppc_run_vcpu) wasn't
      actually checking for pending exceptions, so we add that.
      
      The change of if to while in kvmppc_run_vcpu is to make sure that we
      never call kvmppc_remove_runnable() when the vcore state is RUNNING or
      EXITING.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      913d3ff9
    • P
      KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online · 512691d4
      Paul Mackerras 提交于
      When a Book3S HV KVM guest is running, we need the host to be in
      single-thread mode, that is, all of the cores (or at least all of
      the cores where the KVM guest could run) to be running only one
      active hardware thread.  This is because of the hardware restriction
      in POWER processors that all of the hardware threads in the core
      must be in the same logical partition.  Complying with this restriction
      is much easier if, from the host kernel's point of view, only one
      hardware thread is active.
      
      This adds two hooks in the SMP hotplug code to allow the KVM code to
      make sure that secondary threads (i.e. hardware threads other than
      thread 0) cannot come online while any KVM guest exists.  The KVM
      code still has to check that any core where it runs a guest has the
      secondary threads offline, but having done that check it can now be
      sure that they will not come online while the guest is running.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      512691d4
    • A
      PPC: ePAPR: Convert header to uapi · c99ec973
      Alexander Graf 提交于
      The new uapi framework splits kernel internal and user space exported
      bits of header files more cleanly. Adjust the ePAPR header accordingly.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      c99ec973
  4. 18 10月, 2012 2 次提交
  5. 09 10月, 2012 3 次提交
  6. 06 10月, 2012 20 次提交
  7. 04 10月, 2012 1 次提交
  8. 03 10月, 2012 3 次提交
  9. 01 10月, 2012 2 次提交
    • A
      sanitize tsk_is_polling() · 16a80163
      Al Viro 提交于
      Make default just return 0.  The current default (checking
      TIF_POLLING_NRFLAG) is taken to architectures that need it;
      ones that don't do polling in their idle threads don't need
      to defined TIF_POLLING_NRFLAG at all.
      
      ia64 defined both TS_POLLING (used by its tsk_is_polling())
      and TIF_POLLING_NRFLAG (not used at all).  Killed the latter...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      16a80163
    • A
      powerpc: switch to generic sys_execve()/kernel_execve() · be6abfa7
      Al Viro 提交于
      the only non-obvious part is that current_pt_regs() is really needed
      here - task_pt_regs() is NULL for kernel threads; it's OK for ptrace
      uses (the thing task_pt_regs() is intended for), but not for us.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      be6abfa7