1. 05 Apr, 2017 2 commits
    • s390/cpumf: simplify detection of guest samples · df26c2e8
      Martin Schwidefsky committed
      There are three different code levels in regard to the identification
      of guest samples. They differ in the way the LPP instruction is used.
      
      1) Old kernels without the LPP instruction. The guest program parameter
         is always zero.
      2) Newer kernels load the process pid into the program parameter with LPP.
         The guest program parameter is non-zero if the guest executes in a
         process != idle.
      3) The latest kernels load ((1UL << 31) | pid) with LPP to make the value
         non-zero even for the idle task. The guest program parameter is non-zero
         if the guest is running.
      
      All kernels load the process pid to CR4 on context switch. The CPU sampling
      code uses the value in CR4 to decide between guest and host samples in case
      the guest program parameter is zero. The three cases:
      
      1) CR4==pid, gpp==0
      2) CR4==pid, gpp==pid
      3) CR4==pid, gpp==((1UL << 31) | pid)
      
      The load-control instruction to load the pid into CR4 is expensive and the
      goal is to remove it. To distinguish the host CR4 from the guest pid for
      the idle process the maximum value 0xffff for the PASN is used.
      This adds a fourth case for a guest OS with an updated kernel:
      
      4) CR4==0xffff, gpp==((1UL << 31) | pid)

      The host kernel will have CR4==0xffff and will use (gpp!=0 || CR4!=0xffff)
      to identify guest samples. This works for all four cases; the only
      possible issue would be a guest with an old kernel (gpp==0) and a process
      pid of 0xffff. Well, don't do that.
      Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
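The four cases above can be checked against the host-side predicate with a minimal C sketch. The helper names `is_guest_sample` and `lpp_value` and the constant `HOST_CR4_PASN` are hypothetical and chosen only for illustration; they are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value the updated host kernel keeps in CR4 (maximum PASN), per the commit text. */
#define HOST_CR4_PASN UINT64_C(0xffff)

/* Hypothetical helper: program parameter loaded with LPP by the latest kernels. */
static inline uint64_t lpp_value(uint32_t pid)
{
	return (UINT64_C(1) << 31) | pid;
}

/* Hypothetical helper: the predicate (gpp != 0 || CR4 != 0xffff) from the commit text. */
static inline bool is_guest_sample(uint64_t gpp, uint64_t cr4)
{
	return gpp != 0 || cr4 != HOST_CR4_PASN;
}
```

With these definitions, a sample with gpp==0 and CR4==0xffff is the only combination treated as a host sample, which is why an old guest kernel running pid 0xffff would be misclassified.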
    • s390: use 64-bit lctlg to load task pid to cr4 on context switch · cab36c26
      Martin Schwidefsky committed
      The 32-bit lctl instruction is quite a bit slower than its 64-bit
      counterpart lctlg. Use the faster instruction.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  2. 22 Mar, 2017 1 commit
    • s390: add a system call for guarded storage · 916cda1a
      Martin Schwidefsky committed
      This adds a new system call to enable the use of guarded storage for
      user space processes. The system call takes two arguments, a command
      and a pointer to a guarded storage control block:
      
          s390_guarded_storage(int command, struct gs_cb *gs_cb);
      
      The second argument is relevant only for the GS_SET_BC_CB command.
      
      The commands in detail:
      
      0 - GS_ENABLE
          Enable the guarded storage facility for the current task. The
          initial content of the guarded storage control block will be
          all zeros. After enablement, user space code can use the
          load-guarded-storage-controls instruction (LGSC) to load an
          arbitrary control block. While a task is enabled, the kernel
          will save and restore the current content of the guarded
          storage registers on context switch.
      1 - GS_DISABLE
          Disable the use of the guarded storage facility for the current
          task. The kernel will cease to save and restore the content of
          the guarded storage registers; the task-specific content of
          these registers is lost.
      2 - GS_SET_BC_CB
          Set a broadcast guarded storage control block. This is called
          per thread and stores a specific guarded storage control block
          in the task struct of the current task. This control block will
          be used for the broadcast event GS_BROADCAST.
      3 - GS_CLEAR_BC_CB
          Clear the broadcast guarded storage control block. The control
          block established by GS_SET_BC_CB is removed from the task
          struct.
      4 - GS_BROADCAST
          Send a broadcast to all thread siblings of the current task.
          Every sibling that has established a broadcast guarded storage
          control block will load this control block and will be enabled
          for guarded storage. The broadcast guarded storage control block
          is used up; a second broadcast without a refresh of the stored
          control block with GS_SET_BC_CB will not have any effect.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
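As a rough illustration of the interface described above, the following C sketch lists the five commands with the numeric values given in the commit message and wraps the raw system call. The wrapper `guarded_storage` and the helper `gs_command_uses_cb` are hypothetical names; `struct gs_cb` is left opaque because its layout is not given here. The sketch assumes the syscall number is exposed as `__NR_s390_guarded_storage` on s390 and falls back to ENOSYS elsewhere.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Command values as listed in the commit message. */
enum gs_command {
	GS_ENABLE      = 0,
	GS_DISABLE     = 1,
	GS_SET_BC_CB   = 2,
	GS_CLEAR_BC_CB = 3,
	GS_BROADCAST   = 4,
};

struct gs_cb; /* opaque in this sketch; the real layout comes from the kernel headers */

/* Hypothetical helper: only GS_SET_BC_CB looks at the second argument. */
static inline bool gs_command_uses_cb(enum gs_command cmd)
{
	return cmd == GS_SET_BC_CB;
}

/* Hypothetical wrapper around the raw system call. */
static inline long guarded_storage(enum gs_command cmd, struct gs_cb *cb)
{
#ifdef __NR_s390_guarded_storage
	return syscall(__NR_s390_guarded_storage, (int)cmd, cb);
#else
	/* Not built for s390: report that the call is unavailable. */
	(void)cmd;
	(void)cb;
	errno = ENOSYS;
	return -1;
#endif
}
```

A task would typically call `guarded_storage(GS_ENABLE, NULL)` once and then load its own control block with LGSC from user space.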
  3. 01 Mar, 2017 1 commit
  4. 23 Feb, 2017 2 commits
  5. 20 Feb, 2017 1 commit
  6. 08 Feb, 2017 1 commit
    • s390: add no-execute support · 57d7f939
      Martin Schwidefsky committed
      Bit 0x100 of a page table, segment table or region table entry
      can be used to disallow code execution for the virtual addresses
      associated with the entry.
      
      There is one tricky bit: the system call to return from a signal
      is part of the signal frame written to the user stack. With a
      non-executable stack this would stop working. To avoid breaking
      things, the protection fault handler checks the opcode that caused
      the fault for 0x0a77 (sys_sigreturn) and 0x0aad (sys_rt_sigreturn)
      and injects a system call. This is preferable to the alternative
      solution with a stub function in the vdso because it works for
      vdso=off and statically linked binaries as well.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
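The opcode check described above can be sketched in C as follows. The two 16-bit values come from the commit message: 0x0a is the SVC opcode and the low byte is the system call number. The function and macro names are hypothetical and not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define SVC_SIGRETURN    0x0a77 /* svc 0x77: sys_sigreturn */
#define SVC_RT_SIGRETURN 0x0aad /* svc 0xad: sys_rt_sigreturn */

/* Hypothetical helper: extract the system call number from a 2-byte svc instruction. */
static inline unsigned int svc_number(uint16_t insn)
{
	return insn & 0xff;
}

/* Hypothetical helper: does the faulting instruction return from a signal handler? */
static inline bool is_sigreturn_svc(uint16_t insn)
{
	return insn == SVC_SIGRETURN || insn == SVC_RT_SIGRETURN;
}
```

In this model, a protection fault whose faulting instruction satisfies `is_sigreturn_svc` would be turned into the corresponding system call instead of a signal.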
  7. 31 Jan, 2017 1 commit
    • s390: store breaking event address only for program checks · 34525e1f
      Martin Schwidefsky committed
      The Principles of Operation specifies that the breaking event address
      is stored to address 0x110 in the prefix page only for program checks.
      The last branch in user space is lost as soon as a branch in kernel space
      is executed after e.g. an svc. This makes it impossible to accurately
      maintain the breaking event address for a user space process.
      
      Simplify the code: just copy the current breaking event address from
      0x110 to the task structure for program checks from user space.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  8. 12 Dec, 2016 1 commit
  9. 07 Dec, 2016 1 commit
    • s390: fix machine check panic stack switch · ce4dda3f
      Martin Schwidefsky committed
      For system damage machine checks or machine checks due to invalid PSW
      fields the system will be stopped. In order to get an oops message out
      before killing the system the machine check handler branches to
      .Lmcck_panic, switches to the panic stack and then does the usual
      machine check handling.
      
      The switch to the panic stack is incomplete: the stack pointer in %r15
      is replaced, but the pt_regs pointer in %r11 is not. The result is
      a program check which will kill the system in a slightly different way.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  10. 25 Nov, 2016 1 commit
  11. 23 Nov, 2016 1 commit
  12. 15 Nov, 2016 1 commit
  13. 11 Nov, 2016 2 commits
  14. 08 Aug, 2016 1 commit
  15. 04 Jul, 2016 1 commit
    • s390: have unique symbol for __switch_to address · 46210c44
      Heiko Carstens committed
      After linking there are several symbols for the same address that the
      __switch_to symbol points to. E.g.:
      
      000000000089b9c0 T __kprobes_text_start
      000000000089b9c0 T __lock_text_end
      000000000089b9c0 T __lock_text_start
      000000000089b9c0 T __sched_text_end
      000000000089b9c0 T __switch_to
      
      When disassembling with "objdump -d" this results in a missing
      __switch_to function. It would be named __kprobes_text_start
      instead. To unconfuse objdump, add a nop in front of the kprobes text
      section. That way __switch_to appears again.

      Obviously this solution is sort of a hack, since whether it works
      also depends on link order. However it is the best I can come up
      with for now.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  16. 28 Jun, 2016 1 commit
  17. 10 Mar, 2016 1 commit
    • s390: fix floating pointer register corruption (again) · e370e476
      Martin Schwidefsky committed
      There is a tricky interaction between the machine check handler
      and the critical sections of load_fpu_regs and save_fpu_regs
      functions. If the machine check interrupts one of the two
      functions the critical section cleanup will complete the function
      before the machine check handler s390_do_machine_check is called.
      Trouble is that the machine check handler needs to validate the
      floating point registers *before* and not *after* the completion
      of load_fpu_regs/save_fpu_regs.
      
      The simplest solution is to rewind the PSW to the start of the
      load_fpu_regs/save_fpu_regs and retry the function after the
      return from the machine check handler.
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: <stable@vger.kernel.org> # 4.3+
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
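The "rewind and retry" idea from this fix can be modeled in a few lines of C. This is only a sketch of the address arithmetic, assuming a critical section described by a half-open address range; the struct and function names are hypothetical, and the real handling lives in assembler code.

```c
#include <stdint.h>

/* Hypothetical description of a function's address range: [start, end). */
struct code_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical helper: if the interrupted PSW address lies inside the
 * function, return the function start so it is retried from the beginning;
 * otherwise leave the address unchanged.
 */
static inline uint64_t rewind_psw(uint64_t psw_addr, const struct code_range *r)
{
	if (psw_addr >= r->start && psw_addr < r->end)
		return r->start;
	return psw_addr;
}
```

Because the function is restarted rather than completed, the machine check handler sees the floating point registers as they were before load_fpu_regs/save_fpu_regs made further progress.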
  18. 02 Mar, 2016 1 commit
  19. 27 Nov, 2015 1 commit
  20. 14 Oct, 2015 5 commits
  21. 30 Sep, 2015 1 commit
  22. 17 Sep, 2015 1 commit
  23. 03 Aug, 2015 2 commits
  24. 22 Jul, 2015 5 commits
    • s390/nmi: use the normal asynchronous stack for machine checks · 2acb94f4
      Martin Schwidefsky committed
      If a machine check is received while the CPU is in the kernel, only
      the s390_do_machine_check function will be called. The call to
      s390_handle_mcck is postponed until the CPU returns to user space.
      Because of this it is safe to use the asynchronous stack for machine
      checks even if the CPU is already handling an interrupt.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kernel: squeeze a few more cycles out of the system call handler · a359bb11
      Martin Schwidefsky committed
      Reorder the instructions of UPDATE_VTIME to improve superscalar execution,
      remove duplicate checks for problem-state from the asynchronous interrupt
      handlers, and move the check for problem-state from the synchronous
      exit path to the program check path as it is only needed for program
      checks inside the kernel.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kvm: integrate HANDLE_SIE_INTERCEPT into cleanup_critical · d0fc4107
      Martin Schwidefsky committed
      Currently there are two mechanisms to deal with cleanup work due to
      interrupts. The HANDLE_SIE_INTERCEPT macro is used to undo the changes
      required to enter SIE in sie64a. If the SIE instruction causes a program
      check or an asynchronous interrupt is received, the HANDLE_SIE_INTERCEPT
      code forwards the program execution to sie_exit.
      
      All the other critical sections in entry.S are handled by the code in
      cleanup_critical that is called by the SWITCH_ASYNC macro.
      
      Move the sie64a function to the beginning of the critical section and
      add the code from HANDLE_SIE_INTERCEPT to cleanup_critical. Add a special
      case for the sie64a cleanup to the program check handler.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kvm: fix interrupt race with HANDLE_SIE_INTERCEPT · dcd2a9aa
      Martin Schwidefsky committed
      The HANDLE_SIE_INTERCEPT macro is used in the interrupt handlers
      and the program check handler to undo a few changes done by sie64a.
      Among them are guest vs host LPP, the gmap ASCE vs kernel ASCE and
      the bit that indicates that SIE is currently running on the CPU.
      
      There is a race of a voluntary SIE exit vs asynchronous interrupts.
      If the CPU completed the SIE instruction and the TM instruction of
      the LPP macro at the time it receives an interrupt, the interrupt
      handler will run while the LPP, the ASCE and the SIE bit are still
      set up for guest execution. This might result in wrong sampling data,
      but it will not cause data corruption or lockups.
      
      The critical section in sie64a needs to be enlarged to include all
      instructions that undo the changes required for guest execution.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kernel: lazy restore fpu registers · 9977e886
      Hendrik Brueckner committed
      Improve the save and restore behavior of FPU register contents to use the
      vector extension within the kernel.
      
      The kernel does not use floating-point or vector registers and, therefore,
      saving and restoring the FPU register contents are performed for handling
      signals or switching processes only.  To prepare for using vector
      instructions and vector registers within the kernel, enhance the save
      behavior and implement a lazy restore at return to user space from a
      system call or interrupt.
      
      To implement the lazy restore, save_fpu_regs() sets a CPU information
      flag, CIF_FPU, to indicate that the FPU registers must be restored.
      Saving and setting CIF_FPU is performed in an atomic fashion to be
      interrupt-safe.  When the kernel wants to use the vector extension or
      wants to change the FPU register state for a task during signal handling,
      save_fpu_regs() must be called first.  The CIF_FPU flag is also set at
      process switch.  At return to user space, the FPU state is restored.  In
      particular, the FPU state includes the floating-point or vector register
      contents, as well as vector-enablement and floating-point control.  The
      FPU state restore and clearing CIF_FPU is also performed in an atomic
      fashion.
      
      For KVM, the restore of the FPU register state is performed when restoring
      the general-purpose guest registers before the SIE instruction is started.
      Because the path towards the SIE instruction is interruptible, the CIF_FPU
      flag must be checked again right before going into SIE.  If set, the guest
      registers must be reloaded again by re-entering the outer SIE loop.  This
      is the same behavior as if the SIE critical section is interrupted.
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
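The lazy-restore protocol described in this commit can be illustrated with a small single-threaded C model. It is a sketch only: the struct, its fields and the `model_` functions are hypothetical stand-ins, a single integer represents the whole FPU state, and the atomicity requirements from the commit text are not modeled. Skipping the save when the flag is already set is an assumption made for this sketch.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU running one task. */
struct cpu_model {
	bool cif_fpu;  /* models the CIF_FPU flag */
	int hw_fpu;    /* stand-in for the FPU register contents on the CPU */
	int saved_fpu; /* stand-in for the task's FPU save area */
};

/* Models save_fpu_regs(): save once, then mark that a restore is pending. */
static inline void model_save_fpu_regs(struct cpu_model *c)
{
	if (c->cif_fpu)
		return; /* assumption: already saved, nothing to do */
	c->saved_fpu = c->hw_fpu;
	c->cif_fpu = true;
}

/* Models the return to user space: restore only if the flag is set. */
static inline void model_return_to_user(struct cpu_model *c)
{
	if (!c->cif_fpu)
		return;
	c->hw_fpu = c->saved_fpu;
	c->cif_fpu = false;
}
```

In this model the kernel may overwrite `hw_fpu` freely after calling `model_save_fpu_regs`, and the user-visible value reappears at `model_return_to_user`.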
  25. 20 Jul, 2015 1 commit
  26. 08 May, 2015 1 commit
  27. 25 Mar, 2015 2 commits