1. 11 Nov 2016 · 2 commits
  2. 08 Aug 2016 · 1 commit
  3. 04 Jul 2016 · 1 commit
    • s390: have unique symbol for __switch_to address · 46210c44
      Committed by Heiko Carstens
      After linking there are several symbols for the same address that the
      __switch_to symbol points to. E.g.:
      
      000000000089b9c0 T __kprobes_text_start
      000000000089b9c0 T __lock_text_end
      000000000089b9c0 T __lock_text_start
      000000000089b9c0 T __sched_text_end
      000000000089b9c0 T __switch_to
      
      When disassembling with "objdump -d" this results in a missing
      __switch_to function. It would be named __kprobes_text_start
      instead. To unconfuse objdump add a nop in front of the kprobes text
      section. That way __switch_to appears again.
      
      Obviously this solution is sort of a hack, since whether it works also
      depends on link order. However it is the best I can come up
      with for now.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  4. 28 Jun 2016 · 1 commit
  5. 10 Mar 2016 · 1 commit
    • s390: fix floating pointer register corruption (again) · e370e476
      Committed by Martin Schwidefsky
      There is a tricky interaction between the machine check handler
      and the critical sections of load_fpu_regs and save_fpu_regs
      functions. If the machine check interrupts one of the two
      functions the critical section cleanup will complete the function
      before the machine check handler s390_do_machine_check is called.
      Trouble is that the machine check handler needs to validate the
      floating point registers *before* and not *after* the completion
      of load_fpu_regs/save_fpu_regs.
      
      The simplest solution is to rewind the PSW to the start of the
      load_fpu_regs/save_fpu_regs and retry the function after the
      return from the machine check handler.
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: <stable@vger.kernel.org> # 4.3+
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
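
The rewind described in this commit message can be illustrated with a small C model. This is only a sketch under stated assumptions: the real logic lives in s390 assembler, and `struct crit_section` and `rewind_psw` are hypothetical names invented for the illustration, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical address range of one critical section (not a kernel type). */
struct crit_section {
    uint64_t start; /* address of the first instruction of the function */
    uint64_t end;   /* first address past the function */
};

/*
 * If the interrupted address lies inside the critical section, return the
 * section start so that the function is retried from the beginning after
 * the handler returns. Otherwise return the address unchanged.
 */
static uint64_t rewind_psw(uint64_t psw_addr, const struct crit_section *cs)
{
    assert(cs->start < cs->end);
    if (psw_addr >= cs->start && psw_addr < cs->end)
        return cs->start;
    return psw_addr;
}
```

Rewinding instead of completing the function means the handler sees the register state as it was before the interrupted function made progress, which is the ordering the message asks for.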
  6. 02 Mar 2016 · 1 commit
  7. 27 Nov 2015 · 1 commit
  8. 14 Oct 2015 · 5 commits
  9. 30 Sep 2015 · 1 commit
  10. 17 Sep 2015 · 1 commit
  11. 03 Aug 2015 · 2 commits
  12. 22 Jul 2015 · 5 commits
    • s390/nmi: use the normal asynchronous stack for machine checks · 2acb94f4
      Committed by Martin Schwidefsky
      If a machine check is received while the CPU is in the kernel, only
      the s390_do_machine_check function will be called. The call to
      s390_handle_mcck is postponed until the CPU returns to user space.
      Because of this it is safe to use the asynchronous stack for machine
      checks even if the CPU is already handling an interrupt.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kernel: squeeze a few more cycles out of the system call handler · a359bb11
      Committed by Martin Schwidefsky
      Reorder the instructions of UPDATE_VTIME to improve superscalar execution,
      remove duplicate checks for problem-state from the asynchronous interrupt
      handlers, and move the check for problem-state from the synchronous
      exit path to the program check path as it is only needed for program
      checks inside the kernel.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kvm: integrate HANDLE_SIE_INTERCEPT into cleanup_critical · d0fc4107
      Committed by Martin Schwidefsky
      Currently there are two mechanisms to deal with cleanup work due to
      interrupts. The HANDLE_SIE_INTERCEPT macro is used to undo the changes
      required to enter SIE in sie64a. If the SIE instruction causes a program
      check, or an asynchronous interrupt is received, the HANDLE_SIE_INTERCEPT
      code forwards the program execution to sie_exit.
      
      All the other critical sections in entry.S are handled by the code in
      cleanup_critical that is called by the SWITCH_ASYNC macro.
      
      Move the sie64a function to the beginning of the critical section and
      add the code from HANDLE_SIE_INTERCEPT to cleanup_critical. Add a special
      case for the sie64a cleanup to the program check handler.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kvm: fix interrupt race with HANDLE_SIE_INTERCEPT · dcd2a9aa
      Committed by Martin Schwidefsky
      The HANDLE_SIE_INTERCEPT macro is used in the interrupt handlers
      and the program check handler to undo a few changes done by sie64a.
      Among them are guest vs host LPP, the gmap ASCE vs kernel ASCE and
      the bit that indicates that SIE is currently running on the CPU.
      
      There is a race of a voluntary SIE exit vs asynchronous interrupts.
      If the CPU completed the SIE instruction and the TM instruction of
      the LPP macro at the time it receives an interrupt, the interrupt
      handler will run while the LPP, the ASCE and the SIE bit are still
      set up for guest execution. This might result in wrong sampling data,
      but it will not cause data corruption or lockups.
      
      The critical section in sie64a needs to be enlarged to include all
      instructions that undo the changes required for guest execution.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kernel: lazy restore fpu registers · 9977e886
      Committed by Hendrik Brueckner
      Improve the save and restore behavior of FPU register contents to use the
      vector extension within the kernel.
      
      The kernel does not use floating-point or vector registers and, therefore,
      saving and restoring the FPU register contents are performed for handling
      signals or switching processes only.  To prepare for using vector
      instructions and vector registers within the kernel, enhance the save
      behavior and implement a lazy restore at return to user space from a
      system call or interrupt.
      
      To implement the lazy restore, save_fpu_regs() sets a CPU information
      flag, CIF_FPU, to indicate that the FPU registers must be restored.
      Saving and setting CIF_FPU is performed in an atomic fashion to be
      interrupt-safe.  When the kernel wants to use the vector extension or
      wants to change the FPU register state for a task during signal handling,
      save_fpu_regs() must be called first.  The CIF_FPU flag is also set at
      process switch.  At return to user space, the FPU state is restored.  In
      particular, the FPU state includes the floating-point or vector register
      contents, as well as vector-enablement and floating-point control.  The
      FPU state restore and the clearing of CIF_FPU are also performed in an
      atomic fashion.
      
      For KVM, the restore of the FPU register state is performed when restoring
      the general-purpose guest registers before the SIE instruction is started.
      Because the path towards the SIE instruction is interruptible, the CIF_FPU
      flag must be checked again right before going into SIE.  If set, the guest
      registers must be reloaded again by re-entering the outer SIE loop.  This
      is the same behavior as if the SIE critical section is interrupted.
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
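
The save-then-lazily-restore flow from this commit message can be modelled in plain C. This is a single-threaded sketch, not the kernel code: the structures, the `_sketch` function names and the register count are hypothetical, and the early return when the flag is already set is an assumption of the sketch. The atomicity against interrupts that the message requires is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_FPRS 16 /* illustrative register count */

struct cpu_state {            /* hypothetical per-CPU state */
    double hw_fprs[NUM_FPRS]; /* stands in for the hardware registers */
    bool cif_fpu;             /* models the CIF_FPU flag */
};

struct task_fpu {             /* hypothetical per-task save area */
    double fprs[NUM_FPRS];
};

/* Save the live registers once and mark them as needing a restore. */
static void save_fpu_regs_sketch(struct cpu_state *cpu, struct task_fpu *save)
{
    if (cpu->cif_fpu)
        return; /* already saved; the registers may now hold kernel data */
    memcpy(save->fprs, cpu->hw_fprs, sizeof(save->fprs));
    cpu->cif_fpu = true;
}

/* On return to user space: reload only if the flag is set. */
static bool return_to_user_sketch(struct cpu_state *cpu,
                                  const struct task_fpu *save)
{
    if (!cpu->cif_fpu)
        return false; /* registers untouched, nothing to restore */
    memcpy(cpu->hw_fprs, save->fprs, sizeof(save->fprs));
    cpu->cif_fpu = false;
    return true;
}
```

The point of the flag is that a system call or interrupt that never touches the registers pays neither for a save nor for a restore.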
  13. 20 Jul 2015 · 1 commit
  14. 08 May 2015 · 1 commit
  15. 25 Mar 2015 · 2 commits
  16. 08 Dec 2014 · 1 commit
  17. 25 Sep 2014 · 1 commit
  18. 20 May 2014 · 2 commits
    • s390: split TIF bits into CIF, PIF and TIF bits · d3a73acb
      Committed by Martin Schwidefsky
      The oi and ni instructions used in entry[64].S to set and clear bits
      in the thread-flags are not guaranteed to be atomic in regard to other
      CPUs. Split the TIF bits into CPU, pt_regs and thread-info specific
      bits. Updates on the TIF bits are done with atomic instructions,
      updates on CPU and pt_regs bits are done with non-atomic instructions.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
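
The distinction this commit draws, atomic updates for flags that other CPUs may touch and plain updates for CPU-local flags, might look as follows in portable C11. This is a hedged sketch: the structures, helper names and bit values are hypothetical, and the kernel uses its own bit operations rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>

#define TIF_EXAMPLE_BIT 0x01UL /* hypothetical thread-info flag */
#define CIF_EXAMPLE_BIT 0x02UL /* hypothetical CPU flag */

struct thread_flags { atomic_ulong flags; };  /* other CPUs may update these */
struct cpu_flags    { unsigned long flags; }; /* touched only by the owning CPU */

/* Thread-info bits: read-modify-write must be atomic across CPUs. */
static void set_tif(struct thread_flags *t, unsigned long bit)
{
    atomic_fetch_or(&t->flags, bit);
}

static void clear_tif(struct thread_flags *t, unsigned long bit)
{
    atomic_fetch_and(&t->flags, ~bit);
}

static int test_tif(struct thread_flags *t, unsigned long bit)
{
    return (atomic_load(&t->flags) & bit) != 0;
}

/* CPU bits: a plain OR/AND is sufficient because no other CPU writes them. */
static void set_cif(struct cpu_flags *c, unsigned long bit)
{
    c->flags |= bit;
}

static void clear_cif(struct cpu_flags *c, unsigned long bit)
{
    c->flags &= ~bit;
}
```

Keeping the two kinds of flags in separate words is what makes the cheap non-atomic update safe: a plain store to the CPU word can never overwrite a bit that another CPU set concurrently.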
    • s390/uaccess: simplify control register updates · beef560b
      Committed by Martin Schwidefsky
      Always switch to the kernel ASCE in switch_mm. Load the secondary
      space ASCE in finish_arch_post_lock_switch after checking that
      any pending page table operations have completed. The primary
      ASCE is loaded in entry[64].S. With this the update_primary_asce
      call can be removed from the switch_to macro and from the start
      of switch_mm function. Remove the load_primary argument from
      update_user_asce/clear_user_asce, rename update_user_asce to
      set_user_asce and rename update_primary_asce to load_kernel_asce.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  19. 22 Apr 2014 · 1 commit
  20. 03 Apr 2014 · 1 commit
    • s390/uaccess: rework uaccess code - fix locking issues · 457f2180
      Committed by Heiko Carstens
      The current uaccess code uses a page table walk in some circumstances,
      e.g. for in-atomic futex operations or when running on old
      hardware which doesn't support the mvcos instruction.
      
      However it turned out that the page table walk code does not correctly
      lock page tables when accessing page table entries.
      In other words: a different cpu may invalidate a page table entry while
      the current cpu inspects the pte. This may lead to random data corruption.
      
      Adding correct locking, however, isn't trivial for all uaccess operations.
      copy_in_user() is especially problematic, since it requires holding at
      least two locks and must be protected against an ABBA deadlock when a
      different cpu also performs a copy_in_user() operation.
      
      So the solution is a different approach where we change address spaces:
      
      User space runs in primary address mode, or access register mode within
      vdso code, like it currently already does.
      
      The kernel usually runs in home space mode; however, when accessing
      user space the kernel switches to primary or secondary address mode if
      the mvcos instruction is not available or if a compare-and-swap (futex)
      instruction on a user space address is performed.
      KVM, however, is special, since it requires the kernel to run in home
      address space while implicitly accessing user space with the sie
      instruction.
      
      So we end up with:
      
      User space:
      - runs in primary or access register mode
      - cr1 contains the user asce
      - cr7 contains the user asce
      - cr13 contains the kernel asce
      
      Kernel space:
      - runs in home space mode
      - cr1 contains the user or kernel asce
        -> the kernel asce is loaded when a uaccess requires primary or
           secondary address mode
      - cr7 contains the user or kernel asce (changed with set_fs())
      - cr13 contains the kernel asce
      
      In case of uaccess the kernel changes to:
      - primary space mode in case of a uaccess (copy_to_user) and uses
        e.g. the mvcp instruction to access user space. However the kernel
        will stay in home space mode if the mvcos instruction is available
      - secondary space mode in case of futex atomic operations, so that the
        instructions come from primary address space and data from secondary
        space
      
      In case of kvm the kernel runs in home space mode, but cr1 gets switched
      to contain the gmap asce before the sie instruction gets executed. When
      the sie instruction is finished cr1 will be switched back to contain the
      user asce.
      
      A context switch between two processes will always load the kernel asce
      for the next process in cr1. So the first exit to user space is a bit
      more expensive (one extra load control register instruction) than before,
      but this keeps the code rather simple.
      
      In sum this means there is no need to perform any error prone page table
      walks anymore when accessing user space.
      
      The patch seems to be rather large, however it mainly removes the
      page table walk code and restores the previously deleted "standard"
      uaccess code, with a couple of changes.
      
      The uaccess without mvcos mode can be enforced with the "uaccess_primary"
      kernel parameter.
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
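
The address-space choice this commit message describes for a uaccess can be summarized as a small decision function. The sketch below only restates the rules from the message; the enum and function names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the address space modes named in the message. */
enum addr_mode { MODE_HOME, MODE_PRIMARY, MODE_SECONDARY };

/* Hypothetical classification of the user space access being performed. */
enum uaccess_op { UACCESS_COPY, UACCESS_FUTEX_ATOMIC };

/*
 * Pick the mode the kernel runs in while performing the access:
 * - futex atomic operations use secondary space mode, so instructions
 *   come from the primary space and data from the secondary space;
 * - plain copies stay in home space mode when mvcos is available,
 *   otherwise they switch to primary space mode (e.g. for mvcp).
 */
static enum addr_mode uaccess_mode(enum uaccess_op op, bool have_mvcos)
{
    if (op == UACCESS_FUTEX_ATOMIC)
        return MODE_SECONDARY;
    return have_mvcos ? MODE_HOME : MODE_PRIMARY;
}
```

Passing `have_mvcos = false` corresponds to what the message calls enforcing uaccess without mvcos via the "uaccess_primary" kernel parameter.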
  21. 21 Feb 2014 · 1 commit
    • s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries · 53e857f3
      Committed by Martin Schwidefsky
      Git commit 050eef36 "[S390] fix tlb flushing vs. concurrent
      /proc accesses" introduced the attach counter to avoid using the
      mm_users value to decide between IPTE for every PTE and lazy TLB
      flushing with IDTE. That fixed the problem with mm_users but it
      introduced another subtle race, fortunately one that is very hard
      to hit.
      The background is the requirement of the architecture that a valid
      PTE may not be changed while it can be used concurrently by another
      cpu. The decision between IPTE and lazy TLB flushing needs to be
      done while the PTE is still valid. Now if the virtual cpu is
      temporarily stopped after the decision to use lazy TLB flushing but
      before the invalid bit of the PTE has been set, another cpu can attach
      the mm, find that flush_mm is set, do the IDTE, return to userspace,
      and recreate a TLB that uses the PTE in question. When the first,
      stopped cpu continues it will change the PTE while it is attached on
      another cpu. The first cpu will do another IDTE shortly after the
      modification of the PTE which makes the race window quite short.
      
      To fix this race the CPU that wants to attach the address space of a
      user space thread needs to wait for the end of the PTE modification.
      The number of concurrent TLB flushers for an mm is tracked in the
      upper 16 bits of the attach_count and finish_arch_post_lock_switch
      is used to wait for the end of the flush operation if required.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
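
Packing the number of concurrent TLB flushers into the upper 16 bits of a counter, as the message describes for attach_count, can be sketched with C11 atomics. Everything here is illustrative: `struct mm_sketch` and the helper names are hypothetical, the lower half is assumed to count attached CPUs, and the busy-wait stands in for whatever relaxation primitive the kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FLUSH_SHIFT 16
#define FLUSH_ONE   (1U << FLUSH_SHIFT) /* one flusher in the upper half */
#define ATTACH_MASK (FLUSH_ONE - 1)     /* lower half: attached CPUs (assumed) */

struct mm_sketch { atomic_uint attach_count; }; /* hypothetical stand-in */

/* A CPU announces that it is about to modify PTEs with lazy flushing. */
static void flush_begin(struct mm_sketch *mm)
{
    atomic_fetch_add(&mm->attach_count, FLUSH_ONE);
}

/* The PTE modification and the following flush are complete. */
static void flush_end(struct mm_sketch *mm)
{
    unsigned int old = atomic_fetch_sub(&mm->attach_count, FLUSH_ONE);
    assert((old >> FLUSH_SHIFT) != 0); /* must pair with flush_begin() */
}

static bool flush_in_progress(struct mm_sketch *mm)
{
    return (atomic_load(&mm->attach_count) >> FLUSH_SHIFT) != 0;
}

static unsigned int attached_cpus(struct mm_sketch *mm)
{
    return atomic_load(&mm->attach_count) & ATTACH_MASK;
}

/* What an attaching CPU would do before using the address space. */
static void wait_for_flushers(struct mm_sketch *mm)
{
    while (flush_in_progress(mm))
        ; /* busy-wait; a real implementation would relax the CPU here */
}
```

Sharing one word between both counts lets a single atomic read answer "is anyone flushing?" without a separate lock.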
  22. 16 Jan 2014 · 1 commit
  23. 16 Dec 2013 · 1 commit
  24. 30 Sep 2013 · 1 commit
  25. 28 Aug 2013 · 1 commit
  26. 22 Aug 2013 · 1 commit
    • s390: convert interrupt handling to use generic hardirq · 1f44a225
      Committed by Martin Schwidefsky
      With the introduction of PCI it became apparent that s390 should
      convert to generic hardirqs as too many drivers do not have the
      correct dependency for GENERIC_HARDIRQS. On the architecture
      level s390 does not have irq lines. It has external interrupts,
      I/O interrupts and adapter interrupts. This patch hard-codes all
      external interrupts as irq #1, all I/O interrupts as irq #2 and
      all adapter interrupts as irq #3. The additional information from
      the lowcore associated with the interrupt is stored in the
      pt_regs of the interrupt frame, where the interrupt handler can
      pick it up. For PCI/MSI interrupts the adapter interrupt handler
      scans the relevant bit fields and calls generic_handle_irq with
      the virtual irq number for the MSI interrupt.
      Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
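
The fixed numbering from this commit message (external interrupts as irq #1, I/O interrupts as irq #2, adapter interrupts as irq #3) amounts to a mapping from interrupt class to irq number. The following sketch uses hypothetical enum and function names; only the three numbers come from the message.

```c
#include <assert.h>

/* Hypothetical names for the three architectural interrupt classes. */
enum s390_irq_class {
    IRQ_CLASS_EXTERNAL,
    IRQ_CLASS_IO,
    IRQ_CLASS_ADAPTER
};

/* Map an interrupt class to the hard-coded irq number from the message. */
static unsigned int irq_for_class(enum s390_irq_class cls)
{
    switch (cls) {
    case IRQ_CLASS_EXTERNAL:
        return 1;
    case IRQ_CLASS_IO:
        return 2;
    case IRQ_CLASS_ADAPTER:
        return 3;
    }
    return 0; /* not reached for valid classes */
}
```

Because the irq number no longer identifies the source, the handler has to take the details from the pt_regs of the interrupt frame, as the message explains.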
  27. 27 Jun 2013 · 1 commit
  28. 17 Jun 2013 · 1 commit