1. 14 Oct 2015, 3 commits
  2. 30 Sep 2015, 1 commit
  3. 03 Aug 2015, 1 commit
  4. 22 Jul 2015, 1 commit
    • s390/kernel: lazy restore fpu registers · 9977e886
      Committed by Hendrik Brueckner
      Improve the save and restore behavior of the FPU register contents in
      preparation for using the vector extension within the kernel.
      
      The kernel does not use floating-point or vector registers and, therefore,
      saving and restoring the FPU register contents are performed for handling
      signals or switching processes only.  To prepare for using vector
      instructions and vector registers within the kernel, enhance the save
      behavior and implement a lazy restore at return to user space from a
      system call or interrupt.
      
      To implement the lazy restore, save_fpu_regs() sets a CPU information
      flag, CIF_FPU, to indicate that the FPU registers must be restored.
      Saving the registers and setting CIF_FPU are performed atomically to
      be interrupt-safe.  When the kernel wants to use the vector extension
      or wants to change the FPU register state of a task during signal
      handling, save_fpu_regs() must be called first.  The CIF_FPU flag is
      also set at process switch.  On return to user space, the FPU state
      is restored.  In particular, the FPU state includes the
      floating-point or vector register contents, as well as the
      vector-enablement and floating-point controls.  Restoring the FPU
      state and clearing CIF_FPU are likewise performed atomically.
      
      For KVM, the FPU register state is restored when the general-purpose
      guest registers are restored, before the SIE instruction is started.
      Because the path towards the SIE instruction is interruptible, the
      CIF_FPU flag must be checked again right before entering SIE.  If it
      is set, the guest registers must be reloaded by re-entering the outer
      SIE loop.  This is the same behavior as when the SIE critical section
      is interrupted.
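      A minimal C sketch of this flow, assuming a simplified flags word and
      hypothetical hw_store_fpu_regs()/hw_load_fpu_regs() helpers (the real
      kernel keeps CIF_FPU in per-cpu state and uses dedicated flag
      accessors):

          /* Hypothetical low-level helpers and a simplified CPU flag. */
          struct fpu_state;
          void hw_store_fpu_regs(struct fpu_state *fpu);  /* FPRs/VRs + fpc */
          void hw_load_fpu_regs(struct fpu_state *fpu);
          static unsigned long cpu_info_flags;
          #define CIF_FPU 0x01UL

          /* Save the FPU registers of the current task and mark them as
           * needing a restore; runs with interrupts disabled so that
           * saving and flag setting are atomic. */
          void save_fpu_regs(struct fpu_state *fpu)
          {
                  unsigned long flags;

                  local_irq_save(flags);
                  if (!(cpu_info_flags & CIF_FPU)) {
                          hw_store_fpu_regs(fpu);
                          cpu_info_flags |= CIF_FPU;
                  }
                  local_irq_restore(flags);
          }

          /* Run on return to user space, and re-checked right before SIE. */
          void restore_fpu_regs(struct fpu_state *fpu)
          {
                  unsigned long flags;

                  local_irq_save(flags);
                  if (cpu_info_flags & CIF_FPU) {
                          hw_load_fpu_regs(fpu);
                          cpu_info_flags &= ~CIF_FPU;
                  }
                  local_irq_restore(flags);
          }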
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  5. 20 Jul 2015, 1 commit
  6. 13 Apr 2015, 1 commit
  7. 25 Mar 2015, 1 commit
    • s390: remove 31 bit support · 5a79859a
      Committed by Heiko Carstens
      Remove the 31 bit support in order to reduce maintenance cost and
      effectively remove dead code. For a couple of years there has been no
      distribution left that comes with a 31 bit kernel.
      
      The 31 bit kernel had also been broken for more than a year before
      anybody noticed. In addition I added a removal warning to the kernel,
      shown at ipl for 5 minutes: a960062e ("s390: add 31 bit warning
      message"), which let everybody know about the plan to remove the
      31 bit code. We didn't get any response.
      
      Given that the last 31 bit only machine was introduced in 1999, let's
      remove the code.
      Anybody with 31 bit user space code can still use the compat mode.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  8. 06 Mar 2015, 1 commit
  9. 03 Nov 2014, 1 commit
    • s390/cmpxchg: use compiler builtins · f318a122
      Committed by Martin Schwidefsky
      The kernel build for s390 fails with gcc compilers of version 3.x, so
      set the minimum required version of gcc to version 4.3.
      
      As the atomic builtins are available with all gcc 4.x compilers, use
      the __sync_val_compare_and_swap and __sync_bool_compare_and_swap
      builtins to replace the complex macro and inline assembler magic in
      include/asm/cmpxchg.h. The compiler can just do it and generates
      better code with the builtins.
      
      While we are at it, use __sync_bool_compare_and_swap for the
      _raw_compare_and_swap function in the spinlock code as well.
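      A minimal sketch of cmpxchg on top of the builtins (the real kernel
      versions handle 1, 2, 4 and 8 byte operands; the function names here
      are illustrative):

          #include <stdbool.h>

          /* Returns the value found at ptr; the caller compares it with
           * old to see whether the swap actually happened. */
          static inline unsigned long cmpxchg_ulong(unsigned long *ptr,
                                                    unsigned long old,
                                                    unsigned long new)
          {
                  return __sync_val_compare_and_swap(ptr, old, new);
          }

          /* For the spinlock path only success/failure matters, so the
           * bool variant lets the compiler branch on the condition code. */
          static inline bool raw_compare_and_swap_sketch(int *lock,
                                                         int old, int new)
          {
                  return __sync_bool_compare_and_swap(lock, old, new);
          }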
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  10. 27 Oct 2014, 1 commit
    • s390/ftrace,kprobes: allow to patch first instruction · c933146a
      Committed by Heiko Carstens
      If the function tracer is enabled, allow setting kprobes on the first
      instruction of a function (which is the function trace caller):
      
      If no kprobe is set, enabling or disabling function tracing for a
      function simply patches the first instruction: either it is a nop
      (right now an unconditional branch which skips the mcount block), or
      it is a branch to the ftrace_caller() function.
      
      If a kprobe is placed on a function tracer calling instruction, we
      encode whether we actually have a nop or a branch in the remaining
      bytes after the breakpoint instruction (illegal opcode).
      This is possible since the instruction used for the nop and the
      branch is six bytes long, while the breakpoint instruction is only
      two bytes.
      Therefore the first two bytes contain the illegal opcode and the last
      four bytes contain either "0" for nop or "1" for branch. The kprobes
      code will then execute/simulate the correct instruction.
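      A sketch of the encoding (the breakpoint opcode value and the helper
      names are assumptions for illustration):

          #include <stdint.h>
          #include <string.h>

          #define BP_OPCODE 0x0002u  /* assumed kprobes breakpoint opcode */

          /* Overwrite a 6-byte ftrace nop/branch site with the 2-byte
           * breakpoint, keeping the original meaning in the last 4 bytes. */
          static void arm_kprobe_on_ftrace(uint8_t insn[6], int was_branch)
          {
                  uint16_t bp = BP_OPCODE;
                  uint32_t tag = was_branch ? 1 : 0;  /* 0: nop, 1: branch */

                  memcpy(insn, &bp, sizeof(bp));
                  memcpy(insn + 2, &tag, sizeof(tag));
          }

          /* The program check handler recovers the tag and then executes
           * or simulates the corresponding instruction. */
          static int kprobe_site_was_branch(const uint8_t insn[6])
          {
                  uint32_t tag;

                  memcpy(&tag, insn + 2, sizeof(tag));
                  return tag == 1;
          }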
      
      Instruction patching for kprobes and function tracer is always done
      with stop_machine(). Therefore we don't have any races where an
      instruction is patched concurrently on a different cpu.
      Besides that, the program check handler which executes the function
      trace caller instruction won't be executed concurrently with any
      stop_machine() execution either.
      
      This allows keeping the full fault-based kprobes handling, which
      generates correct pt_regs contents automatically.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  11. 09 Oct 2014, 1 commit
  12. 09 Sep 2014, 1 commit
  13. 28 May 2014, 1 commit
  14. 20 May 2014, 1 commit
  15. 22 Apr 2014, 3 commits
  16. 03 Apr 2014, 1 commit
    • s390/uaccess: rework uaccess code - fix locking issues · 457f2180
      Committed by Heiko Carstens
      The current uaccess code uses a page table walk in some circumstances,
      e.g. for the in-atomic futex operations or if running on old hardware
      which doesn't support the mvcos instruction.
      
      However it turned out that the page table walk code does not correctly
      lock page tables when accessing page table entries.
      In other words: a different cpu may invalidate a page table entry while
      the current cpu inspects the pte. This may lead to random data corruption.
      
      Adding correct locking however isn't trivial for all uaccess
      operations. Especially copy_in_user() is problematic, since it
      requires holding at least two locks, but must be protected against
      ABBA deadlock when a different cpu also performs a copy_in_user()
      operation.
      
      So the solution is a different approach where we change address spaces:
      
      User space runs in primary address mode, or in access register mode
      within the vdso code, as it already does.
      
      The kernel usually runs in home space mode; however, when accessing
      user space the kernel switches to primary or secondary address mode
      if the mvcos instruction is not available or if a compare-and-swap
      (futex) instruction on a user space address is performed.
      KVM however is special, since it requires the kernel to run in home
      address space while implicitly accessing user space with the sie
      instruction.
      
      So we end up with:
      
      User space:
      - runs in primary or access register mode
      - cr1 contains the user asce
      - cr7 contains the user asce
      - cr13 contains the kernel asce
      
      Kernel space:
      - runs in home space mode
      - cr1 contains the user or kernel asce
        -> the kernel asce is loaded when a uaccess requires primary or
           secondary address mode
      - cr7 contains the user or kernel asce (changed with set_fs())
      - cr13 contains the kernel asce
      
      In case of uaccess the kernel changes to:
      - primary space mode in case of a uaccess (copy_to_user), using
        e.g. the mvcp instruction to access user space; however, the kernel
        will stay in home space mode if the mvcos instruction is available
      - secondary space mode in case of futex atomic operations, so that
        the instructions come from the primary address space and the data
        from the secondary space
      
      In case of kvm the kernel runs in home space mode, but cr1 gets
      switched to contain the gmap asce before the sie instruction gets
      executed. When the sie instruction is finished, cr1 will be switched
      back to contain the user asce.
      
      A context switch between two processes will always load the kernel
      asce for the next process in cr1. So the first exit to user space is
      a bit more expensive (one extra load control register instruction)
      than before, but this keeps the code rather simple.
      
      In sum this means there is no need to perform any error-prone page
      table walks anymore when accessing user space.
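      As a hedged sketch, the resulting copy_to_user() dispatch looks
      roughly like this (all helper names are illustrative stand-ins for
      the real mvcos/mvcp/mvcs wrappers and the address mode switching):

          /* Hypothetical helpers standing in for the real primitives. */
          int has_mvcos(void);
          unsigned long mvcos_copy(void *to, const void *from,
                                   unsigned long n);
          unsigned long mvc_user_copy(void *to, const void *from,
                                      unsigned long n); /* mvcp/mvcs based */
          void switch_to_primary_mode(void);  /* loads kernel asce in cr1 */
          void switch_to_home_mode(void);

          unsigned long raw_copy_to_user_sketch(void *to, const void *from,
                                                unsigned long n)
          {
                  /* mvcos copies across address spaces from home space
                   * mode: no mode switch, no page table walk. */
                  if (has_mvcos())
                          return mvcos_copy(to, from, n);

                  /* Fallback: switch the address mode and let the hardware
                   * resolve user addresses via the user asce in cr7. */
                  switch_to_primary_mode();
                  n = mvc_user_copy(to, from, n);
                  switch_to_home_mode();
                  return n;
          }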
      
      The patch seems to be rather large, however it mainly removes the
      page table walk code and restores the previously deleted "standard"
      uaccess code, with a couple of changes.
      
      The uaccess mode without mvcos can be enforced with the
      "uaccess_primary" kernel parameter.
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  17. 02 Dec 2013, 1 commit
  18. 25 Nov 2013, 1 commit
  19. 27 Jun 2013, 1 commit
  20. 21 May 2013, 2 commits
  21. 26 Apr 2013, 1 commit
    • s390: system call path micro optimization · 61649881
      Committed by Martin Schwidefsky
      Add a pointer to the system call table to the thread_info structure.
      The TIF_31BIT bit is set or cleared by SET_PERSONALITY exactly once
      for the lifetime of a process. With the pointer to the correct system
      call table in thread_info, the system call path in entry64.S can drop
      the check for TIF_31BIT, which saves a couple of instructions.
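      Sketched in C (the real check lives in the entry64.S assembler path;
      the table and structure names here are illustrative):

          /* Both tables exist; SET_PERSONALITY picks one exactly once. */
          extern const unsigned long sys_call_table_64[];
          extern const unsigned long sys_call_table_31[];

          struct thread_info_sketch {
                  unsigned long flags;                  /* TIF_* bits */
                  const unsigned long *sys_call_table;  /* picked table */
          };

          void set_personality_sketch(struct thread_info_sketch *ti,
                                      int is_31bit)
          {
                  ti->sys_call_table = is_31bit ? sys_call_table_31
                                                : sys_call_table_64;
          }

          /* Hot path: one indexed load, no TIF_31BIT test. */
          typedef long (*syscall_fn)(long, long, long, long, long, long);

          long dispatch_syscall(struct thread_info_sketch *ti, unsigned nr,
                                const long a[6])
          {
                  syscall_fn fn = (syscall_fn)ti->sys_call_table[nr];

                  return fn(a[0], a[1], a[2], a[3], a[4], a[5]);
          }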
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  22. 26 Sep 2012, 1 commit
  23. 20 Jul 2012, 1 commit
  24. 14 Jun 2012, 1 commit
    • s390/smp: make absolute lowcore / cpu restart parameter accesses more robust · fbe76568
      Committed by Heiko Carstens
      Setting the cpu restart parameters is done in three different ways:
      - directly setting the four parameters individually
      - copying the four parameters with memcpy (using 4 * sizeof(long))
      - copying the four parameters using a private structure
      
      In addition code in entry*.S relies on a certain order of the restart
      members of struct _lowcore.
      
      Make all of this more robust against future changes by adding a
      mem_absolute_assign(dest, val) define, which assigns val to dest
      using the absolute addressing mode. Also the load multiple
      instructions in entry*.S have been split into separate load
      instructions so the order of the struct _lowcore members doesn't
      matter anymore.
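      Following the description above, the define could look roughly like
      this (the macro body is a guess; memcpy_absolute() is the helper the
      commit refers to, and the lowcore field name is illustrative):

          #include <stddef.h>

          /* Prototype as moved to processor.h by this patch. */
          void *memcpy_absolute(void *dest, void *src, size_t count);

          /* Assign val to dest using absolute addressing, i.e. ignoring
           * the prefix register of the executing cpu. */
          #define mem_absolute_assign(dest, val) do {                  \
                  __typeof__(dest) __tmp = (val);                      \
                  memcpy_absolute(&(dest), &__tmp, sizeof(__tmp));     \
          } while (0)

          /* Usage: set one cpu restart parameter in the absolute lowcore. */
          void set_restart_fn_sketch(void (*fn)(void))
          {
                  mem_absolute_assign(S390_lowcore.restart_fn,
                                      (unsigned long)fn);
          }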
      
      In addition move the prototypes of memcpy_real/absolute from
      uaccess.h to processor.h. These memcpy* variants are not related to
      uaccess at all; string.h doesn't seem to match well either, so let's
      use processor.h.
      
      Also replace the eight byte array in struct _lowcore which represents
      a misaligned u64 with a u64. The compiler will always create code
      that handles the misaligned u64 correctly.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  25. 29 Mar 2012, 1 commit
  26. 11 Mar 2012, 3 commits
    • [S390] rework idle code · 4c1051e3
      Committed by Martin Schwidefsky
      Whenever the cpu loads an enabled wait PSW it will appear as idle to
      the underlying host system. The code in default_idle calls
      vtime_stop_cpu, which does the necessary voodoo to get the cpu time
      accounting right; the udelay code just loads an enabled wait PSW. To
      correct this, rework the vtime_stop_cpu/vtime_start_cpu logic and
      move the difficult parts to entry[64].S: vtime_stop_cpu can now be
      called from anywhere and vtime_start_cpu is gone. The correction of
      the cpu time during wakeup from an enabled wait PSW is done with a
      critical section in entry[64].S. As vtime_start_cpu is gone,
      s390_idle_check can be removed as well.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] rework smp code · 8b646bd7
      Committed by Martin Schwidefsky
      Define struct pcpu and merge some of the NR_CPUS arrays into it,
      including __cpu_logical_map, current_set and smp_cpu_state. Split the
      smp related functions into those operating on physical cpus and those
      operating on a logical cpu number. Make the functions for physical
      cpus use a pointer to a struct pcpu. This hides the knowledge about
      cpu addresses in smp.c, entry[64].S and swsusp_asm64.S, and thus the
      sigp.h header can be removed.
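      The shape of the change, as a hedged sketch (field and helper names
      are illustrative except those named above):

          struct _lowcore;

          /* Per-cpu data; replaces the __cpu_logical_map[], current_set[]
           * and smp_cpu_state[] arrays. */
          struct pcpu {
                  struct _lowcore *lowcore;  /* lowcore of the cpu */
                  int state;                 /* was smp_cpu_state[] */
                  unsigned short address;    /* was __cpu_logical_map[] */
          };

          static struct pcpu pcpu_devices[64 /* NR_CPUS */];

          /* Hypothetical sigp wrapper: physical-cpu operations take a
           * struct pcpu, so the cpu address never leaks out of smp.c. */
          int raw_sigp(unsigned short addr, int order);

          static int pcpu_sigp(struct pcpu *pcpu, int order)
          {
                  return raw_sigp(pcpu->address, order);
          }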
      
      The PSW restart mechanism is used to start secondary cpus, to call a
      function on an online cpu, to call a function on the ipl cpu, and for
      the nmi signal. Replace the different assembler functions with a
      single function, restart_int_handler. The new entry point calls a
      function whose pointer is stored in the lowcore of the target cpu,
      and it can wait for the source cpu to stop. This covers all existing
      use cases.
      
      Overall the code is now simpler and there are ~380 fewer lines of code.
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] rename lowcore field · 7e180bd8
      Committed by Martin Schwidefsky
      The 16 bit value at the lowcore location with offset 0x84 is the
      cpu address that is associated with an external interrupt. Rename
      the field from cpu_addr to ext_cpu_addr to make that clear.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  27. 27 Dec 2011, 3 commits
    • [S390] cleanup trap handling · aa33c8cb
      Committed by Martin Schwidefsky
      Move the program interruption code and the translation exception
      identifier to the pt_regs structure as 'int_code' and 'int_parm_long'
      and make the first level interrupt handler in entry[64].S store the
      two values. That makes it possible to drop 'prot_addr' and 'trap_no'
      from the thread_struct and to reduce the number of arguments to a lot
      of functions. Finally, un-inline do_trap. Overall this saves 5812
      bytes in the .text section of the 64 bit kernel.
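      The pt_regs additions, sketched (surrounding fields elided):

          struct pt_regs_sketch {
                  /* ... psw and gprs ... */
                  unsigned int int_code;        /* program interruption code */
                  unsigned long int_parm_long;  /* translation exception id */
          };

          /* do_trap is un-inlined and reads both values from pt_regs, so
           * most handlers now need only the regs pointer as an argument. */
          void do_trap(struct pt_regs_sketch *regs);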
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] entry[64].S improvements · c5328901
      Committed by Martin Schwidefsky
      Another round of cleanup for entry[64].S; in particular the program
      check handler looks more reasonable now. The code size of the 31 bit
      kernel has been reduced by 616 bytes, and by 528 bytes for the 64 bit
      version. Even better, the code is a bit faster as well.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] kvm: move cmf host id constant out of lowcore · ddd6f953
      Committed by Martin Schwidefsky
      There is no reason for the cpu-measurement-facility host id constant
      to reside in the lowcore, where space is precious. Use an entry in
      the literal pool in HANDLE_SIE_INTERCEPT and a stack slot in sie64a.
      While we are at it, replace the id -1 with 0 to indicate host
      execution.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  28. 30 Oct 2011, 2 commits
    • [S390] signal race with restarting system calls · 20b40a79
      Committed by Martin Schwidefsky
      For an ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system
      call, do_signal will prepare the restart of the system call with a
      rewind of the PSW before calling get_signal_to_deliver (where the
      debugger might take control). For an ERESTART_RESTARTBLOCK restarting
      system call, do_signal will set -EINTR as the return code.
      There are two issues with this approach:
      1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
         ERESTART_RESTARTBLOCK, as the rewinding has already taken place or
         the return code has been changed to -EINTR
      2) if get_signal_to_deliver does not return with a signal to deliver,
         the restart via the repeat of the svc instruction is left in
         place. This opens a race if another signal is made pending before
         the system call instruction can be re-executed. The original
         system call will be restarted even if the second signal would have
         ended the system call with -EINTR.
      
      These two issues can be solved by dropping the early rewind of the
      system call before get_signal_to_deliver has been called, and by
      using the TIF_RESTART_SVC magic to do the restart if no signal has to
      be delivered. The only situation where a system call restart via the
      repeat of the svc instruction is appropriate is when an SA_RESTART
      signal is delivered to user space.
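      A hedged sketch of the reworked do_signal() logic (restartable(),
      rewind_psw_to_svc(), the handle_signal() signature and the use of
      gpr 2 as the return value register are illustrative;
      ERESTART_RESTARTBLOCK handling is omitted):

          static void do_signal_sketch(struct pt_regs *regs)
          {
                  siginfo_t info;
                  struct k_sigaction ka;
                  long rc = regs->gprs[2];  /* system call return value */
                  int signr = get_signal_to_deliver(&info, &ka, regs, NULL);

                  if (signr > 0) {
                          if (restartable(rc)) {
                                  if (ka.sa.sa_flags & SA_RESTART)
                                          /* rewind only now, for delivery
                                           * to an SA_RESTART handler */
                                          rewind_psw_to_svc(regs);
                                  else
                                          regs->gprs[2] = -EINTR;
                          }
                          handle_signal(signr, &ka, &info, regs);
                          return;
                  }
                  /* No signal delivered: restart via TIF_RESTART_SVC
                   * instead of an early PSW rewind, which closes the race
                   * with a second signal made pending in the meantime. */
                  if (restartable(rc))
                          set_thread_flag(TIF_RESTART_SVC);
          }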
      
      Unfortunately this breaks inferior calls by the debugger again: the
      system call number and the length of the system call instruction are
      lost over the inferior call, and user space will see ERESTARTNOHAND/
      ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this, a
      new ptrace interface is added to save/restore the system call number
      and system call instruction length.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] lowcore cleanup · 0edc8faa
      Committed by Martin Schwidefsky
      Remove the save_area_64 field from the 0xe00 - 0xf00 area in the lowcore.
      Use a free slot in the save_area array instead.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  29. 20 Sep 2011, 1 commit
  30. 03 Aug 2011, 1 commit