1. 20 Jul, 2015 (1 commit)
  2. 08 May, 2015 (1 commit)
  3. 25 Mar, 2015 (2 commits)
  4. 08 Dec, 2014 (1 commit)
  5. 25 Sep, 2014 (1 commit)
  6. 20 May, 2014 (2 commits)
    • s390: split TIF bits into CIF, PIF and TIF bits · d3a73acb
      Committed by Martin Schwidefsky
      The oi and ni instructions used in entry[64].S to set and clear bits
      in the thread-flags are not guaranteed to be atomic with regard to other
      CPUs. Split the TIF bits into CPU, pt_regs and thread-info specific
      bits. Updates on the TIF bits are done with atomic instructions;
      updates on CPU and pt_regs bits are done with non-atomic instructions.
      (A user-space sketch of the non-atomic update race follows this entry.)
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      d3a73acb
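      The race this split avoids can be shown in a few lines of user-space C.
      This is a minimal sketch, not kernel code: the flag word and bit names
      are made up, and GCC's __atomic builtins stand in for the kernel's
      atomic instructions.

        #include <pthread.h>
        #include <stdio.h>

        static unsigned long flags;        /* stands in for thread_info->flags */
        #define BIT_A (1UL << 0)           /* updated non-atomically, like oi/ni */
        #define BIT_B (1UL << 1)           /* updated atomically */

        static void *nonatomic_updater(void *arg)
        {
            /* Separate load and store, as with oi/ni: an update to another
             * bit in the same word can be lost between the two steps. */
            for (long i = 0; i < 10000000; i++) {
                flags |= BIT_A;
                flags &= ~BIT_A;
            }
            (void)arg;
            return NULL;
        }

        int main(void)
        {
            pthread_t t;
            pthread_create(&t, NULL, nonatomic_updater, NULL);
            __atomic_fetch_or(&flags, BIT_B, __ATOMIC_SEQ_CST);
            for (long i = 0; i < 10000000; i++) {
                if (!(__atomic_load_n(&flags, __ATOMIC_SEQ_CST) & BIT_B)) {
                    puts("BIT_B lost to a non-atomic read-modify-write");
                    break;
                }
            }
            pthread_join(t, NULL);
            return 0;
        }

      Compile with gcc -pthread; on a typical run the atomically set BIT_B
      will eventually be wiped by the plain |=/&= sequence in the other thread.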
    • s390/uaccess: simplify control register updates · beef560b
      Committed by Martin Schwidefsky
      Always switch to the kernel ASCE in switch_mm. Load the secondary
      space ASCE in finish_arch_post_lock_switch after checking that
      any pending page table operations have completed. The primary
      ASCE is loaded in entry[64].S. With this, the update_primary_asce
      call can be removed from the switch_to macro and from the start
      of the switch_mm function. Remove the load_primary argument from
      update_user_asce/clear_user_asce, rename update_user_asce to
      set_user_asce and rename update_primary_asce to load_kernel_asce.
      (A schematic sketch of the new ordering follows this entry.)
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      beef560b
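      A schematic C sketch of that ordering follows. The helpers are stubs
      and the ASCE values placeholders; this mirrors the commit message, not
      the real s390 implementation.

        struct mm_struct { unsigned long asce; volatile int tlb_ops_pending; };

        #define KERNEL_ASCE 0UL                      /* placeholder value */

        static void load_cr1(unsigned long asce) { } /* lctlg %c1 on real hw */
        static void load_cr7(unsigned long asce) { } /* lctlg %c7 on real hw */

        void switch_mm(struct mm_struct *next)
        {
            /* Always run on the kernel ASCE while switching; the user's
             * primary ASCE is loaded later, on exit in entry[64].S. */
            load_cr1(KERNEL_ASCE);
            (void)next;
        }

        void finish_arch_post_lock_switch(struct mm_struct *mm)
        {
            /* Expose the new mm via the secondary space ASCE only after
             * pending page table operations have completed; this is the
             * set_user_asce() step in the commit's naming. */
            while (mm->tlb_ops_pending)
                ;                                    /* cpu_relax() in real code */
            load_cr7(mm->asce);
        }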
  7. 22 Apr, 2014 (1 commit)
  8. 03 Apr, 2014 (1 commit)
    • s390/uaccess: rework uaccess code - fix locking issues · 457f2180
      Committed by Heiko Carstens
      The current uaccess code uses a page table walk in some circumstances,
      e.g. for in-atomic futex operations or when running on old
      hardware which doesn't support the mvcos instruction.
      
      However, it turned out that the page table walk code does not correctly
      lock page tables when accessing page table entries.
      In other words, a different cpu may invalidate a page table entry while
      the current cpu inspects the pte. This may lead to random data corruption.
      
      Adding correct locking, however, isn't trivial for all uaccess operations.
      copy_in_user() is especially problematic, since it requires holding at
      least two locks but must be protected against ABBA deadlock when a
      different cpu also performs a copy_in_user() operation.
      
      So the solution is a different approach where we change address spaces:
      
      User space runs in primary address mode, or access register mode within
      vdso code, like it currently already does.
      
      The kernel usually also runs in home space mode; however, when accessing
      user space the kernel switches to primary or secondary address mode if
      the mvcos instruction is not available or if a compare-and-swap (futex)
      instruction on a user space address is performed.
      KVM, however, is special, since it requires the kernel to run in home
      address space while implicitly accessing user space with the sie
      instruction.
      
      So we end up with:
      
      User space:
      - runs in primary or access register mode
      - cr1 contains the user asce
      - cr7 contains the user asce
      - cr13 contains the kernel asce
      
      Kernel space:
      - runs in home space mode
      - cr1 contains the user or kernel asce
        -> the kernel asce is loaded when a uaccess requires primary or
           secondary address mode
      - cr7 contains the user or kernel asce, (changed with set_fs())
      - cr13 contains the kernel asce
      
      In case of uaccess the kernel changes to:
      - primary space mode in case of a uaccess (copy_to_user), using
        e.g. the mvcp instruction to access user space. However, the kernel
        will stay in home space mode if the mvcos instruction is available
      - secondary space mode in case of futex atomic operations, so that the
        instructions come from primary address space and data from secondary
        space
      
      In case of KVM the kernel runs in home space mode, but cr1 gets switched
      to contain the gmap asce before the sie instruction gets executed. When
      the sie instruction is finished, cr1 will be switched back to contain the
      user asce.
      
      A context switch between two processes will always load the kernel asce
      for the next process in cr1. So the first exit to user space is a bit
      more expensive (one extra load control register instruction) than before,
      but this keeps the code rather simple.
      
      In sum this means there is no need to perform any error-prone page table
      walks anymore when accessing user space.
      
      The patch seems to be rather large; however, it mainly removes the
      page table walk code and restores the previously deleted "standard"
      uaccess code, with a couple of changes.
      
      The non-mvcos uaccess mode can be enforced with the "uaccess_primary"
      kernel parameter.
      (A schematic sketch of this dispatch follows this entry.)
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      457f2180
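      The resulting dispatch might look like the sketch below. The helper
      names and the have_mvcos flag are assumptions for illustration; the
      bodies are stubs, not the kernel's actual uaccess code.

        /* Hypothetical flag, cleared at boot when mvcos is absent or the
         * "uaccess_primary" parameter is given. */
        static int have_mvcos = 1;

        static unsigned long copy_to_user_mvcos(void *to, const void *from,
                                                unsigned long n)
        {
            /* mvcos copies across address spaces while the CPU stays in
             * home space mode; no address mode switch is needed. */
            return 0;                        /* 0 = all bytes copied */
        }

        static unsigned long copy_to_user_mvcp(void *to, const void *from,
                                               unsigned long n)
        {
            /* Fallback: switch to primary space mode as described above and
             * copy with mvcp; cr1 must hold the kernel ASCE beforehand. */
            return 0;
        }

        unsigned long copy_to_user(void *to, const void *from, unsigned long n)
        {
            return have_mvcos ? copy_to_user_mvcos(to, from, n)
                              : copy_to_user_mvcp(to, from, n);
        }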
  9. 21 Feb, 2014 (1 commit)
    • s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries · 53e857f3
      Committed by Martin Schwidefsky
      Git commit 050eef36 "[S390] fix tlb flushing vs. concurrent
      /proc accesses" introduced the attach counter to avoid using the
      mm_users value to decide between IPTE for every PTE and lazy TLB
      flushing with IDTE. That fixed the problem with mm_users but it
      introduced another subtle race, fortunately one that is very hard
      to hit.
      The background is the requirement of the architecture that a valid
      PTE may not be changed while it can be used concurrently by another
      cpu. The decision between IPTE and lazy TLB flushing needs to be
      done while the PTE is still valid. Now if the virtual cpu is
      temporarily stopped after the decision to use lazy TLB flushing but
      before the invalid bit of the PTE has been set, another cpu can attach
      the mm, find that flush_mm is set, do the IDTE, return to userspace,
      and recreate a TLB that uses the PTE in question. When the first,
      stopped cpu continues, it will change the PTE while it is attached on
      another cpu. The first cpu will do another IDTE shortly after the
      modification of the PTE, which makes the race window quite short.
      
      To fix this race the CPU that wants to attach the address space of a
      user space thread needs to wait for the end of the PTE modification.
      The number of concurrent TLB flushers for an mm is tracked in the
      upper 16 bits of the attach_count, and finish_arch_post_lock_switch
      is used to wait for the end of the flush operation if required.
      (A small sketch of the counting scheme follows this entry.)
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      53e857f3
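      The counting scheme can be sketched with C11 atomics. The layout
      follows the commit message (attachers in the low 16 bits, flushers in
      the high 16 bits); the helper names are stand-ins.

        #include <stdatomic.h>

        static atomic_uint attach_count;  /* low 16 bits: attachers,
                                             high 16 bits: active flushers */
        #define FLUSHER (1u << 16)

        void flush_start(void) { atomic_fetch_add(&attach_count, FLUSHER); }
        void flush_end(void)   { atomic_fetch_sub(&attach_count, FLUSHER); }

        /* The finish_arch_post_lock_switch role: before attaching the mm,
         * wait until no CPU sits between "decided to flush lazily" and
         * "PTE invalidated". */
        void wait_for_flushers(void)
        {
            while (atomic_load(&attach_count) >> 16)
                ;                         /* cpu_relax() in real code */
        }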
  10. 16 Jan, 2014 (1 commit)
  11. 16 Dec, 2013 (1 commit)
  12. 30 Sep, 2013 (1 commit)
  13. 28 Aug, 2013 (1 commit)
  14. 22 Aug, 2013 (1 commit)
    • s390: convert interrupt handling to use generic hardirq · 1f44a225
      Committed by Martin Schwidefsky
      With the introduction of PCI it became apparent that s390 should
      convert to generic hardirqs as too many drivers do not have the
      correct dependency for GENERIC_HARDIRQS. On the architecture
      level s390 does not have irq lines. It has external interrupts,
      I/O interrupts and adapter interrupts. This patch hard-codes all
      external interrupts as irq #1, all I/O interrupts as irq #2 and
      all adapter interrupts as irq #3. The additional information from
      the lowcore associated with the interrupt is stored in the
      pt_regs of the interrupt frame, where the interrupt handler can
      pick it up. For PCI/MSI interrupts the adapter interrupt handler
      scans the relevant bit fields and calls generic_handle_irq with
      the virtual irq number for the MSI interrupt.
      (A short sketch of the hard-coded irq numbers follows this entry.)
      Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      1f44a225
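      In outline the mapping reads like the sketch below; the interrupt
      class names match the commit's description but should be treated as
      illustrative, and the handler is a stub.

        #define EXT_INTERRUPT  1          /* all external interrupts */
        #define IO_INTERRUPT   2          /* all I/O interrupts */
        #define THIN_INTERRUPT 3          /* all adapter interrupts */

        struct pt_regs;                   /* carries the lowcore information */

        static void generic_handle_irq(unsigned int irq) { }   /* stub */

        void do_IRQ(struct pt_regs *regs, unsigned int irq)
        {
            /* irq is one of the three hard-coded numbers above; the handler
             * picks up the extra lowcore data from regs. For PCI/MSI the
             * adapter handler instead scans its bit fields and forwards
             * each hit to generic_handle_irq under its virtual irq number. */
            generic_handle_irq(irq);
            (void)regs;
        }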
  15. 27 Jun, 2013 (1 commit)
  16. 17 Jun, 2013 (1 commit)
  17. 21 May, 2013 (3 commits)
  18. 26 Apr, 2013 (2 commits)
  19. 05 Mar, 2013 (1 commit)
    • s390: critical section cleanup vs. machine checks · 6551fbdf
      Committed by Martin Schwidefsky
      The current machine check code uses the registers stored by the machine
      in the lowcore at __LC_GPREGS_SAVE_AREA as the registers of the interrupted
      context. The registers 0-7 of a user process can get clobbered if a machine
      check interrupts the execution of a critical section in entry[64].S.
      
      The reason is that the critical section cleanup code may need to modify
      the PSW and the registers for the previous context to get to the end of a
      critical section. If registers 0-7 have to be replaced, the relevant copy
      will be in the registers, which invalidates the copy in the lowcore. The
      machine check handler needs to explicitly store registers 0-7 to the stack.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      6551fbdf
  20. 14 Feb, 2013 (1 commit)
  21. 23 Nov, 2012 (3 commits)
    • s390/kvm: Fix address space mixup · ce6a04ac
      Committed by Christian Borntraeger
      I was chasing down a bug of random validity intercepts on s390
      (guest prefix page not mapped in the host virtual address space). It
      turned out that the problem was a wrong address space control element.
      The cause was quite complex:
      
      During paging activity a DAT protection during SIE caused a program
      interrupt. Normally, the sie retry loop tries to catch all
      interrupts during and shortly before sie to rerun the setup. The
      problem is that protection causes a suppressing program interrupt,
      so the PSW points to the instruction AFTER SIE in case of DAT
      protection. This confused the logic of the retry loop, so it did not
      trigger; instead we jumped directly back to SIE after return from
      the program interrupt. (The protection fault handler itself did
      a rewind of the psw.) This usually works quite well, but:
      
      If the protection fault handler now has to wait, another program
      might be scheduled in. Later on the sie process will be scheduled
      in again. In that case the content of CR1 (primary address space)
      will be wrong, because switch_to will put the user space ASCE into CR1
      and not the guest ASCE.
      
      In addition the program parameter is also wrong for every protection
      fault of a guest, since we don't issue the SPP instruction.
      
      So let's also check for PSW == instruction after SIE in the program
      check handler. Instead of expensively checking all program
      interruption codes that might be suppressing, we assume that a program
      interrupt pointing after SIE was always a program interrupt in SIE.
      (Otherwise we have a kernel bug anyway.)
      
      We also have to compensate for the rewinding, since the C-level handlers
      will do that. Therefore we need to add a nop with the same length
      as SIE before the sie_loop.
      (A small sketch of the PSW check follows this entry.)
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      CC: stable@vger.kernel.org
      CC: Heiko Carstens <heiko.carstens@de.ibm.com>
      ce6a04ac
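      The added check could be sketched as below; sie_done is a hypothetical
      label marking the address right after the sie instruction (the real
      check lives in the assembler entry code).

        extern char sie_done[];   /* hypothetical label placed after sie */

        /* A suppressing program interrupt inside SIE leaves the PSW
         * pointing at the instruction after sie, so that address alone
         * identifies the case. */
        int program_check_in_sie(unsigned long psw_addr)
        {
            return psw_addr == (unsigned long)sie_done;
        }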
    • s390/ptrace: race of single stepping vs signal delivery · 39efd4ec
      Committed by Martin Schwidefsky
      The current single step code is racy with regard to concurrent delivery
      of signals. If a signal is delivered after a PER program check occurred
      but before the TIF_PER_TRAP bit has been checked in entry[64].S, the code
      clears TIF_PER_TRAP and then calls do_signal. This is wrong: if the
      instruction completed (or has been suppressed) a SIGTRAP should be
      delivered to the debugger in any case. Only if the instruction has been
      nullified may the SIGTRAP not be sent.
      
      The new logic always sets TIF_PER_TRAP if the program check indicates PER
      tracing but removes it again for all program checks that are nullifying.
      The effect is that for each change in the PSW address we now get a
      single SIGTRAP.
      (A small sketch of this logic follows this entry.)
      Reported-by: Andreas Arnez <arnez@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      39efd4ec
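      A stand-in sketch of that rule, with placeholder bit values and a
      stubbed nullifying test:

        #define TIF_PER_TRAP     (1UL << 6)   /* placeholder bit value */
        #define PGM_INT_CODE_PER 0x80         /* placeholder PER indication */

        static int pgm_is_nullifying(unsigned int code)
        {
            return 0;                         /* table lookup in reality */
        }

        void note_per_trap(unsigned int pgm_int_code, unsigned long *ti_flags)
        {
            if (pgm_int_code & PGM_INT_CODE_PER)
                *ti_flags |= TIF_PER_TRAP;    /* completed or suppressed:
                                                 deliver a SIGTRAP */
            if (pgm_is_nullifying(pgm_int_code))
                *ti_flags &= ~TIF_PER_TRAP;   /* PSW did not advance */
        }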
    • s390/traps: preinitialize program check table · b01a37a7
      Committed by Heiko Carstens
      Preinitialize the program check table, so we can put it into the
      read-only data section.
      Also use only four-byte entries for the table, since each program
      check handler resides within the first 2GB. This reduces
      the size of the table by 50% on 64 bit builds.
      (A sketch of such a table follows this entry.)
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      b01a37a7
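      The shape of such a table, sketched with full pointers since portable
      C cannot express the four-byte, 31-bit-address entries the patch
      actually uses; the handler names are illustrative stubs.

        typedef void (*pgm_check_handler_t)(void);

        static void default_trap_handler(void) { }
        static void illegal_op_handler(void)   { }

        /* const initialization lets the table live in .rodata. */
        static const pgm_check_handler_t pgm_check_table[] = {
            default_trap_handler,     /* 0x00 */
            illegal_op_handler,       /* 0x01: operation exception */
            /* ... one entry per program interruption code ... */
        };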
  22. 29 Oct, 2012 (1 commit)
  23. 09 Oct, 2012 (1 commit)
  24. 01 Oct, 2012 (3 commits)
  25. 26 Sep, 2012 (2 commits)
  26. 20 Jul, 2012 (2 commits)
    • s390/vtimer: rework virtual timer interface · 27f6b416
      Committed by Martin Schwidefsky
      The current virtual timer interface is inherently per-cpu and hard to
      use. The sole user of the interface is appldata, which uses it to execute
      a function after a specific amount of cputime has been used over all cpus.
      
      Rework the virtual timer interface to hook into the cputime accounting.
      This makes the interface independent of the CPU timer interrupts, and
      makes the virtual timers global as opposed to per-cpu.
      Overall the code is greatly simplified. The downside is that the accuracy
      is not as good as the original implementation, but it is still good enough
      for appldata.
      (A schematic sketch of the accounting hook follows this entry.)
      Reviewed-by: Jan Glauber <jang@linux.vnet.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      27f6b416
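      Schematically, driving a timer from cputime accounting could look
      like this; every name is a stand-in, and the real interface keeps a
      sorted list of timers rather than a single one.

        struct vtimer {
            unsigned long long expires;     /* cputime left until it fires */
            void (*function)(unsigned long);
            unsigned long data;
        };

        static struct vtimer *timer;        /* one global timer, for brevity */

        /* Called from the cputime accounting path with the cputime just
         * charged; no per-cpu timer interrupt is involved anymore. */
        void vtime_account(unsigned long long charged)
        {
            if (!timer)
                return;
            if (timer->expires > charged) {
                timer->expires -= charged;
            } else {
                timer->function(timer->data);
                timer = 0;                  /* one-shot in this sketch */
            }
        }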
    • s390/comments: unify copyright messages and remove file names · a53c8fab
      Committed by Heiko Carstens
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of slightly
      different statements and wanted to change them one after another
      whenever a file gets touched. However, that never happened. Instead,
      people started to take the old/"wrong" statements as a template
      for new files.
      So unify all of them in one go.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      a53c8fab
  27. 14 Jun, 2012 (1 commit)
    • s390/smp: make absolute lowcore / cpu restart parameter accesses more robust · fbe76568
      Committed by Heiko Carstens
      Setting the cpu restart parameters is done in three different fashions:
      - directly setting the four parameters individually
      - copying the four parameters with memcpy (using 4 * sizeof(long))
      - copying the four parameters using a private structure
      
      In addition, code in entry*.S relies on a certain order of the restart
      members of struct _lowcore.
      
      Make all of this more robust to future changes by adding a
      mem_absolute_assign(dest, val) define, which assigns val to dest
      using absolute addressing mode (see the sketch after this entry).
      Also the load multiple instructions in entry*.S have been split into
      separate load instructions, so the order of the struct _lowcore
      members doesn't matter anymore.
      
      In addition move the prototypes of memcpy_real/absolute from uaccess.h
      to processor.h. These memcpy* variants are not related to uaccess at all.
      string.h doesn't seem to be a good match either, so let's use processor.h.
      
      Also replace the eight byte array in struct _lowcore which represents a
      misaligned u64 with a u64. The compiler will always create code that
      handles the misaligned u64 correctly.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      fbe76568
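      A portable stand-in for the define might look like this; the real
      version generates an s390 store to the absolute (unprefixed) address,
      which plain C cannot express.

        /* volatile forces a real store; the actual macro additionally
         * targets the absolute address of dest via inline assembly. */
        #define mem_absolute_assign(dest, val) do {                    \
                __typeof__(dest) __val = (val);                        \
                *(volatile __typeof__(dest) *)&(dest) = __val;         \
        } while (0)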
  28. 05 Jun, 2012 (2 commits)