1. 26 Aug, 2014 1 commit
    • M
      KVM: s390/mm: use radix trees for guest to host mappings · 527e30b4
      Martin Schwidefsky committed
      Store the target address for the gmap segments in a radix tree
      instead of using invalid segment table entries. gmap_translate
      becomes a simple radix_tree_lookup, gmap_fault is split into the
      address translation with gmap_translate and the part that does
      the linking of the gmap shadow page table with the process page
      table.
      A second radix tree is used to keep the pointers to the segment
      table entries for segments that are mapped in the guest address
      space. On unmap of a segment the pointer is retrieved from the
      radix tree and is used to carry out the segment invalidation in
      the gmap shadow page table. As the radix tree can only store one
      pointer, each host segment may only be mapped to exactly one
      guest location.
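      The two lookups described above can be sketched in plain C. This is
      a toy model under stated assumptions, not the kernel code: two flat
      arrays stand in for the two radix trees, all toy_* names are
      hypothetical, mapping and linking are merged into one step, and an
      unmap drops the whole mapping rather than only invalidating a
      shadow table entry.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_SHIFT  20                 /* 1 MB segments */
#define SEG_MASK   ((UINT64_C(1) << SEG_SHIFT) - 1)
#define NSEGS      64                 /* toy address space size, in segments */
#define NO_MAPPING UINT64_MAX

/* Two flat tables stand in for the two radix trees (hypothetical names). */
struct toy_gmap {
    uint64_t guest_to_host[NSEGS];    /* guest segment -> host segment */
    uint64_t host_to_guest[NSEGS];    /* host segment -> guest segment */
};

void toy_gmap_init(struct toy_gmap *g)
{
    for (int i = 0; i < NSEGS; i++) {
        g->guest_to_host[i] = NO_MAPPING;
        g->host_to_guest[i] = NO_MAPPING;
    }
}

/* Map one guest segment to one host segment; both addresses must be
 * segment aligned. Returns 0 on success, -1 on any conflict. */
int toy_gmap_map(struct toy_gmap *g, uint64_t gaddr, uint64_t haddr)
{
    uint64_t gseg = gaddr >> SEG_SHIFT, hseg = haddr >> SEG_SHIFT;

    if (((gaddr | haddr) & SEG_MASK) || gseg >= NSEGS || hseg >= NSEGS)
        return -1;
    /* One slot per key: a host segment can back only one guest segment. */
    if (g->guest_to_host[gseg] != NO_MAPPING ||
        g->host_to_guest[hseg] != NO_MAPPING)
        return -1;
    g->guest_to_host[gseg] = hseg;
    g->host_to_guest[hseg] = gseg;
    return 0;
}

/* Translation is a single lookup plus the offset within the segment. */
uint64_t toy_gmap_translate(const struct toy_gmap *g, uint64_t gaddr)
{
    uint64_t gseg = gaddr >> SEG_SHIFT;

    if (gseg >= NSEGS || g->guest_to_host[gseg] == NO_MAPPING)
        return NO_MAPPING;
    return (g->guest_to_host[gseg] << SEG_SHIFT) | (gaddr & SEG_MASK);
}

/* Unmapping a host segment uses the reverse table to find what to drop. */
void toy_gmap_unmap_host(struct toy_gmap *g, uint64_t haddr)
{
    uint64_t hseg = haddr >> SEG_SHIFT;

    if (hseg >= NSEGS || g->host_to_guest[hseg] == NO_MAPPING)
        return;
    g->guest_to_host[g->host_to_guest[hseg]] = NO_MAPPING;
    g->host_to_guest[hseg] = NO_MAPPING;
}
```

      Because the reverse table has a single slot per host segment, a
      second toy_gmap_map() call that targets an already used host
      segment fails, which mirrors the one-to-one restriction.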
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
  2. 25 Aug, 2014 1 commit
  3. 20 May, 2014 1 commit
  4. 09 Apr, 2014 1 commit
  5. 03 Apr, 2014 2 commits
    • H
      s390/uaccess: rework uaccess code - fix locking issues · 457f2180
      Heiko Carstens committed
      The current uaccess code uses a page table walk in some circumstances,
      e.g. for in-atomic futex operations or when running on old
      hardware which doesn't support the mvcos instruction.
      
      However it turned out that the page table walk code does not correctly
      lock page tables when accessing page table entries.
      In other words: a different cpu may invalidate a page table entry while
      the current cpu inspects the pte. This may lead to random data corruption.
      
      Adding correct locking however isn't trivial for all uaccess operations.
      Especially copy_in_user() is problematic since it requires holding at
      least two locks, and must be protected against ABBA deadlock when a
      different cpu also performs a copy_in_user() operation.
      
      So the solution is a different approach where we change address spaces:
      
      User space runs in primary address mode, or access register mode within
      vdso code, like it currently already does.
      
      The kernel usually runs in home space mode, however when accessing
      user space the kernel switches to primary or secondary address mode if
      the mvcos instruction is not available or if a compare-and-swap (futex)
      instruction on a user space address is performed.
      KVM however is special, since that requires the kernel to run in home
      address space while implicitly accessing user space with the sie
      instruction.
      
      So we end up with:
      
      User space:
      - runs in primary or access register mode
      - cr1 contains the user asce
      - cr7 contains the user asce
      - cr13 contains the kernel asce
      
      Kernel space:
      - runs in home space mode
      - cr1 contains the user or kernel asce
        -> the kernel asce is loaded when a uaccess requires primary or
           secondary address mode
      - cr7 contains the user or kernel asce, (changed with set_fs())
      - cr13 contains the kernel asce
      
      In case of uaccess the kernel changes to:
      - primary space mode in case of a uaccess (copy_to_user) and uses
        e.g. the mvcp instruction to access user space. However the kernel
        will stay in home space mode if the mvcos instruction is available
      - secondary space mode in case of futex atomic operations, so that the
        instructions come from primary address space and data from secondary
        space
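      The mode selection in the list above can be written down as a small
      decision function. This is a sketch that only restates the rules
      from this message; the enum and function names are hypothetical and
      do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Address space modes named in the commit message. */
enum space_mode { HOME_SPACE, PRIMARY_SPACE, SECONDARY_SPACE };

/* Two kinds of user space access distinguished by the message. */
enum uaccess_kind { UACCESS_COPY, UACCESS_FUTEX_ATOMIC };

/* Which mode the kernel runs in while performing the access. */
enum space_mode uaccess_space_mode(enum uaccess_kind kind, bool have_mvcos)
{
    /* Futex atomics: instructions from primary, data from secondary. */
    if (kind == UACCESS_FUTEX_ATOMIC)
        return SECONDARY_SPACE;
    /* Plain copies stay in home space when mvcos is available,
     * otherwise switch to primary space (e.g. for mvcp). */
    return have_mvcos ? HOME_SPACE : PRIMARY_SPACE;
}
```

      For example, uaccess_space_mode(UACCESS_COPY, false) yields
      PRIMARY_SPACE, the case that the "uaccess_primary" kernel
      parameter enforces.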
      
      In case of kvm the kernel runs in home space mode, but cr1 gets switched
      to contain the gmap asce before the sie instruction gets executed. When
      the sie instruction is finished cr1 will be switched back to contain the
      user asce.
      
      A context switch between two processes will always load the kernel asce
      for the next process in cr1. So the first exit to user space is a bit
      more expensive (one extra load control register instruction) than before,
      but this keeps the code rather simple.
      
      In sum this means there is no need to perform any error prone page table
      walks anymore when accessing user space.
      
      The patch seems to be rather large, however it mainly removes the
      page table walk code and restores the previously deleted "standard"
      uaccess code, with a couple of changes.
      
      The uaccess without mvcos mode can be enforced with the "uaccess_primary"
      kernel parameter.
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • T
      s390/irq: Use defines for external interruption codes · 1dad093b
      Thomas Huth committed
      Use the new defines for external interruption codes to get rid
      of "magic" numbers in the s390 source code. And while we're at it,
      also rename the (un-)register_external_interrupt function to
      something shorter so that this patch does not exceed the 80
      columns all over the place.
      Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  6. 30 Jan, 2014 1 commit
  7. 04 Nov, 2013 1 commit
    • M
      s390/mm,tlb: correct tlb flush on page table upgrade · 10607864
      Martin Schwidefsky committed
      The IDTE instruction used to flush TLB entries for a specific address
      space uses the address-space-control element (ASCE) to identify
      affected TLB entries. The upgrade of a page table adds a new top
      level page table which changes the ASCE. The TLB entries associated
      with the old ASCE need to be flushed and the ASCE for the address space
      needs to be replaced synchronously on all CPUs which currently use it.
      The concept of a lazy ASCE update with an exception handler is broken.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  8. 24 Oct, 2013 1 commit
  9. 13 Sep, 2013 1 commit
  10. 04 Sep, 2013 1 commit
  11. 15 Jul, 2013 1 commit
    • P
      s390: delete __cpuinit usage from all s390 files · e2741f17
      Paul Gortmaker committed
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up(), which are arch independent
      (kernel/cpu.c), are flagged as __cpuinit -- so if we remove the
      __cpuinit from arch specific callers, we will also get section
      mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      content into no-ops as early as possible, since that will get rid
      of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the arch/s390 uses of the __cpuinit macros from
      all C files.  Currently s390 does not have any __CPUINIT used in
      assembly files.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: linux-s390@vger.kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  12. 17 Apr, 2013 1 commit
    • M
      s390/mm: protection exception PSW for aborted transaction · f752ac4d
      Martin Schwidefsky committed
      Protection exceptions are usually suppressing, and the fault handler
      needs to rewind the PSW by the instruction length to get the correct
      fault address. The exception is a protection exception raised while
      the CPU is in the middle of a transaction. The CPU stores the
      transaction abort PSW at the start of the transaction; if the
      transaction is aborted, the PSW is already correct and must not be
      modified by the fault handler.
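      The rule can be summarised in a few lines of C. This is only a
      sketch: the function name and the boolean parameter are
      hypothetical, and how the real handler detects an aborted
      transaction is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the PSW address the fault handler should work with.
 * psw_addr:            address stored in the PSW at exception time
 * ilen:                length in bytes of the faulting instruction
 * transaction_aborted: true if the exception aborted a transaction */
uint64_t toy_fault_psw_addr(uint64_t psw_addr, unsigned int ilen,
                            bool transaction_aborted)
{
    /* The transaction abort PSW is already correct, leave it alone. */
    if (transaction_aborted)
        return psw_addr;
    /* Suppressing exception: step back by the instruction length. */
    return psw_addr - ilen;
}
```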
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  13. 08 Jan, 2013 1 commit
    • H
      s390/irq: remove split irq fields from /proc/stat · 420f42ec
      Heiko Carstens committed
      Now that irq sum accounting for /proc/stat's "intr" line works again we
      have the oddity that the sum field (first field) contains only the sum
      of the second (external irqs) and third field (I/O interrupts).
      The reason for that is that these two fields are already sums of all other
      fields. So if we summed up everything we would count every interrupt
      twice.
      This has been broken since the split interrupt accounting was merged two
      years ago: 052ff461 "[S390] irq: have detailed
      statistics for interrupt types".
      To fix this remove the split interrupt fields from /proc/stat's "intr"
      line again and only have them in /proc/interrupts.
      
      This restores the old behaviour, seems to be the only sane fix and mimics
      a behaviour from other architectures where /proc/interrupts also contains
      more than /proc/stat's "intr" line does.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  14. 23 Nov, 2012 2 commits
  15. 09 Oct, 2012 1 commit
  16. 26 Sep, 2012 3 commits
  17. 30 Jul, 2012 4 commits
  18. 20 Jul, 2012 1 commit
    • H
      s390/comments: unify copyright messages and remove file names · a53c8fab
      Heiko Carstens committed
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of slightly
      different statements and wanted to change them one after another
      whenever a file gets touched. However that never happened. Instead
      people started to take the old/"wrong" statements to use as a template
      for new files.
      So unify all of them in one go.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  19. 16 May, 2012 5 commits
  20. 29 Mar, 2012 1 commit
  21. 11 Mar, 2012 1 commit
    • H
      [S390] irq: external interrupt code passing · fde15c3a
      Heiko Carstens committed
      The external interrupt handlers have a parameter called ext_int_code.
      Despite its name, this parameter contains not only the ext_int_code
      but also the "cpu address" (POP) which caused the external
      interrupt.
      To make the code a bit more obvious pass a struct instead so the called
      function can easily distinguish between external interrupt code and
      cpu address. The cpu address field however is named "subcode" since
      some external interrupt sources do not pass a cpu address but a
      different parameter (or none at all).
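      A minimal sketch of the idea follows. The struct and function names
      are hypothetical, and the assumed layout of the raw 32 bit value
      (subcode in the upper half, interruption code in the lower half)
      is an illustration, not something stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct passed to the handlers. */
struct toy_ext_code {
    uint16_t subcode;   /* cpu address, or another source specific value */
    uint16_t code;      /* external interruption code */
};

/* Split a combined 32 bit value into its two halves (assumed layout). */
struct toy_ext_code toy_split_ext_code(uint32_t raw)
{
    struct toy_ext_code ec;

    ec.subcode = (uint16_t)(raw >> 16);
    ec.code = (uint16_t)(raw & 0xffff);
    return ec;
}
```

      A handler that receives the struct can then compare ec.code
      against the code it registered for without masking, and read
      ec.subcode only when its interrupt source defines one.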
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  22. 27 Feb, 2012 1 commit
  23. 27 Dec, 2011 2 commits
    • M
      [S390] cleanup trap handling · aa33c8cb
      Martin Schwidefsky committed
      Move the program interruption code and the translation exception identifier
      to the pt_regs structure as 'int_code' and 'int_parm_long' and make the
      first level interrupt handler in entry[64].S store the two values. That
      makes it possible to drop 'prot_addr' and 'trap_no' from the thread_struct
      and to reduce the number of arguments to a lot of functions. Finally
      un-inline do_trap. Overall this saves 5812 bytes in the .text section of
      the 64 bit kernel.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • C
      [S390] disable MACHINE_IS_VM check for pfault · f32269a0
      Carsten Otte committed
      This patch disables the check for MACHINE_IS_VM when initializing the
      pfault infrastructure. The code checks for successful completion of
      diag 258 anyway, so it is safe to try initialization on LPAR as well.
      This is needed to use pfault on kvm.
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  24. 14 Nov, 2011 1 commit
  25. 30 Oct, 2011 3 commits
  26. 24 Jul, 2011 1 commit
    • M
      [S390] kvm guest address space mapping · e5992f2e
      Martin Schwidefsky committed
      Add code that allows KVM to control the virtual memory layout that
      is seen by a guest. The guest address space uses a second page table
      that shares the last level pte-tables with the process page table.
      If a page is unmapped from the process page table it is automatically
      unmapped from the guest page table as well.
      
      The guest address space mapping starts out empty. KVM can map any
      individual 1MB segment from the process virtual memory to any 1MB
      aligned location in the guest virtual memory. If a target segment in
      the process virtual memory does not exist or is unmapped while a
      guest mapping exists, the desired target address is stored as an
      invalid segment table entry in the guest page table.
      The population of the guest page table is fault driven.
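      The 1MB granularity implies a simple alignment check on every
      mapping request. The helper below is a sketch of such a check; its
      name and parameter list are hypothetical and are not taken from
      the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SEGMENT_SIZE (UINT64_C(1) << 20)   /* 1 MB */

/* Accept a request only if the source address, the target address and
 * the length are all non-trivially 1 MB aligned. */
bool toy_segment_request_ok(uint64_t from, uint64_t to, uint64_t len)
{
    if (len == 0)
        return false;
    /* OR-ing the values lets one mask test cover all three. */
    return ((from | to | len) & (TOY_SEGMENT_SIZE - 1)) == 0;
}
```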
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>