1. 23 11月, 2016 1 次提交
  2. 15 11月, 2016 1 次提交
  3. 11 11月, 2016 4 次提交
  4. 10 5月, 2016 1 次提交
  5. 27 11月, 2015 1 次提交
  6. 14 10月, 2015 1 次提交
  7. 13 4月, 2015 1 次提交
  8. 25 3月, 2015 2 次提交
  9. 13 2月, 2015 1 次提交
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3
  10. 25 9月, 2014 1 次提交
  11. 20 5月, 2014 2 次提交
    • M
      s390: split TIF bits into CIF, PIF and TIF bits · d3a73acb
      Martin Schwidefsky 提交于
      The oi and ni instructions used in entry[64].S to set and clear bits
      in the thread-flags are not guaranteed to be atomic in regard to other
      CPUs. Split the TIF bits into CPU, pt_regs and thread-info specific
      bits. Updates on the TIF bits are done with atomic instructions,
      updates on CPU and pt_regs bits are done with non-atomic instructions.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      d3a73acb
    • M
      s390/uaccess: simplify control register updates · beef560b
      Martin Schwidefsky 提交于
      Always switch to the kernel ASCE in switch_mm. Load the secondary
      space ASCE in finish_arch_post_lock_switch after checking that
      any pending page table operations have completed. The primary
      ASCE is loaded in entry[64].S. With this the update_primary_asce
      call can be removed from the switch_to macro and from the start
      of switch_mm function. Remove the load_primary argument from
      update_user_asce/clear_user_asce, rename update_user_asce to
      set_user_asce and rename update_primary_asce to load_kernel_asce.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      beef560b
  12. 03 4月, 2014 1 次提交
    • H
      s390/uaccess: rework uaccess code - fix locking issues · 457f2180
      Heiko Carstens 提交于
      The current uaccess code uses a page table walk in some circumstances,
      e.g. in case of the in atomic futex operations or if running on old
      hardware which doesn't support the mvcos instruction.
      
      However it turned out that the page table walk code does not correctly
      lock page tables when accessing page table entries.
      In other words: a different cpu may invalidate a page table entry while
      the current cpu inspects the pte. This may lead to random data corruption.
      
      Adding correct locking however isn't trivial for all uaccess operations.
      Especially copy_in_user() is problematic since that requires to hold at
      least two locks, but must be protected against ABBA deadlock when a
      different cpu also performs a copy_in_user() operation.
      
      So the solution is a different approach where we change address spaces:
      
      User space runs in primary address mode, or access register mode within
      vdso code, like it currently already does.
      
      The kernel usually also runs in home space mode, however when accessing
      user space the kernel switches to primary or secondary address mode if
      the mvcos instruction is not available or if a compare-and-swap (futex)
      instruction on a user space address is performed.
      KVM however is special, since that requires the kernel to run in home
      address space while implicitly accessing user space with the sie
      instruction.
      
      So we end up with:
      
      User space:
      - runs in primary or access register mode
      - cr1 contains the user asce
      - cr7 contains the user asce
      - cr13 contains the kernel asce
      
      Kernel space:
      - runs in home space mode
      - cr1 contains the user or kernel asce
        -> the kernel asce is loaded when a uaccess requires primary or
           secondary address mode
      - cr7 contains the user or kernel asce, (changed with set_fs())
      - cr13 contains the kernel asce
      
      In case of uaccess the kernel changes to:
      - primary space mode in case of a uaccess (copy_to_user) and uses
        e.g. the mvcp instruction to access user space. However the kernel
        will stay in home space mode if the mvcos instruction is available
      - secondary space mode in case of futex atomic operations, so that the
        instructions come from primary address space and data from secondary
        space
      
      In case of kvm the kernel runs in home space mode, but cr1 gets switched
      to contain the gmap asce before the sie instruction gets executed. When
      the sie instruction is finished cr1 will be switched back to contain the
      user asce.
      
      A context switch between two processes will always load the kernel asce
      for the next process in cr1. So the first exit to user space is a bit
      more expensive (one extra load control register instruction) than before,
      however keeps the code rather simple.
      
      In sum this means there is no need to perform any error prone page table
      walks anymore when accessing user space.
      
      The patch seems to be rather large, however it mainly removes the
      the page table walk code and restores the previously deleted "standard"
      uaccess code, with a couple of changes.
      
      The uaccess without mvcos mode can be enforced with the "uaccess_primary"
      kernel parameter.
      Reported-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      457f2180
  13. 14 3月, 2014 1 次提交
  14. 21 2月, 2014 1 次提交
    • M
      s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries · 53e857f3
      Martin Schwidefsky 提交于
      Git commit 050eef36 "[S390] fix tlb flushing vs. concurrent
      /proc accesses" introduced the attach counter to avoid using the
      mm_users value to decide between IPTE for every PTE and lazy TLB
      flushing with IDTE. That fixed the problem with mm_users but it
      introduced another subtle race, fortunately one that is very hard
      to hit.
      The background is the requirement of the architecture that a valid
      PTE may not be changed while it can be used concurrently by another
      cpu. The decision between IPTE and lazy TLB flushing needs to be
      done while the PTE is still valid. Now if the virtual cpu is
      temporarily stopped after the decision to use lazy TLB flushing but
      before the invalid bit of the PTE has been set, another cpu can attach
      the mm, find that flush_mm is set, do the IDTE, return to userspace,
      and recreate a TLB that uses the PTE in question. When the first,
      stopped cpu continues it will change the PTE while it is attached on
      another cpu. The first cpu will do another IDTE shortly after the
      modification of the PTE which makes the race window quite short.
      
      To fix this race the CPU that wants to attach the address space of a
      user space thread needs to wait for the end of the PTE modification.
      The number of concurrent TLB flushers for an mm is tracked in the
      upper 16 bits of the attach_count and finish_arch_post_lock_switch
      is used to wait for the end of the flush operation if required.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      53e857f3
  15. 14 11月, 2013 1 次提交
  16. 26 4月, 2013 2 次提交
  17. 01 10月, 2012 2 次提交
  18. 20 7月, 2012 1 次提交
    • H
      s390/comments: unify copyright messages and remove file names · a53c8fab
      Heiko Carstens 提交于
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of sightly
      different statements and wanted to change them one after another
      whenever a file gets touched. However that never happened. Instead
      people start to take the old/"wrong" statements to use as a template
      for new files.
      So unify all of them in one go.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      a53c8fab
  19. 24 5月, 2012 2 次提交
  20. 16 5月, 2012 1 次提交
  21. 22 11月, 2011 1 次提交
  22. 30 10月, 2011 3 次提交
    • M
      [S390] add TIF_SYSCALL thread flag · b6ef5bb3
      Martin Schwidefsky 提交于
      Add an explicit TIF_SYSCALL bit that indicates if a task is inside
      a system call. The svc_code in the pt_regs structure is now only
      valid if TIF_SYSCALL is set. With this definition TIF_RESTART_SVC
      can be replaced with TIF_SYSCALL. Overall do_signal is a bit more
      readable and it saves a few lines of code.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      b6ef5bb3
    • M
      [S390] signal race with restarting system calls · 20b40a79
      Martin Schwidefsky 提交于
      For a ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
      do_signal will prepare the restart of the system call with a rewind of
      the PSW before calling get_signal_to_deliver (where the debugger might
      take control). For A ERESTART_RESTARTBLOCK restarting system call
      do_signal will set -EINTR as return code.
      There are two issues with this approach:
      1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
         ERESTART_RESTARTBLOCK as the rewinding already took place or the
         return code has been changed to -EINTR
      2) if get_signal_to_deliver does not return with a signal to deliver
         the restart via the repeat of the svc instruction is left in place.
         This opens a race if another signal is made pending before the
         system call instruction can be reexecuted. The original system call
         will be restarted even if the second signal would have ended the
         system call with -EINTR.
      
      These two issues can be solved by dropping the early rewind of the
      system call before get_signal_to_deliver has been called and by using
      the TIF_RESTART_SVC magic to do the restart if no signal has to be
      delivered. The only situation where the system call restart via the
      repeat of the svc instruction is appropriate is when a SA_RESTART
      signal is delivered to user space.
      
      Unfortunately this breaks inferior calls by the debugger again. The
      system call number and the length of the system call instruction is
      lost over the inferior call and user space will see ERESTARTNOHAND/
      ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this a
      new ptrace interface is added to save/restore the system call number
      and system call instruction length.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      20b40a79
    • T
      [S390] fix _TIF_SINGLE_STEP definition · 80853a8a
      Tejun Heo 提交于
      _TIF_SINGLE_STEP is incorrectly defined as 1<<TIF_FREEZE.  Fix it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      80853a8a
  23. 24 7月, 2011 1 次提交
    • M
      [S390] move sie code to entry.S · 603d1a50
      Martin Schwidefsky 提交于
      The entry to / exit from sie has subtle dependencies to the first level
      interrupt handler. Move the sie assembler code to entry64.S and replace
      the SIE_HOOK callback with a test and the new _TIF_SIE bit.
      In addition this patch fixes several problems in regard to the check for
      the_TIF_EXIT_SIE bits. The old code checked the TIF bits before executing
      the interrupt handler and it only modified the instruction address if it
      pointed directly to the sie instruction. In both cases it could miss
      a TIF bit that normally would cause an exit from the guest and would
      reenter the guest context.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      603d1a50
  24. 12 1月, 2011 1 次提交
  25. 05 1月, 2011 2 次提交
  26. 17 5月, 2010 1 次提交
  27. 14 5月, 2010 1 次提交
  28. 27 2月, 2010 1 次提交
  29. 14 1月, 2010 1 次提交