1. 21 11月, 2019 1 次提交
    • V
      s390/vdso: avoid 64-bit vdso mapping for compat tasks · 84bfa034
      Vasily Gorbik 提交于
      [ Upstream commit d1befa65823e9c6d013883b8a41d081ec338c489 ]
      
      vdso_fault used is_compat_task function (on s390 it tests "current"
      thread_info flags) to distinguish compat tasks and map 31-bit vdso
      pages. But "current" task might not correspond to mm context.
      
      When 31-bit compat inferior is executed under gdb, gdb does
      PTRACE_PEEKTEXT on vdso page, causing vdso_fault with "current" being
      64-bit gdb process. So, 31-bit inferior ends up with 64-bit vdso mapped.
      
      To avoid this problem a new compat_mm flag has been introduced into
      mm context. This flag is used in vdso_fault and vdso_mremap instead
      of is_compat_task.
      Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      84bfa034
  2. 13 11月, 2019 1 次提交
  3. 06 11月, 2019 3 次提交
    • H
      s390/idle: fix cpu idle time calculation · 8dd60660
      Heiko Carstens 提交于
      commit 3d7efa4edd07be5c5c3ffa95ba63e97e070e1f3f upstream.
      
      The idle time reported in /proc/stat sometimes incorrectly contains
      huge values on s390. This is caused by a bug in arch_cpu_idle_time().
      
      The kernel tries to figure out when a different cpu entered idle by
      accessing its per-cpu data structure. There is an ordering problem: if
      the remote cpu has an idle_enter value which is not zero, and an
      idle_exit value which is zero, it is assumed it is idle since
      "now". The "now" timestamp however is taken before the idle_enter
      value is read.
      
      Which in turn means that "now" can be smaller than idle_enter of the
      remote cpu. Unconditionally subtracting idle_enter from "now" can thus
      lead to a negative value (aka large unsigned value).
      
      Fix this by moving the get_tod_clock() invocation out of the
      loop. While at it also make the code a bit more readable.
      
      A similar bug also exists for show_idle_time(). Fix this is as well.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dd60660
    • Y
      s390/cmm: fix information leak in cmm_timeout_handler() · ced8cb02
      Yihui ZENG 提交于
      commit b8e51a6a9db94bc1fb18ae831b3dab106b5a4b5f upstream.
      
      The problem is that we were putting the NUL terminator too far:
      
      	buf[sizeof(buf) - 1] = '\0';
      
      If the user input isn't NUL terminated and they haven't initialized the
      whole buffer then it leads to an info leak.  The NUL terminator should
      be:
      
      	buf[len - 1] = '\0';
      Signed-off-by: NYihui Zeng <yzeng56@asu.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      [heiko.carstens@de.ibm.com: keep semantics of how *lenp and *ppos are handled]
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ced8cb02
    • C
      s390/uaccess: avoid (false positive) compiler warnings · 12e13266
      Christian Borntraeger 提交于
      [ Upstream commit 062795fcdcb2d22822fb42644b1d76a8ad8439b3 ]
      
      Depending on inlining decisions by the compiler, __get/put_user_fn
      might become out of line. Then the compiler is no longer able to tell
      that size can only be 1,2,4 or 8 due to the check in __get/put_user
      resulting in false positives like
      
      ./arch/s390/include/asm/uaccess.h: In function ‘__put_user_fn’:
      ./arch/s390/include/asm/uaccess.h:113:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized]
        113 |  return rc;
            |         ^~
      ./arch/s390/include/asm/uaccess.h: In function ‘__get_user_fn’:
      ./arch/s390/include/asm/uaccess.h:143:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized]
        143 |  return rc;
            |         ^~
      
      These functions are supposed to be always inlined. Mark it as such.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      12e13266
  4. 12 10月, 2019 3 次提交
  5. 08 10月, 2019 1 次提交
  6. 05 10月, 2019 1 次提交
  7. 21 9月, 2019 2 次提交
  8. 19 9月, 2019 2 次提交
  9. 29 8月, 2019 1 次提交
  10. 16 8月, 2019 1 次提交
  11. 21 7月, 2019 1 次提交
    • H
      s390: fix stfle zero padding · 02eb533e
      Heiko Carstens 提交于
      commit 4f18d869ffd056c7858f3d617c71345cf19be008 upstream.
      
      The stfle inline assembly returns the number of double words written
      (condition code 0) or the double words it would have written
      (condition code 3), if the memory array it got as parameter would have
      been large enough.
      
      The current stfle implementation assumes that the array is always
      large enough and clears those parts of the array that have not been
      written to with a subsequent memset call.
      
      If however the array is not large enough memset will get a negative
      length parameter, which means that memset clears memory until it gets
      an exception and the kernel crashes.
      
      To fix this simply limit the maximum length. Move also the inline
      assembly to an extra function to avoid clobbering of register 0, which
      might happen because of the added min_t invocation together with code
      instrumentation.
      
      The bug was introduced with commit 14375bc4 ("[S390] cleanup
      facility list handling") but was rather harmless, since it would only
      write to a rather large array. It became a potential problem with
      commit 3ab121ab ("[S390] kernel: Add z/VM LGR detection"). Since
      then it writes to an array with only four double words, while some
      machines already deliver three double words. As soon as machines have
      a facility bit within the fifth double a crash on IPL would happen.
      
      Fixes: 14375bc4 ("[S390] cleanup facility list handling")
      Cc: <stable@vger.kernel.org> # v2.6.37+
      Reviewed-by: NVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02eb533e
  12. 14 7月, 2019 1 次提交
  13. 25 6月, 2019 2 次提交
  14. 19 6月, 2019 2 次提交
  15. 11 6月, 2019 1 次提交
  16. 09 6月, 2019 3 次提交
  17. 04 6月, 2019 1 次提交
  18. 31 5月, 2019 2 次提交
  19. 15 5月, 2019 1 次提交
    • J
      s390/speculation: Support 'mitigations=' cmdline option · 59a14fb5
      Josh Poimboeuf 提交于
      commit 0336e04a6520bdaefdb0769d2a70084fa52e81ed upstream
      
      Configure s390 runtime CPU speculation bug mitigations in accordance
      with the 'mitigations=' cmdline option.  This affects Spectre v1 and
      Spectre v2.
      
      The default behavior is unchanged.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86)
      Reviewed-by: NJiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-s390@vger.kernel.org
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-arch@vger.kernel.org
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Steven Price <steven.price@arm.com>
      Cc: Phil Auld <pauld@redhat.com>
      Link: https://lkml.kernel.org/r/e4a161805458a5ec88812aac0307ae3908a030fc.1555085500.git.jpoimboe@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      59a14fb5
  20. 04 5月, 2019 1 次提交
  21. 06 4月, 2019 1 次提交
  22. 24 3月, 2019 3 次提交
    • M
      s390/setup: fix boot crash for machine without EDAT-1 · 3053cb97
      Martin Schwidefsky 提交于
      commit 86a86804e4f18fc3880541b3d5a07f4df0fe29cb upstream.
      
      The fix to make WARN work in the early boot code created a problem
      on older machines without EDAT-1. The setup_lowcore_dat_on function
      uses the pointer from lowcore_ptr[0] to set the DAT bit in the new
      PSWs. That does not work if the kernel page table is set up with
      4K pages as the prefix address maps to absolute zero.
      
      To make this work the PSWs need to be changed with via address 0 in
      form of the S390_lowcore definition.
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Tested-by: NCornelia Huck <cohuck@redhat.com>
      Fixes: 94f85ed3e2f8 ("s390/setup: fix early warning messages")
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3053cb97
    • S
      KVM: Call kvm_arch_memslots_updated() before updating memslots · 23ad135a
      Sean Christopherson 提交于
      commit 152482580a1b0accb60676063a1ac57b2d12daf6 upstream.
      
      kvm_arch_memslots_updated() is at this point in time an x86-specific
      hook for handling MMIO generation wraparound.  x86 stashes 19 bits of
      the memslots generation number in its MMIO sptes in order to avoid
      full page fault walks for repeat faults on emulated MMIO addresses.
      Because only 19 bits are used, wrapping the MMIO generation number is
      possible, if unlikely.  kvm_arch_memslots_updated() alerts x86 that
      the generation has changed so that it can invalidate all MMIO sptes in
      case the effective MMIO generation has wrapped so as to avoid using a
      stale spte, e.g. a (very) old spte that was created with generation==0.
      
      Given that the purpose of kvm_arch_memslots_updated() is to prevent
      consuming stale entries, it needs to be called before the new generation
      is propagated to memslots.  Invalidating the MMIO sptes after updating
      memslots means that there is a window where a vCPU could dereference
      the new memslots generation, e.g. 0, and incorrectly reuse an old MMIO
      spte that was created with (pre-wrap) generation==0.
      
      Fixes: e59dbe09 ("KVM: Introduce kvm_arch_memslots_updated()")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      23ad135a
    • M
      s390/setup: fix early warning messages · b52bdf53
      Martin Schwidefsky 提交于
      commit 8727638426b0aea59d7f904ad8ddf483f9234f88 upstream.
      
      The setup_lowcore() function creates a new prefix page for the boot CPU.
      The PSW mask for the system_call, external interrupt, i/o interrupt and
      the program check handler have the DAT bit set in this new prefix page.
      
      At the time setup_lowcore is called the system still runs without virtual
      address translation, the paging_init() function creates the kernel page
      table and loads the CR13 with the kernel ASCE.
      
      Any code between setup_lowcore() and the end of paging_init() that has
      a BUG or WARN statement will create a program check that can not be
      handled correctly as there is no kernel page table yet.
      
      To allow early WARN statements initially setup the lowcore with DAT off
      and set the DAT bit only after paging_init() has completed.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b52bdf53
  23. 13 2月, 2019 1 次提交
    • H
      s390/zcrypt: improve special ap message cmd handling · 87ae7932
      Harald Freudenberger 提交于
      [ Upstream commit be534791011100d204602e2e0496e9e6ce8edf63 ]
      
      There exist very few ap messages which need to have the 'special' flag
      enabled. This flag tells the firmware layer to do some pre- and maybe
      postprocessing. However, it may happen that this special flag is
      enabled but the firmware is unable to deal with this kind of message
      and thus returns with reply code 0x41. For example older firmware may
      not know the newest messages triggered by the zcrypt device driver and
      thus react with reject and the named reply code. Unfortunately this
      reply code is not known to the zcrypt error routines and thus default
      behavior is to switch the ap queue offline.
      
      This patch now makes the ap error routine aware of the reply code and
      so userspace is informed about the bad processing result but the queue
      is not switched to offline state any more.
      Signed-off-by: NHarald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      87ae7932
  24. 31 1月, 2019 4 次提交
    • D
      s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU · 48046a01
      David Hildenbrand 提交于
      commit 60f1bf29c0b2519989927cae640cd1f50f59dc7f upstream.
      
      When calling smp_call_ipl_cpu() from the IPL CPU, we will try to read
      from pcpu_devices->lowcore. However, due to prefixing, that will result
      in reading from absolute address 0 on that CPU. We have to go via the
      actual lowcore instead.
      
      This means that right now, we will read lc->nodat_stack == 0 and
      therfore work on a very wrong stack.
      
      This BUG essentially broke rebooting under QEMU TCG (which will report
      a low address protection exception). And checking under KVM, it is
      also broken under KVM. With 1 VCPU it can be easily triggered.
      
      :/# echo 1 > /proc/sys/kernel/sysrq
      :/# echo b > /proc/sysrq-trigger
      [   28.476745] sysrq: SysRq : Resetting
      [   28.476793] Kernel stack overflow.
      [   28.476817] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
      [   28.476820] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
      [   28.476826] Krnl PSW : 0400c00180000000 0000000000115c0c (pcpu_delegate+0x12c/0x140)
      [   28.476861]            R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
      [   28.476863] Krnl GPRS: ffffffffffffffff 0000000000000000 000000000010dff8 0000000000000000
      [   28.476864]            0000000000000000 0000000000000000 0000000000ab7090 000003e0006efbf0
      [   28.476864]            000000000010dff8 0000000000000000 0000000000000000 0000000000000000
      [   28.476865]            000000007fffc000 0000000000730408 000003e0006efc58 0000000000000000
      [   28.476887] Krnl Code: 0000000000115bfe: 4170f000            la      %r7,0(%r15)
      [   28.476887]            0000000000115c02: 41f0a000            la      %r15,0(%r10)
      [   28.476887]           #0000000000115c06: e370f0980024        stg     %r7,152(%r15)
      [   28.476887]           >0000000000115c0c: c0e5fffff86e        brasl   %r14,114ce8
      [   28.476887]            0000000000115c12: 41f07000            la      %r15,0(%r7)
      [   28.476887]            0000000000115c16: a7f4ffa8            brc     15,115b66
      [   28.476887]            0000000000115c1a: 0707                bcr     0,%r7
      [   28.476887]            0000000000115c1c: 0707                bcr     0,%r7
      [   28.476901] Call Trace:
      [   28.476902] Last Breaking-Event-Address:
      [   28.476920]  [<0000000000a01c4a>] arch_call_rest_init+0x22/0x80
      [   28.476927] Kernel panic - not syncing: Corrupt kernel stack, can't continue.
      [   28.476930] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
      [   28.476932] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
      [   28.476932] Call Trace:
      
      Fixes: 2f859d0d ("s390/smp: reduce size of struct pcpu")
      Cc: stable@vger.kernel.org # 4.0+
      Reported-by: NCornelia Huck <cohuck@redhat.com>
      Signed-off-by: NDavid Hildenbrand <david@redhat.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      48046a01
    • G
      s390/smp: fix CPU hotplug deadlock with CPU rescan · 049c7b06
      Gerald Schaefer 提交于
      commit b7cb707c373094ce4008d4a6ac9b6b366ec52da5 upstream.
      
      smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
      to a dedlock when a new CPU is found and immediately set online by a udev
      rule.
      
      This was observed on an older kernel version, where the cpu_hotplug_begin()
      loop was still present, and it resulted in hanging chcpu and systemd-udev
      processes. This specific deadlock will not show on current kernels. However,
      there may be other possible deadlocks, and since smp_rescan_cpus() can still
      trigger a CPU hotplug operation, the device_hotplug_lock should be held.
      
      For reference, this was the deadlock with the old cpu_hotplug_begin() loop:
      
              chcpu (rescan)                       systemd-udevd
      
       echo 1 > /sys/../rescan
       -> smp_rescan_cpus()
       -> (*) get_online_cpus()
          (increases refcount)
       -> smp_add_present_cpu()
          (new CPU found)
       -> register_cpu()
       -> device_add()
       -> udev "add" event triggered -----------> udev rule sets CPU online
                                               -> echo 1 > /sys/.../online
                                               -> lock_device_hotplug_sysfs()
                                                  (this is missing in rescan path)
                                               -> device_online()
                                               -> (**) device_lock(new CPU dev)
                                               -> cpu_up()
                                               -> cpu_hotplug_begin()
                                                  (loops until refcount == 0)
                                                  -> deadlock with (*)
       -> bus_probe_device()
       -> device_attach()
       -> device_lock(new CPU dev)
          -> deadlock with (**)
      
      Fix this by taking the device_hotplug_lock in the CPU rescan path.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      049c7b06
    • C
      s390/early: improve machine detection · e0d573a0
      Christian Borntraeger 提交于
      commit 03aa047ef2db4985e444af6ee1c1dd084ad9fb4c upstream.
      
      Right now the early machine detection code check stsi 3.2.2 for "KVM"
      and set MACHINE_IS_VM if this is different. As the console detection
      uses diagnose 8 if MACHINE_IS_VM returns true this will crash Linux
      early for any non z/VM system that sets a different value than KVM.
      So instead of assuming z/VM, do not set any of MACHINE_IS_LPAR,
      MACHINE_IS_VM, or MACHINE_IS_KVM.
      
      CC: stable@vger.kernel.org
      Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0d573a0
    • M
      s390/mm: always force a load of the primary ASCE on context switch · b5637644
      Martin Schwidefsky 提交于
      commit a38662084c8bdb829ff486468c7ea801c13fcc34 upstream.
      
      The ASCE of an mm_struct can be modified after a task has been created,
      e.g. via crst_table_downgrade for a compat process. The active_mm logic
      to avoid the switch_mm call if the next task is a kernel thread can
      lead to a situation where switch_mm is called where 'prev == next' is
      true but 'prev->context.asce == next->context.asce' is not.
      
      This can lead to a situation where a CPU uses the outdated ASCE to run
      a task. The result can be a crash, endless loops and really subtle
      problem due to TLBs being created with an invalid ASCE.
      
      Cc: stable@kernel.org # v3.15+
      Fixes: 53e857f3 ("s390/mm,tlb: race of lazy TLB flush vs. recreation")
      Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5637644