1. 16 Apr, 2018 1 commit
  2. 11 Apr, 2018 1 commit
    • s390: correct nospec auto detection init order · 6a3d1e81
      Authored by Martin Schwidefsky
      With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via
      early_initcall is done *after* the early_param functions. This
      overwrites any settings made with the nobp/nospectre_v2/spectre_v2
      parameters. The code patching for the kernel image is done after the
      evaluation of the early parameters but before the early_initcall
      is processed. The end result is a kernel image that is patched
      correctly, but the kernel modules are not.
      
      Make sure that the nospec auto detection function is called before the
      early parameters are evaluated and before the code patching is done.
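
      A minimal sketch of the intended call order (the function and hook
      names here are illustrative assumptions; only the ordering is taken
      from the description above):

      	/* sketch: auto detection first, then parameters, then patching */
      	void __init early_setup(void)
      	{
      		nospec_auto_detect();	/* choose the spectre defense defaults */
      		parse_early_param();	/* nobp/nospectre_v2/spectre_v2 override */
      		nospec_patch_kernel();	/* patch the kernel image, modules follow */
      	}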
      
      Fixes: 6e179d64 ("s390: add automatic detection of the spectre defense")
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  3. 19 Mar, 2018 1 commit
  4. 15 Mar, 2018 1 commit
  5. 27 Feb, 2018 1 commit
  6. 07 Feb, 2018 1 commit
    • s390: introduce execute-trampolines for branches · f19fbd5e
      Authored by Martin Schwidefsky
      Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and
      -mfunction-return= compiler options to create a kernel fortified against
      the spectre v2 attack.
      
      With CONFIG_EXPOLINE=y all indirect branches will be issued with an
      execute type instruction. For z10 or newer the EXRL instruction will
      be used, for older machines the EX instruction. The typical indirect
      call
      
      	basr	%r14,%r1
      
      is replaced with a PC relative call to a new thunk
      
      	brasl	%r14,__s390x_indirect_jump_r1
      
      The thunk contains the EXRL/EX instruction to the indirect branch
      
      __s390x_indirect_jump_r1:
      	exrl	0,0f
      	j	.
      0:	br	%r1
      
      The detour via the execute type instruction has a performance impact.
      To get rid of the detour the new kernel parameters "nospectre_v2" and
      "spectre_v2=[on,off,auto]" can be used. If one of them is specified,
      the kernel and module code will be patched at runtime.
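
      For illustration, the parameters described above would be used on the
      kernel command line like this (a hedged example, not an excerpt from
      the patch):

      	spectre_v2=off   # patch out the expoline thunks at runtime
      	spectre_v2=on    # keep the expoline thunks on all machines
      	nospectre_v2     # shorthand with the same effect as spectre_v2=off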
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  7. 05 Feb, 2018 1 commit
  8. 24 Nov, 2017 1 commit
    • s390: kernel: add SPDX identifiers to the remaining files · a17ae4c3
      Authored by Greg Kroah-Hartman
      It's good to have SPDX identifiers in all files to make it easier to
      audit the kernel tree for correct licenses.
      
      Update the arch/s390/kernel/ files with the correct SPDX license
      identifier based on the license text in the file itself.  The SPDX
      identifier is a legally binding shorthand, which can be used instead of
      the full boilerplate text.
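
      For example, a GPL-2.0 licensed C source file carries a single comment
      line at the top of the file (GPL-2.0 is shown for illustration; the tag
      in each file follows that file's own license text):

      	// SPDX-License-Identifier: GPL-2.0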
      
      This work is based on a script and data from Thomas Gleixner, Philippe
      Ombredanne, and Kate Stewart.
      
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  9. 09 Nov, 2017 2 commits
  10. 19 Oct, 2017 1 commit
  11. 18 Oct, 2017 2 commits
    • s390/vdso: move boot_vdso_data to vdso.c · 608796ff
      Authored by Martin Schwidefsky
      The boot_vdso_data variable is related to the vdso code; the magic of
      the initial vdso area for early boot and its replacement in vdso_init
      should all be put into vdso.c.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390: introduce CPU alternatives · 686140a1
      Authored by Vasily Gorbik
      Implement CPU alternatives, which allow newer instructions to be
      patched in at runtime, based on the availability of CPU facilities.
      
      A new kernel boot parameter "noaltinstr" disables patching.
      
      The current implementation is derived from the x86 alternatives,
      although ideal instruction padding (when altinstr is longer than
      oldinstr) is added at compile time, so no oldinstr nop optimization
      has to be done at runtime. A couple of compile time sanity checks
      are also done:
      1. oldinstr and altinstr must be <= 254 bytes long,
      2. oldinstr and altinstr must not have an odd length.
      
      alternative(oldinstr, altinstr, facility);
      alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2);
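
      A hedged usage sketch (the instruction string and the facility number
      below are illustrative only, not taken from this patch):

      	/* run the newer instruction only if facility 82 is installed,
      	 * otherwise execute the nop that was inserted at compile time */
      	alternative("nop", ".insn rrf,0xb2e80000,0,0,12,0", 82);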
      
      Both compile time and runtime padding consist of either 6/4/2 byte
      nops or a jump (brcl) plus a 2 byte nop filler if the padding is
      longer than 6 bytes.
      
      .altinstructions and .altinstr_replacement sections are part of
      __init_begin : __init_end region and are freed after initialization.
      Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  12. 28 Sep, 2017 1 commit
    • s390/spinlock: introduce spinlock wait queueing · b96f7d88
      Authored by Martin Schwidefsky
      The queued spinlock code for s390 follows the principles of the common
      code qspinlock implementation but with a few notable differences.
      
      The format of the spinlock_t locking word differs, s390 needs to store
      the logical CPU number of the lock holder in the spinlock_t to be able
      to use the diagnose 9c directed yield hypervisor call.
      
      The inline code sequences for spin_lock and spin_unlock are nice and
      short. The inline portion of a spin_lock now typically looks like this:
      
      	lhi	%r0,0			# 0 indicates an empty lock
      	l	%r1,0x3a0		# CPU number + 1 from lowcore
      	cs	%r0,%r1,<some_lock>	# lock operation
      	jnz	call_wait		# on failure call wait function
      locked:
      	...
      call_wait:
      	la	%r2,<some_lock>
      	brasl	%r14,arch_spin_lock_wait
      	j	locked
      
      A spin_unlock is as simple as before:
      
      	lhi	%r0,0
      	sth	%r0,2(%r2)		# unlock operation
      
      After a CPU has queued itself it may not enable interrupts again for the
      arch_spin_lock_flags() variant. The arch_spin_lock_wait_flags wait function
      is removed.
      
      To improve performance the code implements opportunistic lock stealing.
      If the wait function finds a spinlock_t that indicates that the lock is
      free but there are queued waiters, the CPU may steal the lock up to three
      times without queueing itself. The lock stealing code updates a steal
      counter in the lock word to prevent more than three steals. The counter
      is reset at the time the CPU next in the queue successfully takes the lock.
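
      A minimal C sketch of the steal limit (the lock word layout and the
      helper name are assumptions made for illustration):

      	#define MAX_STEALS	3

      	/* assumed layout: steal count in the top nibble, owner CPU below */
      	static inline int try_steal_lock(int *lock, int old, int cpu_nr)
      	{
      		int count = (old >> 28) & 0xf;

      		if (count >= MAX_STEALS)
      			return 0;	/* steal budget used up, must queue */
      		/* take the lock and bump the counter in one compare-and-swap */
      		return __atomic_compare_exchange_n(lock, &old,
      				((count + 1) << 28) | cpu_nr, 0,
      				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
      	}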
      
      While the queued spinlocks improve performance in a system with dedicated
      CPUs, in a virtualized environment with continuously overcommitted CPUs
      the queued spinlocks can have a negative effect on performance. This
      is due to the fact that a queued CPU that is preempted by the hypervisor
      will block the queue at some point even without holding the lock. With
      the classic spinlock it does not matter if a CPU is preempted that waits
      for the lock. Therefore use the queued spinlock code only if the system
      runs with dedicated CPUs and fall back to classic spinlocks when running
      with shared CPUs.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  13. 09 Aug, 2017 1 commit
    • s390/vmcp: make use of contiguous memory allocator · 3f429842
      Authored by Heiko Carstens
      If memory is fragmented it is unlikely that large order memory
      allocations succeed. This has been an issue with the vmcp device
      driver for a long time, since it requires large physically contiguous
      memory areas for large responses.
      
      To hopefully resolve this issue make use of the contiguous memory
      allocator (cma). This patch adds a vmcp specific cma area with a
      default size of 4MB. The size can be changed either via the
      VMCP_CMA_SIZE config option at compile time or with the "vmcp_cma"
      kernel parameter (e.g. "vmcp_cma=16m").
      
      For any vmcp response buffer larger than 16k, memory will be
      allocated from the cma area. If such an allocation fails, there is a
      fallback to the buddy allocator.
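
      A hedged sketch of that strategy (names are simplified and the
      cma_alloc() signature differs between kernel versions):

      	static void *vmcp_response_alloc(size_t size)
      	{
      		struct page *page = NULL;
      		int order = get_order(size);

      		/* use the vmcp cma area only for responses larger than 16k */
      		if (size > 4 * PAGE_SIZE)
      			page = cma_alloc(vmcp_cma, 1 << order, 0, GFP_KERNEL);
      		if (page)
      			return page_address(page);
      		/* fall back to the buddy allocator */
      		return (void *)__get_free_pages(GFP_KERNEL | __GFP_NOWARN, order);
      	}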
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  14. 26 Jul, 2017 3 commits
  15. 13 Jul, 2017 1 commit
  16. 22 Mar, 2017 1 commit
    • s390: add a system call for guarded storage · 916cda1a
      Authored by Martin Schwidefsky
      This adds a new system call to enable the use of guarded storage for
      user space processes. The system call takes two arguments, a command
      and a pointer to a guarded storage control block:
      
          s390_guarded_storage(int command, struct gs_cb *gs_cb);
      
      The second argument is relevant only for the GS_SET_BC_CB command.
      
      The commands in detail (a usage sketch follows the list):
      
      0 - GS_ENABLE
          Enable the guarded storage facility for the current task. The
          initial content of the guarded storage control block will be
          all zeros. After the enablement the user space code can use the
          load-guarded-storage-controls instruction (LGSC) to load an
          arbitrary control block. While a task is enabled the kernel
          will save and restore the current content of the guarded
          storage registers on context switch.
      1 - GS_DISABLE
          Disables the use of the guarded storage facility for the current
          task. The kernel will cease to save and restore the content of
          the guarded storage registers; the task specific content of
          these registers is lost.
      2 - GS_SET_BC_CB
          Set a broadcast guarded storage control block. This is called
          per thread and stores a specific guarded storage control block
          in the task struct of the current task. This control block will
          be used for the broadcast event GS_BROADCAST.
      3 - GS_CLEAR_BC_CB
          Clears the broadcast guarded storage control block. The guarded
          storage control block established by GS_SET_BC_CB is removed
          from the task struct.
      4 - GS_BROADCAST
          Sends a broadcast to all thread siblings of the current task.
          Every sibling that has established a broadcast guarded storage
          control block will load this control block and will be enabled
          for guarded storage. The broadcast guarded storage control block
          is used up; a second broadcast without a refresh of the stored
          control block with GS_SET_BC_CB will not have any effect.
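
      A hedged user space sketch of the GS_ENABLE command (the syscall
      number 378 matches the s390x tables of that era but should be taken
      from <asm/unistd.h>; the command constant is defined by hand here):

      	#include <sys/syscall.h>
      	#include <unistd.h>

      	#define GS_ENABLE	0	/* command 0 from the list above */

      	int enable_guarded_storage(void)
      	{
      		/* the control block argument is not used by GS_ENABLE */
      		return syscall(378, GS_ENABLE, (void *)0);
      	}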
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  17. 02 Mar, 2017 2 commits
  18. 17 Feb, 2017 1 commit
  19. 08 Feb, 2017 2 commits
  20. 07 Feb, 2017 1 commit
  21. 16 Jan, 2017 1 commit
  22. 14 Dec, 2016 1 commit
  23. 07 Dec, 2016 2 commits
    • s390/numa: establish cpu to node mapping early · 8c910580
      Authored by Heiko Carstens
      Initialize the cpu topology and therefore also the cpu to node mapping
      much earlier. This fixes the following warning and subsequent crashes
      when using the fake numa emulation mode on s390:
      
      WARNING: CPU: 0 PID: 1 at include/linux/cpumask.h:121 select_task_rq+0xe6/0x1a8
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00001-ge9d867a6-dirty #28
      task: 00000001dd270008 ti: 00000001eccb4000 task.ti: 00000001eccb4000
      Krnl PSW : 0404c00180000000 0000000000176c56 (select_task_rq+0xe6/0x1a8)
                 R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
      Call Trace:
      ([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
      ([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
      ([<000000000015d46c>] create_worker+0x174/0x1c0)
      ([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
      ([<0000000000162550>] apply_wqattrs_prepare+0x200/0x2a0)
      ([<000000000016266a>] apply_workqueue_attrs_locked+0x7a/0xb0)
      ([<0000000000162af0>] apply_workqueue_attrs+0x50/0x78)
      ([<000000000016441c>] __alloc_workqueue_key+0x304/0x520)
      ([<0000000000ee3706>] default_bdi_init+0x3e/0x70)
      ([<0000000000100270>] do_one_initcall+0x140/0x1d8)
      ([<0000000000ec9da8>] kernel_init_freeable+0x220/0x2d8)
      ([<0000000000984a7a>] kernel_init+0x2a/0x150)
      ([<00000000009913fa>] kernel_thread_starter+0x6/0xc)
      ([<00000000009913f4>] kernel_thread_starter+0x0/0xc)
      Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/smp: initialize cpu_present_mask in setup_arch · af51160e
      Authored by Heiko Carstens
      In order to be able to set up the cpu to node mappings early it is a
      prerequisite to know which cpus are present. Therefore cpus must be
      detected much earlier than before.
      
      For sclp based cpu detection this requires yet another early sclp
      call, since the system is not yet ready to use regular interrupts
      and memory allocations.
      Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  24. 02 Dec, 2016 2 commits
    • s390/setup: fix memblock usage · db7ad636
      Authored by Heiko Carstens
      When converting from bootmem to memblock I missed a subtle difference:
      the memblock_alloc() functions return uninitialized memory, while the
      memblock_virt_alloc() functions return zeroed memory.
      
      This led to quite random early boot crashes.
      
      Therefore use the correct version everywhere now.
      Hopefully.
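
      In the 4.9-era API the difference looks like this (a hedged
      illustration, not a quote from the patch):

      	phys_addr_t pa = memblock_alloc(size, align);  /* uninitialized */
      	void *va = memblock_virt_alloc(size, align);   /* zeroed */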
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kexec: use node 0 when re-adding crash kernel memory · 9f88eb4d
      Authored by Heiko Carstens
      When re-adding crash kernel memory within setup_resources() the
      function memblock_add() is used. That function will add memory by
      default to node "MAX_NUMNODES" instead of node 0, like the memory
      detection code does. In the !NUMA case this triggers the following
      warning when the kernel generates the vmemmap:
      
      Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
      WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16
      Call Trace:
       [<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8
       [<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50
       [<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0
       [<0000000000840136>] sparse_mem_map_populate+0x46/0x68
       [<0000000000d0c59c>] sparse_init+0x184/0x238
       [<0000000000cf45f6>] paging_init+0xbe/0xf8
       [<0000000000cf1d4a>] setup_arch+0xa02/0xae0
       [<0000000000ced75a>] start_kernel+0x72/0x450
       [<0000000000100020>] _stext+0x20/0x80
      
      If NUMA is selected numa_setup_memory() will fix the node assignments
      before the vmemmap is populated, so this warning only appears if NUMA
      is not selected.
      
      To fix this simply use memblock_add_node() and re-add crash kernel
      memory explicitly to node 0.
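
      A minimal sketch of that fix (crashk_res names the crash kernel
      resource; treat this as an illustration rather than the exact diff):

      	/* re-add the crash kernel range explicitly to node 0 */
      	memblock_add_node(crashk_res.start, resource_size(&crashk_res), 0);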
      Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Fixes: 4e042af4 ("s390/kexec: fix crash on resize of reserved memory")
      Cc: <stable@vger.kernel.org> # v4.8+
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  25. 29 Nov, 2016 1 commit
  26. 23 Nov, 2016 1 commit
  27. 11 Nov, 2016 2 commits
  28. 27 Aug, 2016 1 commit
  29. 13 Jul, 2016 1 commit
  30. 13 Jun, 2016 2 commits