1. 28 9月, 2017 1 次提交
    • M
      s390/spinlock: introduce spinlock wait queueing · b96f7d88
      Martin Schwidefsky 提交于
      The queued spinlock code for s390 follows the principles of the common
      code qspinlock implementation but with a few notable differences.
      
      The format of the spinlock_t locking word differs, s390 needs to store
      the logical CPU number of the lock holder in the spinlock_t to be able
      to use the diagnose 9c directed yield hypervisor call.
      
      The inline code sequences for spin_lock and spin_unlock are nice and
      short. The inline portion of a spin_lock now typically looks like this:
      
      	lhi	%r0,0			# 0 indicates an empty lock
      	l	%r1,0x3a0		# CPU number + 1 from lowcore
      	cs	%r0,%r1,<some_lock>	# lock operation
      	jnz	call_wait		# on failure call wait function
      locked:
      	...
      call_wait:
      	la	%r2,<some_lock>
      	brasl	%r14,arch_spin_lock_wait
      	j	locked
      
      A spin_unlock is as simple as before:
      
      	lhi	%r0,0
      	sth	%r0,2(%r2)		# unlock operation
      
      After a CPU has queued itself it may not enable interrupts again for the
      arch_spin_lock_flags() variant. The arch_spin_lock_wait_flags wait function
      is removed.
      
      To improve performance the code implements opportunistic lock stealing.
      If the wait function finds a spinlock_t that indicates that the lock is
      free but there are queued waiters, the CPU may steal the lock up to three
      times without queueing itself. The lock stealing update the steal counter
      in the lock word to prevent more than 3 steals. The counter is reset at
      the time the CPU next in the queue successfully takes the lock.
      
      While the queued spinlocks improve performance in a system with dedicated
      CPUs, in a virtualized environment with continuously overcommitted CPUs
      the queued spinlocks can have a negative effect on performance. This
      is due to the fact that a queued CPU that is preempted by the hypervisor
      will block the queue at some point even without holding the lock. With
      the classic spinlock it does not matter if a CPU is preempted that waits
      for the lock. Therefore use the queued spinlock code only if the system
      runs with dedicated CPUs and fall back to classic spinlocks when running
      with shared CPUs.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      b96f7d88
  2. 09 8月, 2017 1 次提交
    • H
      s390/vmcp: make use of contiguous memory allocator · 3f429842
      Heiko Carstens 提交于
      If memory is fragmented it is unlikely that large order memory
      allocations succeed. This has been an issue with the vmcp device
      driver since a long time, since it requires large physical contiguous
      memory ares for large responses.
      
      To hopefully resolve this issue make use of the contiguous memory
      allocator (cma). This patch adds a vmcp specific vmcp cma area with a
      default size of 4MB. The size can be changed either via the
      VMCP_CMA_SIZE config option at compile time or with the "vmcp_cma"
      kernel parameter (e.g. "vmcp_cma=16m").
      
      For any vmcp response buffers larger than 16k memory from the cma area
      will be allocated. If such an allocation fails, there is a fallback to
      the buddy allocator.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      3f429842
  3. 26 7月, 2017 3 次提交
  4. 13 7月, 2017 1 次提交
  5. 22 3月, 2017 1 次提交
    • M
      s390: add a system call for guarded storage · 916cda1a
      Martin Schwidefsky 提交于
      This adds a new system call to enable the use of guarded storage for
      user space processes. The system call takes two arguments, a command
      and pointer to a guarded storage control block:
      
          s390_guarded_storage(int command, struct gs_cb *gs_cb);
      
      The second argument is relevant only for the GS_SET_BC_CB command.
      
      The commands in detail:
      
      0 - GS_ENABLE
          Enable the guarded storage facility for the current task. The
          initial content of the guarded storage control block will be
          all zeros. After the enablement the user space code can use
          load-guarded-storage-controls instruction (LGSC) to load an
          arbitrary control block. While a task is enabled the kernel
          will save and restore the current content of the guarded
          storage registers on context switch.
      1 - GS_DISABLE
          Disables the use of the guarded storage facility for the current
          task. The kernel will cease to save and restore the content of
          the guarded storage registers, the task specific content of
          these registers is lost.
      2 - GS_SET_BC_CB
          Set a broadcast guarded storage control block. This is called
          per thread and stores a specific guarded storage control block
          in the task struct of the current task. This control block will
          be used for the broadcast event GS_BROADCAST.
      3 - GS_CLEAR_BC_CB
          Clears the broadcast guarded storage control block. The guarded-
          storage control block is removed from the task struct that was
          established by GS_SET_BC_CB.
      4 - GS_BROADCAST
          Sends a broadcast to all thread siblings of the current task.
          Every sibling that has established a broadcast guarded storage
          control block will load this control block and will be enabled
          for guarded storage. The broadcast guarded storage control block
          is used up, a second broadcast without a refresh of the stored
          control block with GS_SET_BC_CB will not have any effect.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      916cda1a
  6. 02 3月, 2017 2 次提交
  7. 17 2月, 2017 1 次提交
  8. 08 2月, 2017 2 次提交
  9. 07 2月, 2017 1 次提交
  10. 16 1月, 2017 1 次提交
  11. 14 12月, 2016 1 次提交
  12. 07 12月, 2016 2 次提交
    • H
      s390/numa: establish cpu to node mapping early · 8c910580
      Heiko Carstens 提交于
      Initialize the cpu topology and therefore also the cpu to node mapping
      much earlier. Fixes this warning and subsequent crashes when using the
      fake numa emulation mode on s390:
      
      WARNING: CPU: 0 PID: 1 at include/linux/cpumask.h:121 select_task_rq+0xe6/0x1a8
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00001-ge9d867a6-dirty #28
      task: 00000001dd270008 ti: 00000001eccb4000 task.ti: 00000001eccb4000
      Krnl PSW : 0404c00180000000 0000000000176c56 (select_task_rq+0xe6/0x1a8)
                 R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
      Call Trace:
      ([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
      ([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
      ([<000000000015d46c>] create_worker+0x174/0x1c0)
      ([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
      ([<0000000000162550>] apply_wqattrs_prepare+0x200/0x2a0)
      ([<000000000016266a>] apply_workqueue_attrs_locked+0x7a/0xb0)
      ([<0000000000162af0>] apply_workqueue_attrs+0x50/0x78)
      ([<000000000016441c>] __alloc_workqueue_key+0x304/0x520)
      ([<0000000000ee3706>] default_bdi_init+0x3e/0x70)
      ([<0000000000100270>] do_one_initcall+0x140/0x1d8)
      ([<0000000000ec9da8>] kernel_init_freeable+0x220/0x2d8)
      ([<0000000000984a7a>] kernel_init+0x2a/0x150)
      ([<00000000009913fa>] kernel_thread_starter+0x6/0xc)
      ([<00000000009913f4>] kernel_thread_starter+0x0/0xc)
      Reviewed-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      8c910580
    • H
      s390/smp: initialize cpu_present_mask in setup_arch · af51160e
      Heiko Carstens 提交于
      In order to be able to setup the cpu to node mappings early it is a
      prerequisite to know which cpus are present. Therefore cpus must be
      detected much earlier than before.
      
      For sclp based cpu detection this requires yet another early sclp
      call, since the system is not ready to use the regular interrupt and
      memory allocations.
      Reviewed-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      af51160e
  13. 02 12月, 2016 2 次提交
    • H
      s390/setup: fix memblock usage · db7ad636
      Heiko Carstens 提交于
      When converting from bootmem to memblock I missed a subtle difference:
      the memblock_alloc() functions return uninitialized memory, while the
      memblock_virt_alloc() functions return zeroed memory.
      
      This led to quite random early boot crashes.
      
      Therefore use the correct version everywhere now.
      Hopefully.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      db7ad636
    • H
      s390/kexec: use node 0 when re-adding crash kernel memory · 9f88eb4d
      Heiko Carstens 提交于
      When re-adding crash kernel memory within setup_resources() the
      function memblock_add() is used. That function will add memory by
      default to node "MAX_NUMNODES" instead of node 0, like the memory
      detection code does. In case of !NUMA this will trigger this warning
      when the kernel generates the vmemmap:
      
      Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
      WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16
      Call Trace:
       [<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8
       [<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50
       [<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0
       [<0000000000840136>] sparse_mem_map_populate+0x46/0x68
       [<0000000000d0c59c>] sparse_init+0x184/0x238
       [<0000000000cf45f6>] paging_init+0xbe/0xf8
       [<0000000000cf1d4a>] setup_arch+0xa02/0xae0
       [<0000000000ced75a>] start_kernel+0x72/0x450
       [<0000000000100020>] _stext+0x20/0x80
      
      If NUMA is selected numa_setup_memory() will fix the node assignments
      before the vmemmap will be populated; so this warning will only appear
      if NUMA is not selected.
      
      To fix this simply use memblock_add_node() and re-add crash kernel
      memory explicitly to node 0.
      Reported-and-tested-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Fixes: 4e042af4 ("s390/kexec: fix crash on resize of reserved memory")
      Cc: <stable@vger.kernel.org> # v4.8+
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      9f88eb4d
  14. 29 11月, 2016 1 次提交
  15. 23 11月, 2016 1 次提交
  16. 11 11月, 2016 2 次提交
  17. 27 8月, 2016 1 次提交
  18. 13 7月, 2016 1 次提交
  19. 13 6月, 2016 3 次提交
  20. 21 4月, 2016 1 次提交
    • M
      s390/fpu: allocate 'struct fpu' with the task_struct · 3f6813b9
      Martin Schwidefsky 提交于
      Analog to git commit 0c8c0f03
      "x86/fpu, sched: Dynamically allocate 'struct fpu'"
      move the struct fpu to the end of the struct thread_struct,
      set CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and add the
      setup_task_size() function to calculate the correct size
      fo the task struct.
      
      For the performance_defconfig this increases the size of
      struct task_struct from 7424 bytes to 7936 bytes (MACHINE_HAS_VX==1)
      or 7552 bytes (MACHINE_HAS_VX==0). The dynamic allocation of the
      struct fpu is removed. The slab cache uses an 8KB block for the
      task struct in all cases, there is enough room for the struct fpu.
      For MACHINE_HAS_VX==1 each task now needs 512 bytes less memory.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      3f6813b9
  21. 10 3月, 2016 1 次提交
  22. 24 2月, 2016 1 次提交
  23. 30 1月, 2016 1 次提交
    • T
      arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM · 35d98e93
      Toshi Kani 提交于
      Set IORESOURCE_SYSTEM_RAM in flags of resource ranges with
      "System RAM", "Kernel code", "Kernel data", and "Kernel bss".
      
      Note that:
      
       - IORESOURCE_SYSRAM (i.e. modifier bit) is set in flags when
         IORESOURCE_MEM is already set. IORESOURCE_SYSTEM_RAM is defined
         as (IORESOURCE_MEM|IORESOURCE_SYSRAM).
      
       - Some archs do not set 'flags' for children nodes, such as
         "Kernel code".  This patch does not change 'flags' in this
         case.
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: sparclinux@vger.kernel.org
      Link: http://lkml.kernel.org/r/1453841853-11383-7-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      35d98e93
  24. 19 1月, 2016 1 次提交
  25. 11 1月, 2016 1 次提交
  26. 30 11月, 2015 1 次提交
  27. 27 11月, 2015 2 次提交
  28. 16 11月, 2015 1 次提交
  29. 19 8月, 2015 1 次提交
  30. 04 8月, 2015 1 次提交