1. 10 9月, 2017 1 次提交
    • A
      sparc64: speed up etrap/rtrap on NG2 and later processors · a7159a87
      Anthony Yznaga 提交于
      For many sun4v processor types, reading or writing a privileged register
      has a latency of 40 to 70 cycles.  Use a combination of the low-latency
      allclean, otherw, normalw, and nop instructions in etrap and rtrap to
      replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap
      performance.  allclean, otherw, and normalw are available on NG2 and
      later processors.
      
      The average ticks to execute the flush windows trap ("ta 0x3") with and
      without this patch on select platforms:
      
       CPU            Not patched     Patched    % Latency Reduction
      
       NG2            1762            1558            -11.58
       NG4            3619            3204            -11.47
       M7             3015            2624            -12.97
       SPARC64-X      829             770              -7.12
      Signed-off-by: NAnthony Yznaga <anthony.yznaga@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7159a87
  2. 09 9月, 2017 1 次提交
    • M
      vga: optimise console scrolling · ac036f95
      Matthew Wilcox 提交于
      Where possible, call memset16(), memmove() or memcpy() instead of using
      open-coded loops.  I don't like the calling convention that uses a byte
      count instead of a count of u16s, but it's a little late to change that.
      Reduces code size of fbcon.o by almost 400 bytes on my laptop build.
      
      [akpm@linux-foundation.org: fix build]
      Link: http://lkml.kernel.org/r/20170720184539.31609-9-willy@infradead.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ac036f95
  3. 26 8月, 2017 1 次提交
    • J
      futex: Remove duplicated code and fix undefined behaviour · 30d6e0a4
      Jiri Slaby 提交于
      There is code duplicated over all architecture's headers for
      futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
      and comparison of the result.
      
      Remove this duplication and leave up to the arches only the needed
      assembly which is now in arch_futex_atomic_op_inuser.
      
      This effectively distributes the Will Deacon's arm64 fix for undefined
      behaviour reported by UBSAN to all architectures. The fix was done in
      commit 5f16a046 (arm64: futex: Fix undefined behaviour with
      FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
      
      And as suggested by Thomas, check for negative oparg too, because it was
      also reported to cause undefined behaviour report.
      
      Note that s390 removed access_ok check in d12a2970 ("s390/uaccess:
      remove pointless access_ok() checks") as access_ok there returns true.
      We introduce it back to the helper for the sake of simplicity (it gets
      optimized away anyway).
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Reviewed-by: NDarren Hart (VMware) <dvhart@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
      Cc: linux-mips@linux-mips.org
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: sparclinux@vger.kernel.org
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-hexagon@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-xtensa@linux-xtensa.org
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: openrisc@lists.librecores.org
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-parisc@vger.kernel.org
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
      30d6e0a4
  4. 17 8月, 2017 1 次提交
  5. 16 8月, 2017 3 次提交
  6. 11 8月, 2017 2 次提交
  7. 10 8月, 2017 5 次提交
  8. 05 8月, 2017 2 次提交
  9. 19 7月, 2017 1 次提交
    • R
      sparc64: Prevent perf from running during super critical sections · fc290a11
      Rob Gardner 提交于
      This fixes another cause of random segfaults and bus errors that may
      occur while running perf with the callgraph option.
      
      Critical sections beginning with spin_lock_irqsave() raise the interrupt
      level to PIL_NORMAL_MAX (14) and intentionally do not block performance
      counter interrupts, which arrive at PIL_NMI (15).
      
      But some sections of code are "super critical" with respect to perf
      because the perf_callchain_user() path accesses user space and may cause
      TLB activity as well as faults as it unwinds the user stack.
      
      One particular critical section occurs in switch_mm:
      
              spin_lock_irqsave(&mm->context.lock, flags);
              ...
              load_secondary_context(mm);
              tsb_context_switch(mm);
              ...
              spin_unlock_irqrestore(&mm->context.lock, flags);
      
      If a perf interrupt arrives in between load_secondary_context() and
      tsb_context_switch(), then perf_callchain_user() could execute with
      the context ID of one process, but with an active TSB for a different
      process. When the user stack is accessed, it is very likely to
      incur a TLB miss, since the h/w context ID has been changed. The TLB
      will then be reloaded with a translation from the TSB for one process,
      but using a context ID for another process. This exposes memory from
      one process to another, and since it is a mapping for stack memory,
      this usually causes the new process to crash quickly.
      
      This super critical section needs more protection than is provided
      by spin_lock_irqsave() since perf interrupts must not be allowed in.
      
      Since __tsb_context_switch already goes through the trouble of
      disabling interrupts completely, we fix this by moving the secondary
      context load down into this better protected region.
      
      Orabug: 25577560
      Signed-off-by: NDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: NRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc290a11
  10. 15 7月, 2017 1 次提交
    • J
      sparc64: Measure receiver forward progress to avoid send mondo timeout · 9d53caec
      Jane Chu 提交于
      A large sun4v SPARC system may have moments of intensive xcall activities,
      usually caused by unmapping many pages on many CPUs concurrently. This can
      flood receivers with CPU mondo interrupts for an extended period, causing
      some unlucky senders to hit send-mondo timeout. This problem gets worse
      as cpu count increases because sometimes mappings must be invalidated on
      all CPUs, and sometimes all CPUs may gang up on a single CPU.
      
      But a busy system is not a broken system. In the above scenario, as long
      as the receiver is making forward progress processing mondo interrupts,
      the sender should continue to retry.
      
      This patch implements the receiver's forward progress meter by introducing
      a per cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range
      of 0..NR_CPUS. The receiver increments its counter as soon as it receives
      a mondo and the sender tracks the receiver's counter. If the receiver has
      stopped making forward progress when the retry limit is reached, the sender
      declares send-mondo-timeout and panic; otherwise, the receiver is allowed
      to keep making forward progress.
      
      In addition, it's been observed that PCIe hotplug events generate Correctable
      Errors that are handled by hypervisor and then OS. Hypervisor 'borrows'
      a guest cpu strand briefly to provide the service. If the cpu strand is
      simultaneously the only cpu targeted by a mondo, it may not be available
      for the mondo in 20msec, causing SUN4V mondo timeout. It appears that 1 second
      is the agreed wait time between hypervisor and guest OS, this patch makes
      the adjustment.
      
      Orabug: 25476541
      Orabug: 26417466
      Signed-off-by: NJane Chu <jane.chu@oracle.com>
      Reviewed-by: NSteve Sistare <steven.sistare@oracle.com>
      Reviewed-by: NAnthony Yznaga <anthony.yznaga@oracle.com>
      Reviewed-by: NRob Gardner <rob.gardner@oracle.com>
      Reviewed-by: NThomas Tai <thomas.tai@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9d53caec
  11. 13 7月, 2017 1 次提交
  12. 11 7月, 2017 1 次提交
  13. 04 7月, 2017 1 次提交
  14. 29 6月, 2017 1 次提交
  15. 28 6月, 2017 3 次提交
  16. 26 6月, 2017 5 次提交
  17. 20 6月, 2017 1 次提交
  18. 13 6月, 2017 4 次提交
  19. 11 6月, 2017 1 次提交
  20. 07 6月, 2017 4 次提交