1. 10 9月, 2017 1 次提交
    • A
      sparc64: speed up etrap/rtrap on NG2 and later processors · a7159a87
      Anthony Yznaga 提交于
      For many sun4v processor types, reading or writing a privileged register
      has a latency of 40 to 70 cycles.  Use a combination of the low-latency
      allclean, otherw, normalw, and nop instructions in etrap and rtrap to
      replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap
      performance.  allclean, otherw, and normalw are available on NG2 and
      later processors.
      
      The average ticks to execute the flush windows trap ("ta 0x3") with and
      without this patch on select platforms:
      
       CPU            Not patched     Patched    % Latency Reduction
      
       NG2            1762            1558            -11.58
       NG4            3619            3204            -11.47
       M7             3015            2624            -12.97
       SPARC64-X      829             770              -7.12
      Signed-off-by: NAnthony Yznaga <anthony.yznaga@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7159a87
  2. 09 9月, 2017 1 次提交
    • M
      vga: optimise console scrolling · ac036f95
      Matthew Wilcox 提交于
      Where possible, call memset16(), memmove() or memcpy() instead of using
      open-coded loops.  I don't like the calling convention that uses a byte
      count instead of a count of u16s, but it's a little late to change that.
      Reduces code size of fbcon.o by almost 400 bytes on my laptop build.
      
      [akpm@linux-foundation.org: fix build]
      Link: http://lkml.kernel.org/r/20170720184539.31609-9-willy@infradead.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ac036f95
  3. 26 8月, 2017 1 次提交
    • J
      futex: Remove duplicated code and fix undefined behaviour · 30d6e0a4
      Jiri Slaby 提交于
      There is code duplicated over all architecture's headers for
      futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
      and comparison of the result.
      
      Remove this duplication and leave up to the arches only the needed
      assembly which is now in arch_futex_atomic_op_inuser.
      
      This effectively distributes the Will Deacon's arm64 fix for undefined
      behaviour reported by UBSAN to all architectures. The fix was done in
      commit 5f16a046 (arm64: futex: Fix undefined behaviour with
      FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
      
      And as suggested by Thomas, check for negative oparg too, because it was
      also reported to cause undefined behaviour report.
      
      Note that s390 removed access_ok check in d12a2970 ("s390/uaccess:
      remove pointless access_ok() checks") as access_ok there returns true.
      We introduce it back to the helper for the sake of simplicity (it gets
      optimized away anyway).
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Reviewed-by: NDarren Hart (VMware) <dvhart@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
      Cc: linux-mips@linux-mips.org
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: sparclinux@vger.kernel.org
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-hexagon@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-xtensa@linux-xtensa.org
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: openrisc@lists.librecores.org
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-parisc@vger.kernel.org
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
      30d6e0a4
  4. 17 8月, 2017 1 次提交
  5. 16 8月, 2017 3 次提交
  6. 11 8月, 2017 2 次提交
  7. 10 8月, 2017 5 次提交
  8. 05 8月, 2017 2 次提交
  9. 04 8月, 2017 1 次提交
  10. 19 7月, 2017 1 次提交
    • R
      sparc64: Prevent perf from running during super critical sections · fc290a11
      Rob Gardner 提交于
      This fixes another cause of random segfaults and bus errors that may
      occur while running perf with the callgraph option.
      
      Critical sections beginning with spin_lock_irqsave() raise the interrupt
      level to PIL_NORMAL_MAX (14) and intentionally do not block performance
      counter interrupts, which arrive at PIL_NMI (15).
      
      But some sections of code are "super critical" with respect to perf
      because the perf_callchain_user() path accesses user space and may cause
      TLB activity as well as faults as it unwinds the user stack.
      
      One particular critical section occurs in switch_mm:
      
              spin_lock_irqsave(&mm->context.lock, flags);
              ...
              load_secondary_context(mm);
              tsb_context_switch(mm);
              ...
              spin_unlock_irqrestore(&mm->context.lock, flags);
      
      If a perf interrupt arrives in between load_secondary_context() and
      tsb_context_switch(), then perf_callchain_user() could execute with
      the context ID of one process, but with an active TSB for a different
      process. When the user stack is accessed, it is very likely to
      incur a TLB miss, since the h/w context ID has been changed. The TLB
      will then be reloaded with a translation from the TSB for one process,
      but using a context ID for another process. This exposes memory from
      one process to another, and since it is a mapping for stack memory,
      this usually causes the new process to crash quickly.
      
      This super critical section needs more protection than is provided
      by spin_lock_irqsave() since perf interrupts must not be allowed in.
      
      Since __tsb_context_switch already goes through the trouble of
      disabling interrupts completely, we fix this by moving the secondary
      context load down into this better protected region.
      
      Orabug: 25577560
      Signed-off-by: NDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: NRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc290a11
  11. 17 7月, 2017 1 次提交
  12. 15 7月, 2017 1 次提交
    • J
      sparc64: Measure receiver forward progress to avoid send mondo timeout · 9d53caec
      Jane Chu 提交于
      A large sun4v SPARC system may have moments of intensive xcall activities,
      usually caused by unmapping many pages on many CPUs concurrently. This can
      flood receivers with CPU mondo interrupts for an extended period, causing
      some unlucky senders to hit send-mondo timeout. This problem gets worse
      as cpu count increases because sometimes mappings must be invalidated on
      all CPUs, and sometimes all CPUs may gang up on a single CPU.
      
      But a busy system is not a broken system. In the above scenario, as long
      as the receiver is making forward progress processing mondo interrupts,
      the sender should continue to retry.
      
      This patch implements the receiver's forward progress meter by introducing
      a per cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range
      of 0..NR_CPUS. The receiver increments its counter as soon as it receives
      a mondo and the sender tracks the receiver's counter. If the receiver has
      stopped making forward progress when the retry limit is reached, the sender
      declares send-mondo-timeout and panic; otherwise, the receiver is allowed
      to keep making forward progress.
      
      In addition, it's been observed that PCIe hotplug events generate Correctable
      Errors that are handled by hypervisor and then OS. Hypervisor 'borrows'
      a guest cpu strand briefly to provide the service. If the cpu strand is
      simultaneously the only cpu targeted by a mondo, it may not be available
      for the mondo in 20msec, causing SUN4V mondo timeout. It appears that 1 second
      is the agreed wait time between hypervisor and guest OS, this patch makes
      the adjustment.
      
      Orabug: 25476541
      Orabug: 26417466
      Signed-off-by: NJane Chu <jane.chu@oracle.com>
      Reviewed-by: NSteve Sistare <steven.sistare@oracle.com>
      Reviewed-by: NAnthony Yznaga <anthony.yznaga@oracle.com>
      Reviewed-by: NRob Gardner <rob.gardner@oracle.com>
      Reviewed-by: NThomas Tai <thomas.tai@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9d53caec
  13. 13 7月, 2017 1 次提交
  14. 11 7月, 2017 1 次提交
  15. 04 7月, 2017 1 次提交
  16. 29 6月, 2017 1 次提交
  17. 28 6月, 2017 3 次提交
  18. 26 6月, 2017 5 次提交
  19. 21 6月, 2017 1 次提交
    • D
      net: introduce SO_PEERGROUPS getsockopt · 28b5ba2a
      David Herrmann 提交于
      This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to
      retrieve the auxiliary groups of the remote peer. It is designed to
      naturally extend SO_PEERCRED. That is, the underlying data is from the
      same credentials. Regarding its syntax, it is based on SO_PEERSEC. That
      is, if the provided buffer is too small, ERANGE is returned and @optlen
      is updated. Otherwise, the information is copied, @optlen is set to the
      actual size, and 0 is returned.
      
      While SO_PEERCRED (and thus `struct ucred') already returns the primary
      group, it lacks the auxiliary group vector. However, nearly all access
      controls (including kernel side VFS and SYSVIPC, but also user-space
      polkit, DBus, ...) consider the entire set of groups, rather than just
      the primary group. But this is currently not possible with pure
      SO_PEERCRED. Instead, user-space has to work around this and query the
      system database for the auxiliary groups of a UID retrieved via
      SO_PEERCRED.
      
      Unfortunately, there is no race-free way to query the auxiliary groups
      of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space
      solution is to use getgrouplist(3p), which itself falls back to NSS and
      whatever is configured in nsswitch.conf(3). This effectively checks
      which groups we *would* assign to the user if it logged in *now*. On
      normal systems it is as easy as reading /etc/group, but with NSS it can
      resort to quering network databases (eg., LDAP), using IPC or network
      communication.
      
      Long story short: Whenever we want to use auxiliary groups for access
      checks on IPC, we need further IPC to talk to the user/group databases,
      rather than just relying on SO_PEERCRED and the incoming socket. This
      is unfortunate, and might even result in dead-locks if the database
      query uses the same IPC as the original request.
      
      So far, those recursions / dead-locks have been avoided by using
      primitive IPC for all crucial NSS modules. However, we want to avoid
      re-inventing the wheel for each NSS module that might be involved in
      user/group queries. Hence, we would preferably make DBus (and other IPC
      that supports access-management based on groups) work without resorting
      to the user/group database. This new SO_PEERGROUPS ioctl would allow us
      to make dbus-daemon work without ever calling into NSS.
      
      Cc: Michal Sekletar <msekleta@redhat.com>
      Cc: Simon McVittie <simon.mcvittie@collabora.co.uk>
      Reviewed-by: NTom Gundersen <teg@jklm.no>
      Signed-off-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      28b5ba2a
  20. 20 6月, 2017 1 次提交
  21. 13 6月, 2017 4 次提交
  22. 11 6月, 2017 1 次提交
  23. 09 6月, 2017 1 次提交
    • A
      tty: add TIOCGPTPEER ioctl · 54ebbfb1
      Aleksa Sarai 提交于
      When opening the slave end of a PTY, it is not possible for userspace to
      safely ensure that /dev/pts/$num is actually a slave (in cases where the
      mount namespace in which devpts was mounted is controlled by an
      untrusted process). In addition, there are several unresolvable
      race conditions if userspace were to attempt to detect attacks through
      stat(2) and other similar methods [in addition it is not clear how
      userspace could detect attacks involving FUSE].
      
      Resolve this by providing an interface for userpace to safely open the
      "peer" end of a PTY file descriptor by using the dentry cached by
      devpts. Since it is not possible to have an open master PTY without
      having its slave exposed in /dev/pts this interface is safe. This
      interface currently does not provide a way to get the master pty (since
      it is not clear whether such an interface is safe or even useful).
      
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Valentin Rothberg <vrothberg@suse.com>
      Signed-off-by: NAleksa Sarai <asarai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54ebbfb1