1. 23 4月, 2015 1 次提交
    • C
      sparc64: Setup sysfs to mark LDOM sockets, cores and threads correctly · acc455cf
      chris hyser 提交于
      commit 5f4826a362405748bbf73957027b77993e61e1af
      Author: chris hyser <chris.hyser@oracle.com>
      Date:   Tue Apr 21 10:31:38 2015 -0400
      
          sparc64: Setup sysfs to mark LDOM sockets, cores and threads correctly
      
          The current sparc kernel has no representation for sockets though tools
          like lscpu can pull this from sysfs. This patch walks the machine
          description cache and socket hierarchy and marks sockets as well as cores
          and threads such that a representative sysfs is created by
          drivers/base/topology.c.
      
          Before this patch:
              $ lscpu
              Architecture:          sparc64
              CPU op-mode(s):        32-bit, 64-bit
              Byte Order:            Big Endian
              CPU(s):                1024
              On-line CPU(s) list:   0-1023
              Thread(s) per core:    8
              Core(s) per socket:    1     <--- wrong
              Socket(s):             128   <--- wrong
              NUMA node(s):          4
              NUMA node0 CPU(s):     0-255
              NUMA node1 CPU(s):     256-511
              NUMA node2 CPU(s):     512-767
              NUMA node3 CPU(s):     768-1023
      
              After this patch:
              $ lscpu
              Architecture:          sparc64
              CPU op-mode(s):        32-bit, 64-bit
              Byte Order:            Big Endian
              CPU(s):                1024
              On-line CPU(s) list:   0-1023
              Thread(s) per core:    8
              Core(s) per socket:    32
              Socket(s):             4
              NUMA node(s):          4
              NUMA node0 CPU(s):     0-255
              NUMA node1 CPU(s):     256-511
              NUMA node2 CPU(s):     512-767
              NUMA node3 CPU(s):     768-1023
      
          Most of this patch was done by Chris with updates by David.
      Signed-off-by: NChris Hyser <chris.hyser@oracle.com>
      Signed-off-by: NDavid Ahern <david.ahern@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      acc455cf
  2. 19 4月, 2015 2 次提交
  3. 17 4月, 2015 2 次提交
  4. 13 4月, 2015 2 次提交
  5. 09 4月, 2015 1 次提交
    • A
      jump_label: Allow asm/jump_label.h to be included in assembly · 55dd0df7
      Anton Blanchard 提交于
      Wrap asm/jump_label.h for all archs with #ifndef __ASSEMBLY__.
      Since these are kernel only headers, we don't need #ifdef
      __KERNEL__ so can simplify things a bit.
      
      If an architecture wants to use jump labels in assembly, it
      will still need to define a macro to create the __jump_table
      entries (see ARCH_STATIC_BRANCH in the powerpc asm/jump_label.h
      for an example).
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: benh@kernel.crashing.org
      Cc: catalin.marinas@arm.com
      Cc: davem@davemloft.net
      Cc: heiko.carstens@de.ibm.com
      Cc: jbaron@akamai.com
      Cc: linux@arm.linux.org.uk
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: liuj97@gmail.com
      Cc: mgorman@suse.de
      Cc: mmarek@suse.cz
      Cc: mpe@ellerman.id.au
      Cc: paulus@samba.org
      Cc: ralf@linux-mips.org
      Cc: rostedt@goodmis.org
      Cc: schwidefsky@de.ibm.com
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1428551492-21977-1-git-send-email-anton@samba.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      55dd0df7
  6. 20 3月, 2015 1 次提交
  7. 02 3月, 2015 2 次提交
  8. 13 2月, 2015 1 次提交
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3
  9. 12 2月, 2015 2 次提交
  10. 11 2月, 2015 1 次提交
  11. 06 2月, 2015 1 次提交
  12. 13 1月, 2015 6 次提交
  13. 14 12月, 2014 1 次提交
  14. 12 12月, 2014 4 次提交
    • D
      1678c2bd
    • D
      vio: create routines for inc,dec vio dring indexes · fe47c3c2
      Dwight Engen 提交于
      Both sunvdc and sunvnet implemented distinct functionality for incrementing
      and decrementing dring indexes. Create common functions for use by both
      from the sunvnet versions, which were chosen since they will still work
      correctly in case a non power of two ring size is used.
      Signed-off-by: NDwight Engen <dwight.engen@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe47c3c2
    • A
      arch: Add lightweight memory barriers dma_rmb() and dma_wmb() · 1077fa36
      Alexander Duyck 提交于
      There are a number of situations where the mandatory barriers rmb() and
      wmb() are used to order memory/memory operations in the device drivers
      and those barriers are much heavier than they actually need to be.  For
      example in the case of PowerPC wmb() calls the heavy-weight sync
      instruction when for coherent memory operations all that is really needed
      is an lsync or eieio instruction.
      
      This commit adds a coherent only version of the mandatory memory barriers
      rmb() and wmb().  In most cases this should result in the barrier being the
      same as the SMP barriers for the SMP case, however in some cases we use a
      barrier that is somewhere in between rmb() and smp_rmb().  For example on
      ARM the rmb barriers break down as follows:
      
        Barrier   Call     Explanation
        --------- -------- ----------------------------------
        rmb()     dsb()    Data synchronization barrier - system
        dma_rmb() dmb(osh) data memory barrier - outer sharable
        smp_rmb() dmb(ish) data memory barrier - inner sharable
      
      These new barriers are not as safe as the standard rmb() and wmb().
      Specifically they do not guarantee ordering between coherent and incoherent
      memories.  The primary use case for these would be to enforce ordering of
      reads and writes when accessing coherent memory that is shared between the
      CPU and a device.
      
      It may also be noted that there is no dma_mb().  Most architectures don't
      provide a good mechanism for performing a coherent only full barrier without
      resorting to the same mechanism used in mb().  As such there isn't much to
      be gained in trying to define such a function.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1077fa36
    • A
      arch: Cleanup read_barrier_depends() and comments · 8a449718
      Alexander Duyck 提交于
      This patch is meant to cleanup the handling of read_barrier_depends and
      smp_read_barrier_depends.  In multiple spots in the kernel headers
      read_barrier_depends is defined as "do {} while (0)", however we then go
      into the SMP vs non-SMP sections and have the SMP version reference
      read_barrier_depends, and the non-SMP define it as yet another empty
      do/while.
      
      With this commit I went through and cleaned out the duplicate definitions
      and reduced the number of definitions down to 2 per header.  In addition I
      moved the 50 line comments for the macro from the x86 and mips headers that
      defined it as an empty do/while to those that were actually defining the
      macro, alpha and blackfin.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8a449718
  15. 11 12月, 2014 2 次提交
  16. 09 12月, 2014 1 次提交
  17. 06 12月, 2014 1 次提交
    • A
      net: sock: allow eBPF programs to be attached to sockets · 89aa0758
      Alexei Starovoitov 提交于
      introduce new setsockopt() command:
      
      setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd))
      
      where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...)
      and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER
      
      setsockopt() calls bpf_prog_get() which increments refcnt of the program,
      so it doesn't get unloaded while socket is using the program.
      
      The same eBPF program can be attached to multiple sockets.
      
      User task exit automatically closes socket which calls sk_filter_uncharge()
      which decrements refcnt of eBPF program
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89aa0758
  18. 27 11月, 2014 1 次提交
  19. 19 11月, 2014 1 次提交
    • A
      sparc: io: remove duplicate relaxed accessors on sparc32 · 7c3969c3
      Arnd Bergmann 提交于
      Commit 1191ccb3 ("sparc: io: implement dummy relaxed accessor
      macros for writes") added the relaxed accessors (readl_relaxed etc) in
      a file that is shared between sparc32 and sparc64. However, the earlier
      e1039fb4 ("sparc32: introduce asm-generic/io.h") had already changed
      the sparc32 implementation to use asm-generic/io.h, which provides the
      same macros, resulting in lots of build errors.
      
      This moves the definitions from the shared sparc file into the
      sparc64-only file to fix the sparc32 build regression.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Fixes: 1191ccb3 ("sparc: io: implement dummy relaxed accessor macros for writes")
      7c3969c3
  20. 17 11月, 2014 1 次提交
  21. 12 11月, 2014 1 次提交
    • E
      net: introduce SO_INCOMING_CPU · 2c8c56e1
      Eric Dumazet 提交于
      Alternative to RPS/RFS is to use hardware support for multiple
      queues.
      
      Then split a set of million of sockets into worker threads, each
      one using epoll() to manage events on its own socket pool.
      
      Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
      know after accept() or connect() on which queue/cpu a socket is managed.
      
      We normally use one cpu per RX queue (IRQ smp_affinity being properly
      set), so remembering on socket structure which cpu delivered last packet
      is enough to solve the problem.
      
      After accept(), connect(), or even file descriptor passing around
      processes, applications can use :
      
       int cpu;
       socklen_t len = sizeof(cpu);
      
       getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
      
      And use this information to put the socket into the right silo
      for optimal performance, as all networking stack should run
      on the appropriate cpu, without need to send IPI (RPS/RFS).
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c8c56e1
  22. 08 11月, 2014 1 次提交
  23. 29 10月, 2014 1 次提交
  24. 25 10月, 2014 1 次提交
    • D
      sparc64: Fix register corruption in top-most kernel stack frame during boot. · ef3e035c
      David S. Miller 提交于
      Meelis Roos reported that kernels built with gcc-4.9 do not boot, we
      eventually narrowed this down to only impacting machines using
      UltraSPARC-III and derivitive cpus.
      
      The crash happens right when the first user process is spawned:
      
      [   54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
      [   54.451346]
      [   54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab7 #96
      [   54.666431] Call Trace:
      [   54.698453]  [0000000000762f8c] panic+0xb0/0x224
      [   54.759071]  [000000000045cf68] do_exit+0x948/0x960
      [   54.823123]  [000000000042cbc0] fault_in_user_windows+0xe0/0x100
      [   54.902036]  [0000000000404ad0] __handle_user_windows+0x0/0x10
      [   54.978662] Press Stop-A (L1-A) to return to the boot prom
      [   55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
      
      Further investigation showed that compiling only per_cpu_patch() with
      an older compiler fixes the boot.
      
      Detailed analysis showed that the function is not being miscompiled by
      gcc-4.9, but it is using a different register allocation ordering.
      
      With the gcc-4.9 compiled function, something during the code patching
      causes some of the %i* input registers to get corrupted.  Perhaps
      we have a TLB miss path into the firmware that is deep enough to
      cause a register window spill and subsequent restore when we get
      back from the TLB miss trap.
      
      Let's plug this up by doing two things:
      
      1) Stop using the firmware stack for client interface calls into
         the firmware.  Just use the kernel's stack.
      
      2) As soon as we can, call into a new function "start_early_boot()"
         to put a one-register-window buffer between the firmware's
         deepest stack frame and the top-most initial kernel one.
      Reported-by: NMeelis Roos <mroos@linux.ee>
      Tested-by: NMeelis Roos <mroos@linux.ee>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ef3e035c
  25. 21 10月, 2014 1 次提交
  26. 20 10月, 2014 1 次提交