1. 17 2月, 2015 4 次提交
  2. 14 2月, 2015 1 次提交
    • A
      mm: vmalloc: pass additional vm_flags to __vmalloc_node_range() · cb9e3c29
      Andrey Ryabinin 提交于
      For instrumenting global variables KASan will shadow memory backing memory
      for modules.  So on module loading we will need to allocate memory for
      shadow and map it at address in shadow that corresponds to the address
      allocated in module_alloc().
      
      __vmalloc_node_range() could be used for this purpose, except it puts a
      guard hole after allocated area.  Guard hole in shadow memory should be a
      problem because at some future point we might need to have a shadow memory
      at address occupied by guard hole.  So we could fail to allocate shadow
      for module_alloc().
      
      Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into
      __vmalloc_node_range().  Add new parameter 'vm_flags' to
      __vmalloc_node_range() function.
      Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com>
      Cc: Yuri Gribov <tetra2005@gmail.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cb9e3c29
  3. 13 2月, 2015 1 次提交
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3
  4. 12 2月, 2015 1 次提交
  5. 11 2月, 2015 1 次提交
  6. 30 1月, 2015 1 次提交
    • L
      vm: add VM_FAULT_SIGSEGV handling support · 33692f27
      Linus Torvalds 提交于
      The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
      "you should SIGSEGV" error, because the SIGSEGV case was generally
      handled by the caller - usually the architecture fault handler.
      
      That results in lots of duplication - all the architecture fault
      handlers end up doing very similar "look up vma, check permissions, do
      retries etc" - but it generally works.  However, there are cases where
      the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
      
      In particular, when accessing the stack guard page, libsigsegv expects a
      SIGSEGV.  And it usually got one, because the stack growth is handled by
      that duplicated architecture fault handler.
      
      However, when the generic VM layer started propagating the error return
      from the stack expansion in commit fee7e49d ("mm: propagate error
      from stack expansion even for guard page"), that now exposed the
      existing VM_FAULT_SIGBUS result to user space.  And user space really
      expected SIGSEGV, not SIGBUS.
      
      To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
      duplicate architecture fault handlers about it.  They all already have
      the code to handle SIGSEGV, so it's about just tying that new return
      value to the existing code, but it's all a bit annoying.
      
      This is the mindless minimal patch to do this.  A more extensive patch
      would be to try to gather up the mostly shared fault handling logic into
      one generic helper routine, and long-term we really should do that
      cleanup.
      
      Just from this patch, you can generally see that most architectures just
      copied (directly or indirectly) the old x86 way of doing things, but in
      the meantime that original x86 model has been improved to hold the VM
      semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
      "newer" things, so it would be a good idea to bring all those
      improvements to the generic case and teach other architectures about
      them too.
      Reported-and-tested-by: NTakashi Iwai <tiwai@suse.de>
      Tested-by: NJan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      33692f27
  7. 20 1月, 2015 1 次提交
    • R
      module_arch_freeing_init(): new hook for archs before module->module_init freed. · d453cded
      Rusty Russell 提交于
      Archs have been abusing module_free() to clean up their arch-specific
      allocations.  Since module_free() is also (ab)used by BPF and trace code,
      let's keep it to simple allocations, and provide a hook called before
      that.
      
      This means that avr32, ia64, parisc and s390 no longer need to implement
      their own module_free() at all.  avr32 doesn't need module_finalize()
      either.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      d453cded
  8. 27 12月, 2014 1 次提交
  9. 14 12月, 2014 1 次提交
  10. 11 12月, 2014 1 次提交
  11. 06 12月, 2014 1 次提交
    • A
      net: sock: allow eBPF programs to be attached to sockets · 89aa0758
      Alexei Starovoitov 提交于
      introduce new setsockopt() command:
      
      setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd))
      
      where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...)
      and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER
      
      setsockopt() calls bpf_prog_get() which increments refcnt of the program,
      so it doesn't get unloaded while socket is using the program.
      
      The same eBPF program can be attached to multiple sockets.
      
      User task exit automatically closes socket which calls sk_filter_uncharge()
      which decrements refcnt of eBPF program
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89aa0758
  12. 12 11月, 2014 1 次提交
    • E
      net: introduce SO_INCOMING_CPU · 2c8c56e1
      Eric Dumazet 提交于
      Alternative to RPS/RFS is to use hardware support for multiple
      queues.
      
      Then split a set of million of sockets into worker threads, each
      one using epoll() to manage events on its own socket pool.
      
      Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
      know after accept() or connect() on which queue/cpu a socket is managed.
      
      We normally use one cpu per RX queue (IRQ smp_affinity being properly
      set), so remembering on socket structure which cpu delivered last packet
      is enough to solve the problem.
      
      After accept(), connect(), or even file descriptor passing around
      processes, applications can use :
      
       int cpu;
       socklen_t len = sizeof(cpu);
      
       getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
      
      And use this information to put the socket into the right silo
      for optimal performance, as all networking stack should run
      on the appropriate cpu, without need to send IPI (RPS/RFS).
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c8c56e1
  13. 11 11月, 2014 4 次提交
  14. 01 11月, 2014 1 次提交
  15. 21 10月, 2014 1 次提交
  16. 12 10月, 2014 1 次提交
    • H
      parisc: Reduce SIGRTMIN from 37 to 32 to behave like other Linux architectures · 1f25df2e
      Helge Deller 提交于
      This patch reduces the value of SIGRTMIN on PARISC from 37 to 32, thus
      increasing the number of available RT signals and bring it in sync with other
      Linux architectures.
      
      Historically we wanted to natively support HP-UX 32bit binaries with the
      PA-RISC Linux port.  Because of that we carried the various available signals
      from HP-UX (e.g. SIGEMT and SIGLOST) and folded them in between the native
      Linux signals.  Although this was the right decision at that time, this
      required us to increase SIGRTMIN to at least 37 which left us with 27 (64-37)
      RT signals.
      
      Those 27 RT signals haven't been a problem in the past, but with the upcoming
      importance of systemd we now got the problem that systemd alloctes (hardcoded)
      signals up to SIGRTMIN+29 which is beyond our NSIG of 64. Because of that we
      have not been able to use systemd on the PARISC Linux port yet.
      
      Of course we could ask the systemd developers to not use those hardcoded
      values, but this change is very unlikely, esp. with PA-RISC being a niche
      architecture.
      
      The other possibility would be to increase NSIG to e.g. 128, but this would
      mean to duplicate most of the existing Linux signal handling code into the
      parisc specific Linux kernel tree which would most likely introduce lots of new
      bugs beside the code duplication.
      
      The third option is to drop some HP-UX signals and shuffle some other signals
      around to bring SIGRTMIN to 32.  This is of course an ABI change, but testing
      has shown that existing Linux installations are not visibly affected by this
      change - most likely because we move those signals around which are rarely used
      and move them to slots which haven't been used in Linux yet. In an existing
      installation I was able to exchange either the Linux kernel or glibc (or both)
      without affecting the boot process and installed applications.
      
      Dropping the HP-UX signals isn't an issue either, since support for HP-UX was
      basically dropped a few months back with Kernel 3.14 in commit
      f5a408d5 already, when we changed EWOULDBLOCK
      to be equal to EAGAIN.
      
      So, even if this is an ABI change, it's better to change it now and thus bring
      PARISC Linux in sync with other architectures to avoid other issues in the
      future.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: Carlos O'Donell <carlos@systemhalted.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: PARISC Linux Kernel Mailinglist <linux-parisc@vger.kernel.org>
      Tested-by: NAaro Koskinen <aaro.koskinen@iki.fi>
      1f25df2e
  17. 03 10月, 2014 1 次提交
  18. 25 9月, 2014 2 次提交
  19. 24 9月, 2014 3 次提交
    • E
      ARCH: AUDIT: audit_syscall_entry() should not require the arch · 91397401
      Eric Paris 提交于
      We have a function where the arch can be queried, syscall_get_arch().
      So rather than have every single piece of arch specific code use and/or
      duplicate syscall_get_arch(), just have the audit code use the
      syscall_get_arch() code.
      Based-on-patch-by: NRichard Briggs <rgb@redhat.com>
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-ia64@vger.kernel.org
      Cc: microblaze-uclinux@itee.uq.edu.au
      Cc: linux-mips@linux-mips.org
      Cc: linux@lists.openrisc.net
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: linux-xtensa@linux-xtensa.org
      Cc: x86@kernel.org
      91397401
    • E
      ARCH: AUDIT: implement syscall_get_arch for all arches · ce5d1128
      Eric Paris 提交于
      For all arches which support audit implement syscall_get_arch()
      They are all pretty easy and straight forward, stolen from how the call
      to audit_syscall_entry() determines the arch.
      Based-on-patch-by: NRichard Briggs <rgb@redhat.com>
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Cc: linux-ia64@vger.kernel.org
      Cc: microblaze-uclinux@itee.uq.edu.au
      Cc: linux-mips@linux-mips.org
      Cc: linux@lists.openrisc.net
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: sparclinux@vger.kernel.org
      ce5d1128
    • J
      parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds · d26a7730
      John David Anglin 提交于
      In spite of what the GCC manual says, the -mfast-indirect-calls has
      never been supported in the 64-bit parisc compiler. Indirect calls have
      always been done using function descriptors irrespective of the
      -mfast-indirect-calls option.
      
      Recently, it was noticed that a function descriptor was always requested
      when the -mfast-indirect-calls option was specified. This caused
      problems when the option was used in  application code and doesn't make
      any sense because the whole point of the option is to avoid using a
      function descriptor for indirect calls.
      
      Fixing this broke 64-bit kernel builds.
      
      I will fix GCC but for now we need the attached change. This results in
      the same kernel code as before.
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org  # v3.0+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      d26a7730
  20. 22 9月, 2014 1 次提交
  21. 14 9月, 2014 2 次提交
  22. 10 9月, 2014 1 次提交
  23. 27 8月, 2014 2 次提交
  24. 14 8月, 2014 1 次提交
  25. 06 8月, 2014 1 次提交
  26. 28 7月, 2014 1 次提交
  27. 25 7月, 2014 2 次提交
  28. 19 7月, 2014 1 次提交