1. 21 1月, 2016 2 次提交
    • C
      dma-mapping: always provide the dma_map_ops based implementation · e1c7e324
      Christoph Hellwig 提交于
      Move the generic implementation to <linux/dma-mapping.h> now that all
      architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
      that everyone supports them.
      
      [valentinrothberg@gmail.com: remove leftovers in Kconfig]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: NValentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e1c7e324
    • C
      dma-mapping: make the generic coherent dma mmap implementation optional · 0d4a619b
      Christoph Hellwig 提交于
      This series converts all remaining architectures to use dma_map_ops and
      the generic implementation of the DMA API.  This not only simplifies the
      code a lot, but also prepares for possible future changes like more
      generic non-iommu dma_ops implementations or generic per-device
      dma_map_ops.
      
      This patch (of 16):
      
      We have a couple architectures that do not want to support this code, so
      add another Kconfig symbol that disables the code similar to what we do
      for the nommu case.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0d4a619b
  2. 15 1月, 2016 1 次提交
    • D
      mm: mmap: add new /proc tunable for mmap_base ASLR · d07e2259
      Daniel Cashman 提交于
      Address Space Layout Randomization (ASLR) provides a barrier to
      exploitation of user-space processes in the presence of security
      vulnerabilities by making it more difficult to find desired code/data
      which could help an attack.  This is done by adding a random offset to
      the location of regions in the process address space, with a greater
      range of potential offset values corresponding to better protection/a
      larger search-space for brute force, but also to greater potential for
      fragmentation.
      
      The offset added to the mmap_base address, which provides the basis for
      the majority of the mappings for a process, is set once on process exec
      in arch_pick_mmap_layout() and is done via hard-coded per-arch values,
      which reflect, hopefully, the best compromise for all systems.  The
      trade-off between increased entropy in the offset value generation and
      the corresponding increased variability in address space fragmentation
      is not absolute, however, and some platforms may tolerate higher amounts
      of entropy.  This patch introduces both new Kconfig values and a sysctl
      interface which may be used to change the amount of entropy used for
      offset generation on a system.
      
      The direct motivation for this change was in response to the
      libstagefright vulnerabilities that affected Android, specifically to
      information provided by Google's project zero at:
      
        http://googleprojectzero.blogspot.com/2015/09/stagefrightened.html
      
      The attack presented therein, by Google's project zero, specifically
      targeted the limited randomness used to generate the offset added to the
      mmap_base address in order to craft a brute-force-based attack.
      Concretely, the attack was against the mediaserver process, which was
      limited to respawning every 5 seconds, on an arm device.  The hard-coded
      8 bits used resulted in an average expected success rate of defeating
      the mmap ASLR after just over 10 minutes (128 tries at 5 seconds a
      piece).  With this patch, and an accompanying increase in the entropy
      value to 16 bits, the same attack would take an average expected time of
      over 45 hours (32768 tries), which makes it both less feasible and more
      likely to be noticed.
      
      The introduced Kconfig and sysctl options are limited by per-arch
      minimum and maximum values, the minimum of which was chosen to match the
      current hard-coded value and the maximum of which was chosen so as to
      give the greatest flexibility without generating an invalid mmap_base
      address, generally a 3-4 bits less than the number of bits in the
      user-space accessible virtual address space.
      
      When decided whether or not to change the default value, a system
      developer should consider that mmap_base address could be placed
      anywhere up to 2^(value) bits away from the non-randomized location,
      which would introduce variable-sized areas above and below the mmap_base
      address such that the maximum vm_area_struct size may be reduced,
      preventing very large allocations.
      
      This patch (of 4):
      
      ASLR only uses as few as 8 bits to generate the random offset for the
      mmap base address on 32 bit architectures.  This value was chosen to
      prevent a poorly chosen value from dividing the address space in such a
      way as to prevent large allocations.  This may not be an issue on all
      platforms.  Allow the specification of a minimum number of bits so that
      platforms desiring greater ASLR protection may determine where to place
      the trade-off.
      Signed-off-by: NDaniel Cashman <dcashman@google.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Nick Kralevich <nnk@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d07e2259
  3. 11 9月, 2015 1 次提交
    • D
      kexec: split kexec_load syscall from kexec core code · 2965faa5
      Dave Young 提交于
      There are two kexec load syscalls, kexec_load another and kexec_file_load.
       kexec_file_load has been splited as kernel/kexec_file.c.  In this patch I
      split kexec_load syscall code to kernel/kexec.c.
      
      And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
      use kexec_file_load only, or vice verse.
      
      The original requirement is from Ted Ts'o, he want kexec kernel signature
      being checked with CONFIG_KEXEC_VERIFY_SIG enabled.  But kexec-tools use
      kexec_load syscall can bypass the checking.
      
      Vivek Goyal proposed to create a common kconfig option so user can compile
      in only one syscall for loading kexec kernel.  KEXEC/KEXEC_FILE selects
      KEXEC_CORE so that old config files still work.
      
      Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
      architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
      KEXEC_CORE in arch Kconfig.  Also updated general kernel code with to
      kexec_load syscall.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2965faa5
  4. 15 8月, 2015 1 次提交
  5. 03 8月, 2015 1 次提交
  6. 18 7月, 2015 1 次提交
  7. 26 6月, 2015 1 次提交
    • J
      clone: support passing tls argument via C rather than pt_regs magic · 3033f14a
      Josh Triplett 提交于
      clone has some of the quirkiest syscall handling in the kernel, with a
      pile of special cases, historical curiosities, and architecture-specific
      calling conventions.  In particular, clone with CLONE_SETTLS accepts a
      parameter "tls" that the C entry point completely ignores and some
      assembly entry points overwrite; instead, the low-level arch-specific
      code pulls the tls parameter out of the arch-specific register captured
      as part of pt_regs on entry to the kernel.  That's a massive hack, and
      it makes the arch-specific code only work when called via the specific
      existing syscall entry points; because of this hack, any new clone-like
      system call would have to accept an identical tls argument in exactly
      the same arch-specific position, rather than providing a unified system
      call entry point across architectures.
      
      The first patch allows architectures to handle the tls argument via
      normal C parameter passing, if they opt in by selecting
      HAVE_COPY_THREAD_TLS.  The second patch makes 32-bit and 64-bit x86 opt
      into this.
      
      These two patches came out of the clone4 series, which isn't ready for
      this merge window, but these first two cleanup patches were entirely
      uncontroversial and have acks.  I'd like to go ahead and submit these
      two so that other architectures can begin building on top of this and
      opting into HAVE_COPY_THREAD_TLS.  However, I'm also happy to wait and
      send these through the next merge window (along with v3 of clone4) if
      anyone would prefer that.
      
      This patch (of 2):
      
      clone with CLONE_SETTLS accepts an argument to set the thread-local
      storage area for the new thread.  sys_clone declares an int argument
      tls_val in the appropriate point in the argument list (based on the
      various CLONE_BACKWARDS variants), but doesn't actually use or pass along
      that argument.  Instead, sys_clone calls do_fork, which calls
      copy_process, which calls the arch-specific copy_thread, and copy_thread
      pulls the corresponding syscall argument out of the pt_regs captured at
      kernel entry (knowing what argument of clone that architecture passes tls
      in).
      
      Apart from being awful and inscrutable, that also only works because only
      one code path into copy_thread can pass the CLONE_SETTLS flag, and that
      code path comes from sys_clone with its architecture-specific
      argument-passing order.  This prevents introducing a new version of the
      clone system call without propagating the same architecture-specific
      position of the tls argument.
      
      However, there's no reason to pull the argument out of pt_regs when
      sys_clone could just pass it down via C function call arguments.
      
      Introduce a new CONFIG_HAVE_COPY_THREAD_TLS for architectures to opt into,
      and a new copy_thread_tls that accepts the tls parameter as an additional
      unsigned long (syscall-argument-sized) argument.  Change sys_clone's tls
      argument to an unsigned long (which does not change the ABI), and pass
      that down to copy_thread_tls.
      
      Architectures that don't opt into copy_thread_tls will continue to ignore
      the C argument to sys_clone in favor of the pt_regs captured at kernel
      entry, and thus will be unable to introduce new versions of the clone
      syscall.
      
      Patch co-authored by Josh Triplett and Thiago Macieira.
      Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thiago Macieira <thiago.macieira@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3033f14a
  8. 15 4月, 2015 4 次提交
    • K
      mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE · 204db6ed
      Kees Cook 提交于
      The arch_randomize_brk() function is used on several architectures,
      even those that don't support ET_DYN ASLR. To avoid bulky extern/#define
      tricks, consolidate the support under CONFIG_ARCH_HAS_ELF_RANDOMIZE for
      the architectures that support it, while still handling CONFIG_COMPAT_BRK.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Russell King <linux@arm.linux.org.uk>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: "David A. Long" <dave.long@linaro.org>
      Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Arun Chandran <achandran@mvista.com>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      Cc: Min-Hua Chen <orca.chen@gmail.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Vineeth Vijayan <vvijayan@mvista.com>
      Cc: Jeff Bailey <jeffbailey@google.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Behan Webster <behanw@converseincode.com>
      Cc: Ismael Ripoll <iripoll@upv.es>
      Cc: Jan-Simon Mller <dl9pf@gmx.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      204db6ed
    • K
      mm: expose arch_mmap_rnd when available · 2b68f6ca
      Kees Cook 提交于
      When an architecture fully supports randomizing the ELF load location,
      a per-arch mmap_rnd() function is used to find a randomized mmap base.
      In preparation for randomizing the location of ET_DYN binaries
      separately from mmap, this renames and exports these functions as
      arch_mmap_rnd(). Additionally introduces CONFIG_ARCH_HAS_ELF_RANDOMIZE
      for describing this feature on architectures that support it
      (which is a superset of ARCH_BINFMT_ELF_RANDOMIZE_PIE, since s390
      already supports a separated ET_DYN ASLR from mmap ASLR without the
      ARCH_BINFMT_ELF_RANDOMIZE_PIE logic).
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Russell King <linux@arm.linux.org.uk>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: "David A. Long" <dave.long@linaro.org>
      Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Arun Chandran <achandran@mvista.com>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      Cc: Min-Hua Chen <orca.chen@gmail.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Vineeth Vijayan <vvijayan@mvista.com>
      Cc: Jeff Bailey <jeffbailey@google.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Behan Webster <behanw@converseincode.com>
      Cc: Ismael Ripoll <iripoll@upv.es>
      Cc: Jan-Simon Mller <dl9pf@gmx.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2b68f6ca
    • T
      lib/ioremap.c: add huge I/O map capability interfaces · 0ddab1d2
      Toshi Kani 提交于
      Add ioremap_pud_enabled() and ioremap_pmd_enabled(), which return 1 when
      I/O mappings with pud/pmd are enabled on the kernel.
      
      ioremap_huge_init() calls arch_ioremap_pud_supported() and
      arch_ioremap_pmd_supported() to initialize the capabilities at boot-time.
      
      A new kernel option "nohugeiomap" is also added, so that user can disable
      the huge I/O map capabilities when necessary.
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Robert Elliott <Elliott@hp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ddab1d2
    • K
      mm: define default PGTABLE_LEVELS to two · 235a8f02
      Kirill A. Shutemov 提交于
      By this time all architectures which support more than two page table
      levels should be covered.  This patch add default definiton of
      PGTABLE_LEVELS equal 2.
      
      We also add assert to detect inconsistence between CONFIG_PGTABLE_LEVELS
      and __PAGETABLE_PMD_FOLDED/__PAGETABLE_PUD_FOLDED.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NGuenter Roeck <linux@roeck-us.net>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      235a8f02
  9. 11 4月, 2015 1 次提交
  10. 04 9月, 2014 1 次提交
  11. 19 7月, 2014 1 次提交
    • K
      seccomp: add "seccomp" syscall · 48dc92b9
      Kees Cook 提交于
      This adds the new "seccomp" syscall with both an "operation" and "flags"
      parameter for future expansion. The third argument is a pointer value,
      used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
      be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).
      
      In addition to the TSYNC flag later in this patch series, there is a
      non-zero chance that this syscall could be used for configuring a fixed
      argument area for seccomp-tracer-aware processes to pass syscall arguments
      in the future. Hence, the use of "seccomp" not simply "seccomp_add_filter"
      for this syscall. Additionally, this syscall uses operation, flags,
      and user pointer for arguments because strictly passing arguments via
      a user pointer would mean seccomp itself would be unable to trivially
      filter the seccomp syscall itself.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
      48dc92b9
  12. 19 3月, 2014 1 次提交
  13. 20 12月, 2013 2 次提交
    • K
      stackprotector: Introduce CONFIG_CC_STACKPROTECTOR_STRONG · 8779657d
      Kees Cook 提交于
      This changes the stack protector config option into a choice of
      "None", "Regular", and "Strong":
      
         CONFIG_CC_STACKPROTECTOR_NONE
         CONFIG_CC_STACKPROTECTOR_REGULAR
         CONFIG_CC_STACKPROTECTOR_STRONG
      
      "Regular" means the old CONFIG_CC_STACKPROTECTOR=y option.
      
      "Strong" is a new mode introduced by this patch. With "Strong" the
      kernel is built with -fstack-protector-strong (available in
      gcc 4.9 and later). This option increases the coverage of the stack
      protector without the heavy performance hit of -fstack-protector-all.
      
      For reference, the stack protector options available in gcc are:
      
      -fstack-protector-all:
        Adds the stack-canary saving prefix and stack-canary checking
        suffix to _all_ function entry and exit. Results in substantial
        use of stack space for saving the canary for deep stack users
        (e.g. historically xfs), and measurable (though shockingly still
        low) performance hit due to all the saving/checking. Really not
        suitable for sane systems, and was entirely removed as an option
        from the kernel many years ago.
      
      -fstack-protector:
        Adds the canary save/check to functions that define an 8
        (--param=ssp-buffer-size=N, N=8 by default) or more byte local
        char array. Traditionally, stack overflows happened with
        string-based manipulations, so this was a way to find those
        functions. Very few total functions actually get the canary; no
        measurable performance or size overhead.
      
      -fstack-protector-strong
        Adds the canary for a wider set of functions, since it's not
        just those with strings that have ultimately been vulnerable to
        stack-busting. With this superset, more functions end up with a
        canary, but it still remains small compared to all functions
        with only a small change in performance. Based on the original
        design document, a function gets the canary when it contains any
        of:
      
          - local variable's address used as part of the right hand side
            of an assignment or function argument
          - local variable is an array (or union containing an array),
            regardless of array type or length
          - uses register local variables
      
        https://docs.google.com/a/google.com/document/d/1xXBH6rRZue4f296vGt9YQcuLVQHeE516stHwt8M9xyU
      
      Find below a comparison of "size" and "objdump" output when built with
      gcc-4.9 in three configurations:
      
        - defconfig
      	11430641 kernel text size
      	36110 function bodies
      
        - defconfig + CONFIG_CC_STACKPROTECTOR_REGULAR
      	11468490 kernel text size (+0.33%)
      	1015 of 36110 functions are stack-protected (2.81%)
      
        - defconfig + CONFIG_CC_STACKPROTECTOR_STRONG via this patch
      	11692790 kernel text size (+2.24%)
      	7401 of 36110 functions are stack-protected (20.5%)
      
      With -strong, ARM's compressed boot code now triggers stack
      protection, so a static guard was added. Since this is only used
      during decompression and was never used before, the exposure
      here is very small. Once it switches to the full kernel, the
      stack guard is back to normal.
      
      Chrome OS has been using -fstack-protector-strong for its kernel
      builds for the last 8 months with no problems.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/1387481759-14535-3-git-send-email-keescook@chromium.org
      [ Improved the changelog and descriptions some more. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      8779657d
    • K
      stackprotector: Unify the HAVE_CC_STACKPROTECTOR logic between architectures · 19952a92
      Kees Cook 提交于
      Instead of duplicating the CC_STACKPROTECTOR Kconfig and
      Makefile logic in each architecture, switch to using
      HAVE_CC_STACKPROTECTOR and keep everything in one place. This
      retains the x86-specific bug verification scripts.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/1387481759-14535-2-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      19952a92
  14. 15 11月, 2013 1 次提交
  15. 01 10月, 2013 1 次提交
    • F
      irq: Optimize softirq stack selection in irq exit · cc1f0274
      Frederic Weisbecker 提交于
      If irq_exit() is called on the arch's specified irq stack,
      it should be safe to run softirqs inline under that same
      irq stack as it is near empty by the time we call irq_exit().
      
      For example if we use the same stack for both hard and soft irqs here,
      the worst case scenario is:
      hardirq -> softirq -> hardirq. But then the softirq supersedes the
      first hardirq as the stack user since irq_exit() is called in
      a mostly empty stack. So the stack merge in this case looks acceptable.
      
      Stack overrun still have a chance to happen if hardirqs have more
      opportunities to nest, but then it's another problem to solve.
      
      So lets adapt the irq exit's softirq stack on top of a new Kconfig symbol
      that can be defined when irq_exit() runs on the irq stack. That way
      we can spare some stack switch on irq processing and all the cache
      issues that come along.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@au1.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Mackerras <paulus@au1.ibm.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      cc1f0274
  16. 30 9月, 2013 1 次提交
    • K
      vtime: Add HAVE_VIRT_CPU_ACCOUNTING_GEN Kconfig · 554b0004
      Kevin Hilman 提交于
      With VIRT_CPU_ACCOUNTING_GEN, cputime_t becomes 64-bit. In order
      to use that feature, arch code should be audited to ensure there are no
      races in concurrent read/write of cputime_t. For example,
      reading/writing 64-bit cputime_t on some 32-bit arches may require
      multiple accesses for low and high value parts, so proper locking
      is needed to protect against concurrent accesses.
      
      Therefore, add CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN which arches can
      enable after they've been audited for potential races.
      
      This option is automatically enabled on 64-bit platforms.
      
      Feature requested by Frederic Weisbecker.
      Signed-off-by: NKevin Hilman <khilman@linaro.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Arm Linux <linux-arm-kernel@lists.infradead.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      554b0004
  17. 28 9月, 2013 1 次提交
  18. 14 8月, 2013 1 次提交
  19. 04 7月, 2013 1 次提交
    • P
      mm: soft-dirty bits for user memory changes tracking · 0f8975ec
      Pavel Emelyanov 提交于
      The soft-dirty is a bit on a PTE which helps to track which pages a task
      writes to.  In order to do this tracking one should
      
        1. Clear soft-dirty bits from PTEs ("echo 4 > /proc/PID/clear_refs)
        2. Wait some time.
        3. Read soft-dirty bits (55'th in /proc/PID/pagemap2 entries)
      
      To do this tracking, the writable bit is cleared from PTEs when the
      soft-dirty bit is.  Thus, after this, when the task tries to modify a
      page at some virtual address the #PF occurs and the kernel sets the
      soft-dirty bit on the respective PTE.
      
      Note, that although all the task's address space is marked as r/o after
      the soft-dirty bits clear, the #PF-s that occur after that are processed
      fast.  This is so, since the pages are still mapped to physical memory,
      and thus all the kernel does is finds this fact out and puts back
      writable, dirty and soft-dirty bits on the PTE.
      
      Another thing to note, is that when mremap moves PTEs they are marked
      with soft-dirty as well, since from the user perspective mremap modifies
      the virtual memory at mremap's new address.
      Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0f8975ec
  20. 05 5月, 2013 1 次提交
    • K
      idle: Fix hlt/nohlt command-line handling in new generic idle · 485cf5da
      Kevin Hilman 提交于
      commit d1669912 (idle: Implement generic idle function) added a new
      generic idle along with support for hlt/nohlt command line options to
      override default idle loop behavior.  However, the command-line
      processing is never compiled.
      
      The command-line handling is wrapped by CONFIG_GENERIC_IDLE_POLL_SETUP
      and arches that use this feature select it in their Kconfigs.
      However, no Kconfig definition was created for this option, so it is
      never enabled, and therefore command-line override of the idle-loop
      behavior is broken after migrating to the generic idle loop.
      
      To fix, add a Kconfig definition for GENERIC_IDLE_POLL_SETUP.
      
      Tested on ARM (OMAP4/Panda) which enables the command-line overrides
      by default.
      Signed-off-by: NKevin Hilman <khilman@linaro.org>
      Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linaro-kernel@lists.linaro.org
      Link: http://lkml.kernel.org/r/1366849153-25564-1-git-send-email-khilman@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      485cf5da
  21. 17 4月, 2013 1 次提交
  22. 08 4月, 2013 1 次提交
  23. 15 3月, 2013 1 次提交
    • R
      CONFIG_SYMBOL_PREFIX: cleanup. · b92021b0
      Rusty Russell 提交于
      We have CONFIG_SYMBOL_PREFIX, which three archs define to the string
      "_".  But Al Viro broke this in "consolidate cond_syscall and
      SYSCALL_ALIAS declarations" (in linux-next), and he's not the first to
      do so.
      
      Using CONFIG_SYMBOL_PREFIX is awkward, since we usually just want to
      prefix it so something.  So various places define helpers which are
      defined to nothing if CONFIG_SYMBOL_PREFIX isn't set:
      
      1) include/asm-generic/unistd.h defines __SYMBOL_PREFIX.
      2) include/asm-generic/vmlinux.lds.h defines VMLINUX_SYMBOL(sym)
      3) include/linux/export.h defines MODULE_SYMBOL_PREFIX.
      4) include/linux/kernel.h defines SYMBOL_PREFIX (which differs from #7)
      5) kernel/modsign_certificate.S defines ASM_SYMBOL(sym)
      6) scripts/modpost.c defines MODULE_SYMBOL_PREFIX
      7) scripts/Makefile.lib defines SYMBOL_PREFIX on the commandline if
         CONFIG_SYMBOL_PREFIX is set, so that we have a non-string version
         for pasting.
      
      (arch/h8300/include/asm/linkage.h defines SYMBOL_NAME(), too).
      
      Let's solve this properly:
      1) No more generic prefix, just CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX.
      2) Make linux/export.h usable from asm.
      3) Define VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR().
      4) Make everyone use them.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Reviewed-by: NJames Hogan <james.hogan@imgtec.com>
      Tested-by: James Hogan <james.hogan@imgtec.com> (metag)
      b92021b0
  24. 13 3月, 2013 1 次提交
  25. 04 3月, 2013 1 次提交
  26. 03 3月, 2013 1 次提交
    • J
      Add HAVE_64BIT_ALIGNED_ACCESS · c19fa94a
      James Hogan 提交于
      On 64 bit architectures with no efficient unaligned access, padding and
      explicit alignment must be added in various places to prevent unaligned
      64bit accesses (such as taskstats and trace ring buffer).
      
      However this also needs to apply to 32 bit architectures with 64 bit
      accesses requiring alignment such as metag.
      
      This is solved by adding a new Kconfig symbol HAVE_64BIT_ALIGNED_ACCESS
      which defaults to 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS, and can be
      explicitly selected by METAG and any other relevant architectures. This
      can be used in various places to determine whether 64bit alignment is
      required.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Will Drewry <wad@chromium.org>
      c19fa94a
  27. 28 2月, 2013 1 次提交
  28. 14 2月, 2013 1 次提交
    • A
      burying unused conditionals · d64008a8
      Al Viro 提交于
      __ARCH_WANT_SYS_RT_SIGACTION,
      __ARCH_WANT_SYS_RT_SIGSUSPEND,
      __ARCH_WANT_COMPAT_SYS_RT_SIGSUSPEND,
      __ARCH_WANT_COMPAT_SYS_SCHED_RR_GET_INTERVAL - not used anymore
      CONFIG_GENERIC_{SIGALTSTACK,COMPAT_RT_SIG{ACTION,QUEUEINFO,PENDING,PROCMASK}} -
      can be assumed always set.
      d64008a8
  29. 04 2月, 2013 7 次提交