1. 24 Jan, 2014 1 commit
    • kexec: add sysctl to disable kexec_load · 7984754b
      Kees Cook authored
      For general-purpose (i.e.  distro) kernel builds it makes sense to build
      with CONFIG_KEXEC to allow end users to choose what kind of things they
      want to do with kexec.  However, in the face of trying to lock down a
      system with such a kernel, there needs to be a way to disable kexec_load
      (much like module loading can be disabled).  Without this, it is too easy
      for the root user to modify kernel memory even when CONFIG_STRICT_DEVMEM
      and modules_disabled are set.  With this change, it is still possible to
      load an image for use later, then disable kexec_load so the image (or lack
      of image) can't be altered.
      
      The intention is for using this in environments where "perfect"
      enforcement is hard.  Without a verified boot, along with verified
      modules, and along with verified kexec, this is trying to give a system a
      better chance to defend itself (or at least grow the window of
      discoverability) against attack in the face of a privilege escalation.
      
      In my mind, I consider several boot scenarios:
      
      1) Verified boot of read-only verified root fs loading fd-based
         verification of kexec images.
      2) Secure boot of writable root fs loading signed kexec images.
      3) Regular boot loading kexec (e.g. kcrash) image early and locking it.
      4) Regular boot with no control of kexec image at all.
      
      1 and 2 don't exist yet, but will soon once the verified kexec series has
      landed.  4 is the state of things now.  The gap between 2 and 4 is too
      large, so this change creates scenario 3, a middle-ground above 4 when 2
      and 1 are not possible for a system.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
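Scenario 3 above (load an image early, then lock it) can be illustrated with a toy model. This is a minimal sketch, not kernel code: the class and method names are hypothetical, and it assumes the sysctl is one-way, i.e. it can be set to 1 but not cleared again without a reboot.

```python
class KexecState:
    """Toy model of the load-then-lock flow (hypothetical names, not kernel code)."""

    def __init__(self):
        self.load_disabled = False  # models the sysctl value (0 or 1)
        self.image = None           # currently loaded kexec image, if any

    def load(self, image):
        """Model of kexec_load: refused once loading has been disabled."""
        if self.load_disabled:
            raise PermissionError("kexec_load is disabled")
        self.image = image

    def set_load_disabled(self, value):
        """Model of writing the sysctl: 0 -> 1 is allowed, 1 -> 0 is not."""
        if value not in (0, 1):
            raise ValueError("value must be 0 or 1")
        if self.load_disabled and value == 0:
            raise PermissionError("the sysctl cannot be cleared once set")
        self.load_disabled = bool(value)
```

In this model, an init script would call `load()` with the crash image and then `set_load_disabled(1)`; any later attempt to replace or unload the image fails, even for root.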
  2. 17 Dec, 2013 1 commit
  3. 13 Nov, 2013 1 commit
    • vsprintf: check real user/group id for %pK · 312b4e22
      Ryan Mallon authored
      Some setuid binaries will allow reading of files which have read
      permission by the real user id.  This is problematic with files which
      use %pK because the file access permission is checked at open() time,
      but the kptr_restrict setting is checked at read() time.  If a setuid
      binary opens a %pK file as an unprivileged user, and then elevates
      permissions before reading the file, then kernel pointer values may be
      leaked.
      
      This happens for example with the setuid pppd application on Ubuntu 12.04:
      
        $ head -1 /proc/kallsyms
        00000000 T startup_32
      
        $ pppd file /proc/kallsyms
        pppd: In file /proc/kallsyms: unrecognized option 'c1000000'
      
      This will only leak the pointer value from the first line, but other
      setuid binaries may leak more information.
      
      Fix this by checking that, in addition to the current process having
      CAP_SYSLOG, the effective user and group ids are equal to the real ids.
      If a setuid binary reads the contents of a file which uses %pK then the
      pointer values will be printed as NULL if the real user is unprivileged.
      
      Update the sysctl documentation to reflect the changes, and also correct
      the documentation to state that kptr_restrict=0 is the default.
      
      This is only a temporary solution to the issue.  The correct solution is
      to do the permission check at open() time on files, and to replace %pK
      with a function which checks the open() time permission.  %pK uses in
      printk should be removed since no sane permission check can be done, and
      instead protected by using dmesg_restrict.
      Signed-off-by: Ryan Mallon <rmallon@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Joe Perches <joe@perches.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
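The decision this commit describes can be written as a small predicate. This is a sketch of the logic as stated in the commit message, not the vsprintf code itself; the function and parameter names are made up for illustration.

```python
def pk_shows_real_pointer(kptr_restrict, has_cap_syslog, uid, euid, gid, egid):
    """Return True if a %pK pointer would be printed unmasked for this reader."""
    if kptr_restrict == 0:
        # No restriction: %pK behaves like plain %p.
        return True
    if kptr_restrict == 1:
        # CAP_SYSLOG is required, and the effective ids must match the real
        # ids so that a setuid binary cannot leak pointers on behalf of an
        # unprivileged real user.
        return has_cap_syslog and uid == euid and gid == egid
    # kptr_restrict == 2: always masked, regardless of privileges.
    return False
```

For the pppd case above, the real uid is an ordinary user while the effective uid is 0, so the predicate returns False even though the process holds CAP_SYSLOG.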
  4. 09 Oct, 2013 5 commits
  5. 12 Sep, 2013 1 commit
  6. 23 Jun, 2013 1 commit
    • perf: Drop sample rate when sampling is too slow · 14c63f17
      Dave Hansen authored
      This patch keeps track of how long perf's NMI handler is taking,
      and also calculates how many samples perf can take a second.  If
      the sample length times the expected max number of samples
      exceeds a configurable threshold, it drops the sample rate.
      
      This way, we don't have a runaway sampling process eating up the
      CPU.
      
      This patch can tend to drop the sample rate down to a level where
      perf doesn't work very well.  *BUT* the alternative is that my
      system hangs because it spends all of its time handling NMIs.
      
      I'll take a busted performance tool over an entire system that's
      busted and undebuggable any day.
      
      BTW, my suspicion is that there's still an underlying bug here.
      Using the HPET instead of the TSC is definitely a contributing
      factor, but I suspect there are some other things going on.
      But, I can't go dig down on a bug like that with my machine
      hanging all the time.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Cc: Dave Hansen <dave@sr71.net>
      [ Prettified it a bit. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
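The throttling rule in the first paragraph of this commit message (sample length times the maximum sample count compared against a threshold) can be sketched as follows. The parameter names are hypothetical, the threshold is assumed to be a percentage of CPU time, and halving the rate is an assumption made for illustration; the commit message only says the rate is dropped.

```python
NSEC_PER_SEC = 1_000_000_000


def adjust_sample_rate(avg_sample_ns, max_samples_per_sec, cpu_time_max_percent):
    """Return the (possibly reduced) maximum sample rate.

    avg_sample_ns: average time spent handling one sample, in nanoseconds.
    max_samples_per_sec: the current ceiling on samples per second.
    cpu_time_max_percent: share of one CPU-second sampling may consume.
    """
    budget_ns = NSEC_PER_SEC * cpu_time_max_percent // 100
    projected_ns = avg_sample_ns * max_samples_per_sec
    if projected_ns <= budget_ns:
        return max_samples_per_sec  # within budget: leave the rate alone
    # Over budget: drop the rate (halving is an assumption), never below 1.
    return max(1, max_samples_per_sec // 2)
```

Calling this repeatedly with a persistently slow handler keeps lowering the ceiling, which matches the caveat in the message that the rate can fall to a level where perf is not very useful.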
  7. 28 May, 2013 2 commits
  8. 05 Jan, 2013 2 commits
  9. 06 Oct, 2012 1 commit
  10. 07 Feb, 2012 1 commit
  11. 13 Jan, 2012 1 commit
  12. 05 Dec, 2011 1 commit
  13. 01 Nov, 2011 1 commit
    • kernel/sysctl.c: add cap_last_cap to /proc/sys/kernel · 73efc039
      Dan Ballard authored
      Userspace needs to know the highest valid capability of the running
      kernel, which right now cannot reliably be retrieved from the header files
      only.  The fact that this value cannot be determined properly right now
      creates various problems for libraries compiled on newer header files
      which are run on older kernels.  They assume capabilities are available
      which actually aren't.  libcap-ng is one example.  And we ran into the
      same problem with systemd too.
      
      Now the capability is exported in /proc/sys/kernel/cap_last_cap.
      
      [akpm@linux-foundation.org: make cap_last_cap const, per Ulrich]
      Signed-off-by: Dan Ballard <dan@mindstab.net>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Lennart Poettering <lennart@poettering.net>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Ulrich Drepper <drepper@akkadia.org>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
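A userspace consumer of the new file might look like the sketch below. The path comes from the commit message; the helper names are hypothetical, and returning None on older kernels that lack the file is a design choice of this example, not something the commit prescribes.

```python
from pathlib import Path


def read_cap_last_cap(path="/proc/sys/kernel/cap_last_cap"):
    """Return the highest valid capability number, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (older kernel) or unreadable content.
        return None


def capability_is_supported(cap_number, last_cap):
    """True if cap_number is valid on the running kernel, given last_cap."""
    return last_cap is not None and 0 <= cap_number <= last_cap
```

A library built against newer headers can then check each capability it intends to use against the running kernel instead of trusting compile-time constants.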
  14. 27 Jul, 2011 1 commit
    • ipc: introduce shm_rmid_forced sysctl · b34a6b1d
      Vasiliy Kulikov authored
      Add support for the shm_rmid_forced sysctl.  If set to 1, all shared
      memory objects in current ipc namespace will be automatically forced to
      use IPC_RMID.
      
      The POSIX way of handling shmem allows one to create shm objects and
      call shmdt(), leaving shm object associated with no process, thus
      consuming memory not counted via rlimits.
      
      With shm_rmid_forced=1 the shared memory object is counted at least for
      one process, so OOM killer may effectively kill the fat process holding
      the shared memory.
      
      It obviously breaks POSIX - some programs relying on the feature would
      stop working.  So set shm_rmid_forced=1 only if you're sure nobody uses
      "orphaned" memory.  Use shm_rmid_forced=0 by default for compatibility
      reasons.
      
      The feature was previously implemented in -ow as a configure option.
      
      [akpm@linux-foundation.org: fix documentation, per Randy]
      [akpm@linux-foundation.org: fix warning]
      [akpm@linux-foundation.org: readability/conventionality tweaks]
      [akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout]
      Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Serge E. Hallyn" <serge.hallyn@canonical.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Solar Designer <solar@openwall.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 24 Jul, 2011 1 commit
  16. 27 May, 2011 1 commit
    • coredump: add support for exe_file in core name · 57cc083a
      Jiri Slaby authored
      Now that exe_file is not procfs-dependent, we can use it to name the core
      file.  So we add a %E pattern for core file name creation which extracts
      the path from mm_struct->exe_file.  It then converts slashes to
      exclamation marks and pastes the result into the core file name itself.
      
      This is useful for environments where binary names are longer than 16
      characters (the current->comm limitation), where there are binaries
      with the same name but in different paths, and where the binary itself
      changes its current->comm after exec.
      
      So by doing (s/$/#/ -- # is treated as git comment):
      
        $ sysctl kernel.core_pattern='core.%p.%e.%E'
        $ ln /bin/cat cat45678901234567890
        $ ./cat45678901234567890
        ^Z
        $ rm cat45678901234567890
        $ fg
        ^\Quit (core dumped)
        $ ls core*
      
      we now get:
      
        core.2434.cat456789012345.!root!cat45678901234567890 (deleted)
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Reviewed-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
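The substitution shown in the transcript above can be reproduced with a short userspace sketch. It handles only the %p, %e, %E and %% specifiers; the function name is hypothetical, and silently dropping unknown specifiers is a simplification of this example.

```python
TASK_COMM_LEN = 16  # size of the kernel comm buffer, including the trailing NUL


def expand_core_pattern(pattern, pid, comm, exe_path):
    """Expand a core_pattern-style template for one crashing process."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        spec = pattern[i + 1:i + 2]  # empty string if '%' is the last char
        i += 2
        if spec == "%":
            out.append("%")
        elif spec == "p":
            out.append(str(pid))
        elif spec == "e":
            # comm holds at most 15 visible characters.
            out.append(comm[:TASK_COMM_LEN - 1])
        elif spec == "E":
            # Full executable path with slashes turned into '!'.
            out.append(exe_path.replace("/", "!"))
        # Any other specifier is dropped in this sketch.
    return "".join(out)
```

With pattern `core.%p.%e.%E`, pid 2434, comm `cat45678901234567890` and executable path `/root/cat45678901234567890 (deleted)`, this yields the file name listed in the commit message.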
  17. 11 Feb, 2011 1 commit
  18. 14 Jan, 2011 1 commit
    • kptr_restrict for hiding kernel pointers from unprivileged users · 455cd5ab
      Dan Rosenberg authored
      Add the %pK printk format specifier and the /proc/sys/kernel/kptr_restrict
      sysctl.
      
      The %pK format specifier is designed to hide exposed kernel pointers,
      specifically via /proc interfaces.  Exposing these pointers provides an
      easy target for kernel write vulnerabilities, since they reveal the
      locations of writable structures containing easily triggerable function
      pointers.  The behavior of %pK depends on the kptr_restrict sysctl.
      
      If kptr_restrict is set to 0, no deviation from the standard %p behavior
      occurs.  If kptr_restrict is set to 1, the default, if the current user
      (intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
      (currently in the LSM tree), kernel pointers using %pK are printed as 0's.
      If kptr_restrict is set to 2, kernel pointers using %pK are printed as
      0's regardless of privileges.  Replacing with 0's was chosen over the
      default "(null)", which cannot be parsed by userland %p, which expects
      "(nil)".
      
      [akpm@linux-foundation.org: check for IRQ context when !kptr_restrict, save an indent level, s/WARN/WARN_ONCE/]
      [akpm@linux-foundation.org: coding-style fixup]
      [randy.dunlap@oracle.com: fix kernel/sysctl.c warning]
      Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Thomas Graf <tgraf@infradead.org>
      Cc: Eugene Teo <eugeneteo@kernel.org>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 09 Dec, 2010 1 commit
  20. 12 Nov, 2010 1 commit
  21. 12 Dec, 2009 1 commit
  22. 09 Nov, 2009 1 commit
  23. 24 Sep, 2009 1 commit
    • exec: let do_coredump() limit the number of concurrent dumps to pipes · a293980c
      Neil Horman authored
      Introduce core pipe limiting sysctl.
      
      Since we can dump cores to pipe, rather than directly to the filesystem,
      we create a condition in which a user can create a very high load on the
      system simply by running bad applications.
      
      If the pipe reader specified in core_pattern is poorly written, we can
      have lots of outstanding resources and processes in the system.
      
      This sysctl introduces an ability to limit that resource consumption.
      core_pipe_limit defines how many in-flight dumps may be run in parallel;
      dumps beyond this value are skipped and a note is made in the kernel log.
      A special value of 0 in core_pipe_limit denotes that unlimited core dumps
      may be handled (this is the default value).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: Earl Chew <earl_chew@agilent.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
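The counting rule for core_pipe_limit can be modelled in a few lines. This is a single-threaded sketch with hypothetical class and method names; it stands in for the kernel's bookkeeping only conceptually, and the skipped counter takes the place of the kernel log note.

```python
class CorePipeLimiter:
    """Toy model of core_pipe_limit: caps in-flight dumps to a pipe reader."""

    def __init__(self, core_pipe_limit=0):
        self.limit = core_pipe_limit  # 0 means unlimited (the default)
        self.in_flight = 0            # dumps currently being piped
        self.skipped = 0              # dumps dropped because of the limit

    def try_start_dump(self):
        """Return True if a new dump may start, False if it is skipped."""
        if self.limit and self.in_flight >= self.limit:
            self.skipped += 1
            return False
        self.in_flight += 1
        return True

    def finish_dump(self):
        """Mark one in-flight dump as completed."""
        if self.in_flight > 0:
            self.in_flight -= 1
```

With a limit of 2, a third crash arriving while two dumps are still being consumed is skipped; once either reader finishes, the next crash is dumped again.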
  24. 23 Sep, 2009 1 commit
  25. 21 Sep, 2009 1 commit
  26. 11 Sep, 2009 1 commit
  27. 03 Apr, 2009 2 commits
    • documentation: update Documentation/filesystem/proc.txt and Documentation/sysctls · 760df93e
      Shen Feng authored
      Now /proc/sys is described in many places and much information is
      redundant.  This patch updates proc.txt and moves the /proc/sys
      description out to the files in Documentation/sysctls.
      
      Details are:
      
      merge
      -  2.1  /proc/sys/fs - File system data
      -  2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem
      -  2.17 /proc/sys/fs/epoll - Configuration options for the epoll interface
      with Documentation/sysctls/fs.txt.
      
      remove
      -  2.2  /proc/sys/fs/binfmt_misc - Miscellaneous binary formats
      since it's not better than the Documentation/binfmt_misc.txt.
      
      merge
      -  2.3  /proc/sys/kernel - general kernel parameters
      with Documentation/sysctls/kernel.txt
      
      remove
      -  2.5  /proc/sys/dev - Device specific parameters
      since it's obsolete; sysfs is used now.
      
      remove
      -  2.6  /proc/sys/sunrpc - Remote procedure calls
      since it's not better than the Documentation/sysctls/sunrpc.txt
      
      move
      -  2.7  /proc/sys/net - Networking stuff
      -  2.9  Appletalk
      -  2.10 IPX
      to newly created Documentation/sysctls/net.txt.
      
      remove
      -  2.8  /proc/sys/net/ipv4 - IPV4 settings
      since it's not better than the Documentation/networking/ip-sysctl.txt.
      
      add
      - Chapter 3 Per-Process Parameters
      to describe /proc/<pid>/xxx parameters.
      Signed-off-by: Shen Feng <shen@cn.fujitsu.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • modules: sysctl to block module loading · 3d43321b
      Kees Cook authored
      Implement a sysctl file that disables module-loading system-wide since
      there is no longer a viable way to remove CAP_SYS_MODULE after the system
      bounding capability set was removed in 2.6.25.
      
      Value can only be set to "1", and is tested only if standard capability
      checks allow CAP_SYS_MODULE.  Given existing /dev/mem protections, this
      should allow administrators a one-way method to block module loading
      after initial boot-time module loading has finished.
      Signed-off-by: Kees Cook <kees.cook@canonical.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  28. 30 Oct, 2008 1 commit
  29. 11 Oct, 2008 1 commit
    • Staging: add TAINT_CRAP for all drivers/staging code · 061b1bd3
      Greg Kroah-Hartman authored
      We need to add a flag for all code that is in the drivers/staging/
      directory to prevent all other kernel developers from worrying about
      issues here, and to notify users that the drivers might not be as good
      as what they are normally used to.
      
      Based on code from Andreas Gruenbacher and Jeff Mahoney to provide a
      TAINT flag for the support level of a kernel module in the Novell
      enterprise kernel release.
      
      This is the kernel portion of this feature, the ability for the flag to
      be set needs to be done in the build process and will happen in a
      follow-up patch.
      
      Cc: Andreas Gruenbacher <agruen@suse.de>
      Cc: Jeff Mahoney <jeffm@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  30. 23 Sep, 2008 1 commit
  31. 19 Sep, 2008 1 commit
  32. 14 Feb, 2008 1 commit
  33. 10 Feb, 2008 1 commit