1. 17 4月, 2015 1 次提交
  2. 15 4月, 2015 1 次提交
    • U
      watchdog: enable the new user interface of the watchdog mechanism · 195daf66
      Ulrich Obergfell 提交于
      With the current user interface of the watchdog mechanism it is only
      possible to disable or enable both lockup detectors at the same time.
      This series introduces new kernel parameters and changes the semantics of
      some existing kernel parameters, so that the hard lockup detector and the
      soft lockup detector can be disabled or enabled individually.  With this
      series applied, the user interface is as follows.
      
      - parameters in /proc/sys/kernel
      
        . soft_watchdog
          This is a new parameter to control and examine the run state of
          the soft lockup detector.
      
        . nmi_watchdog
          The semantics of this parameter have changed. It can now be used
          to control and examine the run state of the hard lockup detector.
      
        . watchdog
          This parameter is still available to control the run state of both
          lockup detectors at the same time. If this parameter is examined,
          it shows the logical OR of soft_watchdog and nmi_watchdog.
      
        . watchdog_thresh
          The semantics of this parameter are not affected by the patch.
      
      - kernel command line parameters
      
        . nosoftlockup
          The semantics of this parameter have changed. It can now be used
          to disable the soft lockup detector at boot time.
      
        . nmi_watchdog=0 or nmi_watchdog=1
          Disable or enable the hard lockup detector at boot time. The patch
          introduces '=1' as a new option.
      
        . nowatchdog
          The semantics of this parameter are not affected by the patch. It
          is still available to disable both lockup detectors at boot time.
      
      Also, remove the proc_dowatchdog() function which is no longer needed.
      
      [dzickus@redhat.com: wrote changelog]
      [dzickus@redhat.com: update documentation for kernel params and sysctl]
      Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      195daf66
  3. 22 12月, 2014 1 次提交
  4. 14 12月, 2014 1 次提交
    • M
      ipc/msg: increase MSGMNI, remove scaling · 0050ee05
      Manfred Spraul 提交于
      SysV can be abused to allocate locked kernel memory.  For most systems, a
      small limit doesn't make sense, see the discussion with regards to SHMMAX.
      
      Therefore: increase MSGMNI to the maximum supported.
      
      And: If we ignore the risk of locking too much memory, then an automatic
      scaling of MSGMNI doesn't make sense.  Therefore the logic can be removed.
      
      The code preserves auto_msgmni to avoid breaking any user space applications
      that expect that the value exists.
      
      Notes:
      1) If an administrator must limit the memory allocations, then he can set
      MSGMNI as necessary.
      
      Or he can disable sysv entirely (as e.g. done by Android).
      
      2) MSGMAX and MSGMNB are intentionally not increased, as these values are used
      to control latency vs. throughput:
      If MSGMNB is large, then msgsnd() just returns and more messages can be queued
      before a task switch to a task that calls msgrcv() is forced.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0050ee05
  5. 11 12月, 2014 1 次提交
    • P
      kernel: add panic_on_warn · 9e3961a0
      Prarit Bhargava 提交于
      There have been several times where I have had to rebuild a kernel to
      cause a panic when hitting a WARN() in the code in order to get a crash
      dump from a system.  Sometimes this is easy to do, other times (such as
      in the case of a remote admin) it is not trivial to send new images to
      the user.
      
      A much easier method would be a switch to change the WARN() over to a
      panic.  This makes debugging easier in that I can now test the actual
      image the WARN() was seen on and I do not have to engage in remote
      debugging.
      
      This patch adds a panic_on_warn kernel parameter and
      /proc/sys/kernel/panic_on_warn calls panic() in the
      warn_slowpath_common() path.  The function will still print out the
      location of the warning.
      
      An example of the panic_on_warn output:
      
      The first line below is from the WARN_ON() to output the WARN_ON()'s
      location.  After that the panic() output is displayed.
      
          WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]()
          Kernel panic - not syncing: panic_on_warn set ...
      
          CPU: 30 PID: 11698 Comm: insmod Tainted: G        W  OE  3.17.0+ #57
          Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
           0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190
           0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec
           ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df
          Call Trace:
           [<ffffffff81665190>] dump_stack+0x46/0x58
           [<ffffffff8165e2ec>] panic+0xd0/0x204
           [<ffffffffa038e05f>] ? init_dummy+0x1f/0x30 [dummy_module]
           [<ffffffff81076b90>] warn_slowpath_common+0xd0/0xd0
           [<ffffffffa038e040>] ? dummy_greetings+0x40/0x40 [dummy_module]
           [<ffffffff81076c8a>] warn_slowpath_null+0x1a/0x20
           [<ffffffffa038e05f>] init_dummy+0x1f/0x30 [dummy_module]
           [<ffffffff81002144>] do_one_initcall+0xd4/0x210
           [<ffffffff811b52c2>] ? __vunmap+0xc2/0x110
           [<ffffffff810f8889>] load_module+0x16a9/0x1b30
           [<ffffffff810f3d30>] ? store_uevent+0x70/0x70
           [<ffffffff810f49b9>] ? copy_module_from_fd.isra.44+0x129/0x180
           [<ffffffff810f8ec6>] SyS_finit_module+0xa6/0xd0
           [<ffffffff8166cf29>] system_call_fastpath+0x12/0x17
      
      Successfully tested by me.
      
      hpa said: There is another very valid use for this: many operators would
      rather a machine shuts down than being potentially compromised either
      functionally or security-wise.
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Acked-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9e3961a0
  6. 14 10月, 2014 1 次提交
    • O
      coredump: add %i/%I in core_pattern to report the tid of the crashed thread · b03023ec
      Oleg Nesterov 提交于
      format_corename() can only pass the leader's pid to the core handler,
      but there is no simple way to figure out which thread originated the
      coredump.
      
      As Jan explains, this also means that there is no simple way to create
      the backtrace of the crashed process:
      
      As programs are mostly compiled with implicit gcc -fomit-frame-pointer
      one needs program's .eh_frame section (equivalently PT_GNU_EH_FRAME
      segment) or .debug_frame section.  .debug_frame usually is present only
      in separate debug info files usually not even installed on the system.
      While .eh_frame is a part of the executable/library (and it is even
      always mapped for C++ exceptions unwinding) it no longer has to be
      present anywhere on the disk as the program could be upgraded in the
      meantime and the running instance has its executable file already
      unlinked from disk.
      
      One possibility is to echo 0x3f >/proc/*/coredump_filter and dump all
      the file-backed memory including the executable's .eh_frame section.
      But that can create huge core files, for example even due to mmapped
      data files.
      
      Other possibility would be to read .eh_frame from /proc/PID/mem at the
      core_pattern handler time of the core dump.  For the backtrace one needs
      to read the register state first which can be done from core_pattern
      handler:
      
          ptrace(PTRACE_SEIZE, tid, 0, PTRACE_O_TRACEEXIT)
          close(0);    // close pipe fd to resume the sleeping dumper
          waitpid();   // should report EXIT
          PTRACE_GETREGS or other requests
      
      The remaining problem is how to get the 'tid' value of the crashed
      thread.  It could be read from the first NT_PRSTATUS note of the core
      file but that makes the core_pattern handler complicated.
      
      Unfortunately %t is already used so this patch uses %i/%I.
      
      Automatic Bug Reporting Tool (https://github.com/abrt/abrt/wiki/overview)
      is experimenting with this.  It is using the elfutils
      (https://fedorahosted.org/elfutils/) unwinder for generating the
      backtraces.  Apart from not needing matching executables as mentioned
      above, another advantage is that we can get the backtrace without saving
      the core (which might be quite large) to disk.
      
      [mmilata@redhat.com: final paragraph of changelog]
      Signed-off-by: NJan Kratochvil <jan.kratochvil@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
      Cc: Mark Wielaard <mjw@redhat.com>
      Cc: Martin Milata <mmilata@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b03023ec
  7. 09 8月, 2014 1 次提交
  8. 24 6月, 2014 1 次提交
    • A
      kernel/watchdog.c: print traces for all cpus on lockup detection · ed235875
      Aaron Tomlin 提交于
      A 'softlockup' is defined as a bug that causes the kernel to loop in
      kernel mode for more than a predefined period to time, without giving
      other tasks a chance to run.
      
      Currently, upon detection of this condition by the per-cpu watchdog
      task, debug information (including a stack trace) is sent to the system
      log.
      
      On some occasions, we have observed that the "victim" rather than the
      actual "culprit" (i.e.  the owner/holder of the contended resource) is
      reported to the user.  Often this information has proven to be
      insufficient to assist debugging efforts.
      
      To avoid loss of useful debug information, for architectures which
      support NMI, this patch makes it possible to improve soft lockup
      reporting.  This is accomplished by issuing an NMI to each cpu to obtain
      a stack trace.
      
      If NMI is not supported we just revert back to the old method.  A sysctl
      and boot-time parameter is available to toggle this feature.
      
      [dzickus@redhat.com: add CONFIG_SMP in certain areas]
      [akpm@linux-foundation.org: additional CONFIG_SMP=n optimisations]
      [mq@suse.cz: fix warning]
      Signed-off-by: NAaron Tomlin <atomlin@redhat.com>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NJan Moskyto Matejka <mq@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed235875
  9. 07 6月, 2014 1 次提交
    • K
      sysctl: allow for strict write position handling · f4aacea2
      Kees Cook 提交于
      When writing to a sysctl string, each write, regardless of VFS position,
      begins writing the string from the start.  This means the contents of
      the last write to the sysctl controls the string contents instead of the
      first:
      
        open("/proc/sys/kernel/modprobe", O_WRONLY)   = 1
        write(1, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 4096) = 4096
        write(1, "/bin/true", 9)                = 9
        close(1)                                = 0
      
        $ cat /proc/sys/kernel/modprobe
        /bin/true
      
      Expected behaviour would be to have the sysctl be "AAAA..." capped at
      maxlen (in this case KMOD_PATH_LEN: 256), instead of truncating to the
      contents of the second write.  Similarly, multiple short writes would
      not append to the sysctl.
      
      The old behavior is unlike regular POSIX files enough that doing audits
      of software that interact with sysctls can end up in unexpected or
      dangerous situations.  For example, "as long as the input starts with a
      trusted path" turns out to be an insufficient filter, as what must also
      happen is for the input to be entirely contained in a single write
      syscall -- not a common consideration, especially for high level tools.
      
      This provides kernel.sysctl_writes_strict as a way to make this behavior
      act in a less surprising manner for strings, and disallows non-zero file
      position when writing numeric sysctls (similar to what is already done
      when reading from non-zero file positions).  For now, the default (0) is
      to warn about non-zero file position use, but retain the legacy
      behavior.  Setting this to -1 disables the warning, and setting this to
      1 enables the file position respecting behavior.
      
      [akpm@linux-foundation.org: fix build]
      [akpm@linux-foundation.org: move misplaced hunk, per Randy]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f4aacea2
  10. 08 4月, 2014 1 次提交
  11. 13 3月, 2014 1 次提交
    • M
      Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE · 66cc69e3
      Mathieu Desnoyers 提交于
      Users have reported being unable to trace non-signed modules loaded
      within a kernel supporting module signature.
      
      This is caused by tracepoint.c:tracepoint_module_coming() refusing to
      take into account tracepoints sitting within force-loaded modules
      (TAINT_FORCED_MODULE). The reason for this check, in the first place, is
      that a force-loaded module may have a struct module incompatible with
      the layout expected by the kernel, and can thus cause a kernel crash
      upon forced load of that module on a kernel with CONFIG_TRACEPOINTS=y.
      
      Tracepoints, however, specifically accept TAINT_OOT_MODULE and
      TAINT_CRAP, since those modules do not lead to the "very likely system
      crash" issue cited above for force-loaded modules.
      
      With kernels having CONFIG_MODULE_SIG=y (signed modules), a non-signed
      module is tainted re-using the TAINT_FORCED_MODULE taint flag.
      Unfortunately, this means that Tracepoints treat that module as a
      force-loaded module, and thus silently refuse to consider any tracepoint
      within this module.
      
      Since an unsigned module does not fit within the "very likely system
      crash" category of tainting, add a new TAINT_UNSIGNED_MODULE taint flag
      to specifically address this taint behavior, and accept those modules
      within Tracepoints. We use the letter 'X' as a taint flag character for
      a module being loaded that doesn't know how to sign its name (proposed
      by Steven Rostedt).
      
      Also add the missing 'O' entry to trace event show_module_flags() list
      for the sake of completeness.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      NAKed-by: NIngo Molnar <mingo@redhat.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: David Howells <dhowells@redhat.com>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      66cc69e3
  12. 31 1月, 2014 1 次提交
  13. 28 1月, 2014 1 次提交
  14. 25 1月, 2014 1 次提交
  15. 24 1月, 2014 1 次提交
    • K
      kexec: add sysctl to disable kexec_load · 7984754b
      Kees Cook 提交于
      For general-purpose (i.e.  distro) kernel builds it makes sense to build
      with CONFIG_KEXEC to allow end users to choose what kind of things they
      want to do with kexec.  However, in the face of trying to lock down a
      system with such a kernel, there needs to be a way to disable kexec_load
      (much like module loading can be disabled).  Without this, it is too easy
      for the root user to modify kernel memory even when CONFIG_STRICT_DEVMEM
      and modules_disabled are set.  With this change, it is still possible to
      load an image for use later, then disable kexec_load so the image (or lack
      of image) can't be altered.
      
      The intention is for using this in environments where "perfect"
      enforcement is hard.  Without a verified boot, along with verified
      modules, and along with verified kexec, this is trying to give a system a
      better chance to defend itself (or at least grow the window of
      discoverability) against attack in the face of a privilege escalation.
      
      In my mind, I consider several boot scenarios:
      
      1) Verified boot of read-only verified root fs loading fd-based
         verification of kexec images.
      2) Secure boot of writable root fs loading signed kexec images.
      3) Regular boot loading kexec (e.g. kcrash) image early and locking it.
      4) Regular boot with no control of kexec image at all.
      
      1 and 2 don't exist yet, but will soon once the verified kexec series has
      landed.  4 is the state of things now.  The gap between 2 and 4 is too
      large, so this change creates scenario 3, a middle-ground above 4 when 2
      and 1 are not possible for a system.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7984754b
  16. 17 12月, 2013 1 次提交
  17. 13 11月, 2013 1 次提交
    • R
      vsprintf: check real user/group id for %pK · 312b4e22
      Ryan Mallon 提交于
      Some setuid binaries will allow reading of files which have read
      permission by the real user id.  This is problematic with files which
      use %pK because the file access permission is checked at open() time,
      but the kptr_restrict setting is checked at read() time.  If a setuid
      binary opens a %pK file as an unprivileged user, and then elevates
      permissions before reading the file, then kernel pointer values may be
      leaked.
      
      This happens for example with the setuid pppd application on Ubuntu 12.04:
      
        $ head -1 /proc/kallsyms
        00000000 T startup_32
      
        $ pppd file /proc/kallsyms
        pppd: In file /proc/kallsyms: unrecognized option 'c1000000'
      
      This will only leak the pointer value from the first line, but other
      setuid binaries may leak more information.
      
      Fix this by adding a check that in addition to the current process having
      CAP_SYSLOG, that effective user and group ids are equal to the real ids.
      If a setuid binary reads the contents of a file which uses %pK then the
      pointer values will be printed as NULL if the real user is unprivileged.
      
      Update the sysctl documentation to reflect the changes, and also correct
      the documentation to state the kptr_restrict=0 is the default.
      
      This is a only temporary solution to the issue.  The correct solution is
      to do the permission check at open() time on files, and to replace %pK
      with a function which checks the open() time permission.  %pK uses in
      printk should be removed since no sane permission check can be done, and
      instead protected by using dmesg_restrict.
      Signed-off-by: NRyan Mallon <rmallon@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Joe Perches <joe@perches.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      312b4e22
  18. 09 10月, 2013 5 次提交
  19. 12 9月, 2013 1 次提交
  20. 23 6月, 2013 1 次提交
    • D
      perf: Drop sample rate when sampling is too slow · 14c63f17
      Dave Hansen 提交于
      This patch keeps track of how long perf's NMI handler is taking,
      and also calculates how many samples perf can take a second.  If
      the sample length times the expected max number of samples
      exceeds a configurable threshold, it drops the sample rate.
      
      This way, we don't have a runaway sampling process eating up the
      CPU.
      
      This patch can tend to drop the sample rate down to level where
      perf doesn't work very well.  *BUT* the alternative is that my
      system hangs because it spends all of its time handling NMIs.
      
      I'll take a busted performance tool over an entire system that's
      busted and undebuggable any day.
      
      BTW, my suspicion is that there's still an underlying bug here.
      Using the HPET instead of the TSC is definitely a contributing
      factor, but I suspect there are some other things going on.
      But, I can't go dig down on a bug like that with my machine
      hanging all the time.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Cc: Dave Hansen <dave@sr71.net>
      [ Prettified it a bit. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      14c63f17
  21. 28 5月, 2013 2 次提交
  22. 05 1月, 2013 2 次提交
  23. 06 10月, 2012 1 次提交
  24. 07 2月, 2012 1 次提交
  25. 13 1月, 2012 1 次提交
  26. 05 12月, 2011 1 次提交
  27. 01 11月, 2011 1 次提交
    • D
      kernel/sysctl.c: add cap_last_cap to /proc/sys/kernel · 73efc039
      Dan Ballard 提交于
      Userspace needs to know the highest valid capability of the running
      kernel, which right now cannot reliably be retrieved from the header files
      only.  The fact that this value cannot be determined properly right now
      creates various problems for libraries compiled on newer header files
      which are run on older kernels.  They assume capabilities are available
      which actually aren't.  libcap-ng is one example.  And we ran into the
      same problem with systemd too.
      
      Now the capability is exported in /proc/sys/kernel/cap_last_cap.
      
      [akpm@linux-foundation.org: make cap_last_cap const, per Ulrich]
      Signed-off-by: NDan Ballard <dan@mindstab.net>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Lennart Poettering <lennart@poettering.net>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Ulrich Drepper <drepper@akkadia.org>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      73efc039
  28. 27 7月, 2011 1 次提交
    • V
      ipc: introduce shm_rmid_forced sysctl · b34a6b1d
      Vasiliy Kulikov 提交于
      Add support for the shm_rmid_forced sysctl.  If set to 1, all shared
      memory objects in current ipc namespace will be automatically forced to
      use IPC_RMID.
      
      The POSIX way of handling shmem allows one to create shm objects and
      call shmdt(), leaving shm object associated with no process, thus
      consuming memory not counted via rlimits.
      
      With shm_rmid_forced=1 the shared memory object is counted at least for
      one process, so OOM killer may effectively kill the fat process holding
      the shared memory.
      
      It obviously breaks POSIX - some programs relying on the feature would
      stop working.  So set shm_rmid_forced=1 only if you're sure nobody uses
      "orphaned" memory.  Use shm_rmid_forced=0 by default for compatability
      reasons.
      
      The feature was previously impemented in -ow as a configure option.
      
      [akpm@linux-foundation.org: fix documentation, per Randy]
      [akpm@linux-foundation.org: fix warning]
      [akpm@linux-foundation.org: readability/conventionality tweaks]
      [akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout]
      Signed-off-by: NVasiliy Kulikov <segoon@openwall.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Serge E. Hallyn" <serge.hallyn@canonical.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Solar Designer <solar@openwall.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b34a6b1d
  29. 24 7月, 2011 1 次提交
  30. 27 5月, 2011 1 次提交
    • J
      coredump: add support for exe_file in core name · 57cc083a
      Jiri Slaby 提交于
      Now, exe_file is not proc FS dependent, so we can use it to name core
      file.  So we add %E pattern for core file name cration which extract path
      from mm_struct->exe_file.  Then it converts slashes to exclamation marks
      and pastes the result to the core file name itself.
      
      This is useful for environments where binary names are longer than 16
      character (the current->comm limitation).  Also where there are binaries
      with same name but in a different path.  Further in case the binery itself
      changes its current->comm after exec.
      
      So by doing (s/$/#/ -- # is treated as git comment):
      
        $ sysctl kernel.core_pattern='core.%p.%e.%E'
        $ ln /bin/cat cat45678901234567890
        $ ./cat45678901234567890
        ^Z
        $ rm cat45678901234567890
        $ fg
        ^\Quit (core dumped)
        $ ls core*
      
      we now get:
      
        core.2434.cat456789012345.!root!cat45678901234567890 (deleted)
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Reviewed-by: NAndi Kleen <andi@firstfloor.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      57cc083a
  31. 11 2月, 2011 1 次提交
  32. 14 1月, 2011 1 次提交
    • D
      kptr_restrict for hiding kernel pointers from unprivileged users · 455cd5ab
      Dan Rosenberg 提交于
      Add the %pK printk format specifier and the /proc/sys/kernel/kptr_restrict
      sysctl.
      
      The %pK format specifier is designed to hide exposed kernel pointers,
      specifically via /proc interfaces.  Exposing these pointers provides an
      easy target for kernel write vulnerabilities, since they reveal the
      locations of writable structures containing easily triggerable function
      pointers.  The behavior of %pK depends on the kptr_restrict sysctl.
      
      If kptr_restrict is set to 0, no deviation from the standard %p behavior
      occurs.  If kptr_restrict is set to 1, the default, if the current user
      (intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
      (currently in the LSM tree), kernel pointers using %pK are printed as 0's.
       If kptr_restrict is set to 2, kernel pointers using %pK are printed as
      0's regardless of privileges.  Replacing with 0's was chosen over the
      default "(null)", which cannot be parsed by userland %p, which expects
      "(nil)".
      
      [akpm@linux-foundation.org: check for IRQ context when !kptr_restrict, save an indent level, s/WARN/WARN_ONCE/]
      [akpm@linux-foundation.org: coding-style fixup]
      [randy.dunlap@oracle.com: fix kernel/sysctl.c warning]
      Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com>
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Thomas Graf <tgraf@infradead.org>
      Cc: Eugene Teo <eugeneteo@kernel.org>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      455cd5ab
  33. 09 12月, 2010 1 次提交
  34. 12 11月, 2010 1 次提交