1. 09 Sep, 2015 1 commit
  2. 05 Sep, 2015 2 commits
    • A
      userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP · 16ba6f81
      Andrea Arcangeli committed
      These two flags get set in vma->vm_flags to tell the VM common code
      whether the userfaultfd is armed and in which mode (only tracking missing
      faults, only tracking wrprotect faults or both). If neither flag is
      set, the userfaultfd is not armed on the vma.
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
      Cc: zhang.zhanghailiang@huawei.com
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      capabilities: ambient capabilities · 58319057
      Andy Lutomirski committed
      Credit where credit is due: this idea comes from Christoph Lameter with
      a lot of valuable input from Serge Hallyn.  This patch is heavily based
      on Christoph's patch.
      
      ===== The status quo =====
      
      On Linux, there are a number of capabilities defined by the kernel.  To
      perform various privileged tasks, processes can wield capabilities that
      they hold.
      
      Each task has four capability masks: effective (pE), permitted (pP),
      inheritable (pI), and a bounding set (X).  When the kernel checks for a
      capability, it checks pE.  The other capability masks serve to modify
      what capabilities can be in pE.
      
      Any task can remove capabilities from pE, pP, or pI at any time.  If a
      task has a capability in pP, it can add that capability to pE and/or pI.
      If a task has CAP_SETPCAP, then it can add any capability to pI, and it
      can remove capabilities from X.
      
      Tasks are not the only things that can have capabilities; files can also
      have capabilities.  A file can have no capability information at all [1].
      If a file has capability information, then it has a permitted mask (fP)
      and an inheritable mask (fI) as well as a single effective bit (fE) [2].
      File capabilities modify the capabilities of tasks that execve(2) them.
      
      A task that successfully calls execve has its capabilities modified for
      the file ultimately being executed (i.e. the binary itself if that
      binary is ELF, or the interpreter if the binary is a script) [3].  In
      the capability evolution rules, for each mask Z, pZ represents the old
      value and pZ' represents the new value.  The rules are:
      
        pP' = (X & fP) | (pI & fI)
        pI' = pI
        pE' = (fE ? pP' : 0)
        X is unchanged
      
      For setuid binaries, fP, fI, and fE are modified by a moderately
      complicated set of rules that emulate POSIX behavior.  Similarly, if
      euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
      (primarily, fP and fI usually end up being the full set).  For nonroot
      users executing binaries with neither setuid nor file caps, fI and fP
      are empty and fE is false.
      
      As an extra complication, if you execute a process as nonroot and fE is
      set, then the "secure exec" rules are in effect: AT_SECURE gets set,
      LD_PRELOAD doesn't work, etc.
      
      This is rather messy.  We've learned that making any changes is
      dangerous, though: if a new kernel version allows an unprivileged
      program to change its security state in a way that persists across
      execution of a setuid program or a program with file caps, this
      persistent state is surprisingly likely to allow setuid or file-capped
      programs to be exploited for privilege escalation.
      
      ===== The problem =====
      
      Capability inheritance is basically useless.
      
      If you aren't root and you execute an ordinary binary, fI is zero, so
      your capabilities have no effect whatsoever on pP'.  This means that you
      can't usefully execute a helper process or a shell command with elevated
      capabilities if you aren't root.
      
      On current kernels, you can sort of work around this by setting fI to
      the full set for most or all non-setuid executable files.  This causes
      pP' = pI for nonroot, and inheritance works.  No one does this because
      it's a PITA and it isn't even supported on most filesystems.
      
      If you try this, you'll discover that every nonroot program ends up with
      secure exec rules, breaking many things.
      
      This is a problem that has bitten many people who have tried to use
      capabilities for anything useful.
      
      ===== The proposed change =====
      
      This patch adds a fifth capability mask called the ambient mask (pA).
      pA does what most people expect pI to do.
      
      pA obeys the invariant that no bit can ever be set in pA if it is not
      set in both pP and pI.  Dropping a bit from pP or pI drops that bit from
      pA.  This ensures that existing programs that try to drop capabilities
      still do so, with a complication.  Because capability inheritance is so
      broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
      then calling execve effectively drops capabilities.  Therefore,
      setresuid from root to nonroot conditionally clears pA unless
      SECBIT_NO_SETUID_FIXUP is set.  Processes that don't like this can
      re-add bits to pA afterwards.
      
      The capability evolution rules are changed:
      
        pA' = (file caps or setuid or setgid ? 0 : pA)
        pP' = (X & fP) | (pI & fI) | pA'
        pI' = pI
        pE' = (fE ? pP' : pA')
        X is unchanged
      
      If you are nonroot but you have a capability, you can add it to pA.  If
      you do so, your children get that capability in pA, pP, and pE.  For
      example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
      automatically bind low-numbered ports.  Hallelujah!
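
The evolution rules above can be modelled in a few lines of C. This is a minimal userspace sketch, not the kernel implementation: `struct task_caps`, `struct file_caps`, the `privileged` flag, `CAP_BIT()` and `caps_after_execve()` are hypothetical names introduced here, and the special handling of root and of setuid emulation is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bit for capability number n (e.g. 10 is
 * CAP_NET_BIND_SERVICE in <linux/capability.h>). */
#define CAP_BIT(n) (UINT64_C(1) << (n))

/* Hypothetical userspace model of a task's masks; not the kernel's cred. */
struct task_caps {
	uint64_t permitted;   /* pP */
	uint64_t inheritable; /* pI */
	uint64_t effective;   /* pE */
	uint64_t ambient;     /* pA */
	uint64_t bounding;    /* X  */
};

/* Hypothetical model of the file being executed. */
struct file_caps {
	uint64_t permitted;   /* fP */
	uint64_t inheritable; /* fI */
	bool effective;       /* fE: a single bit, not a mask */
	bool privileged;      /* file caps, setuid or setgid present */
};

/* Apply the post-patch execve rules and return the new task masks. */
static struct task_caps caps_after_execve(struct task_caps old,
					  struct file_caps f)
{
	struct task_caps new = old;

	/* Invariant from the text: pA is a subset of pP & pI. */
	assert((old.ambient & ~(old.permitted & old.inheritable)) == 0);

	new.ambient = f.privileged ? 0 : old.ambient;
	new.permitted = (old.bounding & f.permitted) |
			(old.inheritable & f.inheritable) | new.ambient;
	new.inheritable = old.inheritable;
	new.effective = f.effective ? new.permitted : new.ambient;
	/* X is unchanged. */
	return new;
}
```

With pA = pP = pI = `CAP_BIT(10)` and an ordinary file (all file fields zero), the result keeps bit 10 in pA, pP and pE; marking the file as `privileged` clears pA and, with empty fP and fI, leaves pP and pE empty.
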
      
      Unprivileged users can create user namespaces, map themselves to a
      nonzero uid, and create both privileged (relative to their namespace)
      and unprivileged process trees.  This is currently more or less
      impossible.  Hallelujah!
      
      You cannot use pA to try to subvert a setuid, setgid, or file-capped
      program: if you execute any such program, pA gets cleared and the
      resulting evolution rules are unchanged by this patch.
      
      Users with nonzero pA are unlikely to unintentionally leak that
      capability.  If they run programs that try to drop privileges, dropping
      privileges will still work.
      
      It's worth noting that the degree of paranoia in this patch could
      possibly be reduced without causing serious problems.  Specifically, if
      we allowed pA to persist across executing non-pA-aware setuid binaries
      and across setresuid, then, naively, the only capabilities that could
      leak as a result would be the capabilities in pA, and any attacker
      *already* has those capabilities.  This would make me nervous, though --
      setuid binaries that tried to privilege-separate might fail to do so,
      and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
      unexpected side effects.  (Whether these unexpected side effects would
      be exploitable is an open question.) I've therefore taken the more
      paranoid route.  We can revisit this later.
      
      An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
      ambient capabilities.  I think that this would be annoying and would
      make granting otherwise unprivileged users minor ambient capabilities
      (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
      it is with this patch.
      
      ===== Footnotes =====
      
      [1] Files that are missing the "security.capability" xattr or that have
      unrecognized values for that xattr end up with has_cap set to false.
      The code that does that appears to be complicated for no good reason.
      
      [2] The libcap capability mask parsers and formatters are dangerously
      misleading and the documentation is flat-out wrong.  fE is *not* a mask;
      it's a single bit.  This has probably confused every single person who
      has tried to use file capabilities.
      
      [3] Linux very confusingly processes both the script and the interpreter
      if applicable, for reasons that elude me.  The results from thinking
      about a script's file capabilities and/or setuid bits are mostly
      discarded.
      
      Preliminary userspace code is here, but it needs updating:
      https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2
      
      Here is a test program that can be used to verify the functionality
      (from Christoph):
      
      /*
       * Test program for the ambient capabilities. This program spawns a shell
       * that allows running processes with a defined set of capabilities.
       *
       * (C) 2015 Christoph Lameter <cl@linux.com>
       * Released under: GPL v3 or later.
       *
       *
       * Compile using:
       *
       *	gcc -o ambient_test ambient_test.o -lcap-ng
       *
       * This program must have the following capabilities to run properly:
       * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
       *
       * A command to equip the binary with the right caps is:
       *
       *	setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
       *
       *
       * To get a shell with additional caps that can be inherited by other processes:
       *
       *	./ambient_test /bin/bash
       *
       *
       * Verifying that it works:
       *
       * From the bash spawned by ambient_test run
       *
       *	cat /proc/$$/status
       *
       * and have a look at the capabilities.
       */
      
      #include <stdlib.h>
      #include <stdio.h>
      #include <errno.h>
      #include <unistd.h>	/* execv() */
      #include <cap-ng.h>
      #include <sys/prctl.h>
      #include <linux/capability.h>
      
      /*
       * Definitions from the kernel header files. These are going to be removed
       * when the /usr/include files have these defined.
       */
      #define PR_CAP_AMBIENT 47
      #define PR_CAP_AMBIENT_IS_SET 1
      #define PR_CAP_AMBIENT_RAISE 2
      #define PR_CAP_AMBIENT_LOWER 3
      #define PR_CAP_AMBIENT_CLEAR_ALL 4
      
      static void set_ambient_cap(int cap)
      {
      	int rc;
      
      	capng_get_caps_process();
      	rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
      	if (rc) {
      		printf("Cannot add inheritable cap\n");
      		exit(2);
      	}
      	capng_apply(CAPNG_SELECT_CAPS);
      
      	/* Note the two 0s at the end. Kernel checks for these */
      	if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
      		perror("Cannot set cap");
      		exit(1);
      	}
      }
      
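      int main(int argc, char **argv)
      {
      	/* A program to execute is required, e.g. /bin/bash */
      	if (argc < 2) {
      		fprintf(stderr, "Usage: %s <program> [args...]\n", argv[0]);
      		return 2;
      	}
      
      	set_ambient_cap(CAP_NET_RAW);
      	set_ambient_cap(CAP_NET_ADMIN);
      	set_ambient_cap(CAP_SYS_NICE);
      
      	printf("Ambient_test forking shell\n");
      	/* execv() only returns on failure */
      	execv(argv[1], argv + 1);
      	perror("Cannot exec");
      
      	return 1;
      }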
      
      Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Aaron Jones <aaronmdjones@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew G. Morgan <morgan@kernel.org>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
      Cc: Markku Savela <msa@moth.iki.fi>
      Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 18 Jul, 2015 4 commits
  4. 10 Jul, 2015 1 commit
    • E
      vfs: Commit to never having executables on proc and sysfs. · 90f8572b
      Eric W. Biederman committed
      Today proc and sysfs do not contain any executable files.  Several
      applications today mount proc or sysfs without noexec and nosuid and
      then depend on there being no executable files on proc or sysfs.
      Having any executable files show up on proc or sysfs would cause
      a user space visible regression, and most likely security problems.
      
      Therefore commit to never allowing executables on proc and sysfs by
      adding a new flag to mark them as filesystems without executables and
      enforce that flag.
      
      Test the flag where MNT_NOEXEC is tested today, so that the only user
      visible effect will be that executables will be treated as if the
      execute bit is cleared.
      
      The filesystems proc and sysfs do not currently incorporate any
      executable files so this does not result in any user visible effects.
      
      This makes it unnecessary to vet changes to proc and sysfs tightly for
      adding executable files or changes to chattr that would modify
      existing files, as no matter what the individual files say they will
      not be treated as executable files by the vfs.
      
      Not having to vet changes too closely is important, as without this we
      are only one proc_create call (or another goof up in the
      implementation of notify_change) away from having problematic executables
      on proc.  Those mistakes are all too easy to make and would create
      a situation where there are security issues or where the assumptions of
      some program have to be broken (causing userspace regressions).
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  5. 04 Jul, 2015 1 commit
  6. 01 Jul, 2015 2 commits
  7. 26 Jun, 2015 2 commits
    • I
      fs, proc: introduce CONFIG_PROC_CHILDREN · 2e13ba54
      Iago López Galeiras committed
      Commit 81841161 ("fs, proc: introduce /proc/<pid>/task/<tid>/children
      entry") introduced the children entry for checkpoint restore and the
      file is only available on kernels configured with CONFIG_EXPERT and
      CONFIG_CHECKPOINT_RESTORE.
      
      This is available in most distributions (Fedora, Debian, Ubuntu, CoreOS)
      because they usually enable CONFIG_EXPERT and CONFIG_CHECKPOINT_RESTORE.
      But Arch does not enable CONFIG_EXPERT or CONFIG_CHECKPOINT_RESTORE.
      
      However, the children proc file is useful outside of checkpoint restore.
      I would like to use it in rkt.  The rkt process exec()s another program
      it does not control, and that other program will fork()+exec() a child
      process.  I would like to find the pid of the child process from an
      external tool without iterating in /proc over all processes to find
      which one has a parent pid equal to rkt.
      
      This commit introduces CONFIG_PROC_CHILDREN and makes
      CONFIG_CHECKPOINT_RESTORE select it.  This allows enabling
      /proc/<pid>/task/<tid>/children without needing to enable
      CONFIG_CHECKPOINT_RESTORE and CONFIG_EXPERT.
      
      Alban tested that /proc/<pid>/task/<tid>/children is present when the
      kernel is configured with CONFIG_PROC_CHILDREN=y but without
      CONFIG_CHECKPOINT_RESTORE.
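
A minimal sketch of how an external tool might consume this file, assuming it holds a whitespace-separated list of child PIDs. `parse_children()` and `read_children()` are hypothetical helper names introduced here, and the fixed 4096-byte buffer silently truncates very long lists.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Parse whitespace-separated PIDs from text into out[].
 * Returns the number of PIDs stored (at most max). */
static size_t parse_children(const char *text, pid_t *out, size_t max)
{
	size_t n = 0;
	char *end;

	assert(text && out);
	while (n < max) {
		long v = strtol(text, &end, 10);

		if (end == text)	/* no further number found */
			break;
		out[n++] = (pid_t)v;
		text = end;
	}
	return n;
}

/* Read /proc/<pid>/task/<tid>/children.
 * Returns the number of children stored, or -1 if the file cannot be
 * opened (for example on a kernel without CONFIG_PROC_CHILDREN). */
static long read_children(pid_t pid, pid_t tid, pid_t *out, size_t max)
{
	char path[64];
	char buf[4096];
	size_t len;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%ld/task/%ld/children",
		 (long)pid, (long)tid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[len] = '\0';
	return (long)parse_children(buf, out, max);
}
```

A caller would pass the pid and tid of the supervising process and then inspect the returned array, instead of walking every entry in /proc to compare parent pids.
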
      Signed-off-by: Iago López Galeiras <iago@endocode.com>
      Tested-by: Alban Crequy <alban@endocode.com>
      Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Djalal Harouni <djalal@endocode.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: fix PAGE_SIZE limit of /proc/$PID/cmdline · c2c0bb44
      Alexey Dobriyan committed
      /proc/$PID/cmdline truncates output at PAGE_SIZE. It is easy to see with
      
      	$ cat /proc/self/cmdline $(seq 1037) 2>/dev/null
      
      However, command line size was never limited to PAGE_SIZE but to 128 KB,
      and relatively recently that limitation was removed altogether.
      
      People noticed and asked questions:
      http://stackoverflow.com/questions/199130/how-do-i-increase-the-proc-pid-cmdline-4096-byte-limit
      
      The seq_file interface is not OK, because it kmallocs for the whole output, and
      open + read(, 1) + sleep will pin arbitrary amounts of kernel memory.  To
      avoid that, a limit must be imposed, which is incompatible with arbitrarily
      sized command lines.
      
      I apologize for the hairy code, but this is a direct consequence of the command line
      layout in memory and hacks to support things like "init [3]".
      
      The loops are "unrolled"; otherwise it is either macros which hide control
      flow or functions with 7-8 arguments with equal line count.
      
      There should be a real setproctitle(2) or something.
      
      [akpm@linux-foundation.org: fix a billion min() warnings]
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Tested-by: Jarod Wilson <jarod@redhat.com>
      Acked-by: Jarod Wilson <jarod@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 25 Jun, 2015 1 commit
    • C
      procfs: treat parked tasks as sleeping for task state · f51c0eae
      Chris Metcalf committed
      Allowing watchdog threads to be parked means that we now have the
      opportunity of actually seeing persistent parked threads in the output
      of /proc/<pid>/stat and /proc/<pid>/status.  The existing code reported
      such threads as "Running", which is kind-of true if you think of the
      case where we park them as part of taking cpus offline.  But if we allow
      parking them indefinitely, "Running" is pretty misleading, so we report
      them as "Sleeping" instead.
      
      We could simply report them with a new string, "Parked", but it feels
      like it's a bit risky for userspace to see unexpected new values; the
      output is already documented in Documentation/filesystems/proc.txt, and
      it seems like a mistake to change that lightly.
      
      The scheduler does report parked tasks with a "P" in debugging output
      from sched_show_task() or dump_cpu_task(), but that's a different API.
      Similarly, the trace_ctxwake_* routines report a "P" for parked tasks,
      but again, different API.
      
      This change seemed slightly cleaner than updating the task_state_array
      to have additional rows.  TASK_DEAD should be subsumed by the exit_state
      bits; TASK_WAKEKILL is just a modifier; and TASK_WAKING can very
      reasonably be reported as "Running" (as it is now).  Only TASK_PARKED
      shows up with unreasonable output here.
      Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 24 Jun, 2015 1 commit
  10. 14 May, 2015 1 commit
    • E
      mnt: Refactor the logic for mounting sysfs and proc in a user namespace · 1b852bce
      Eric W. Biederman committed
      Fresh mounts of proc and sysfs are a very special case that works very
      much like a bind mount.  Unfortunately the current structure can not
      preserve the MNT_LOCK... mount flags.  Therefore refactor the logic
      into a form that can be modified to preserve those lock bits.
      
      Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount
      of the filesystem be fully visible in the current mount namespace,
      before the filesystem may be mounted.
      
      Move the logic for calling fs_fully_visible from proc and sysfs into
      fs/namespace.c where it has greater access to mount namespace state.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  11. 11 May, 2015 3 commits
    • A
      switch ->put_link() from dentry to inode · 5f2c4179
      Al Viro committed
      only one instance looks at that argument at all; that sole
      exception wants inode rather than dentry.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • A
      don't pass nameidata to ->follow_link() · 6e77137b
      Al Viro committed
      its only use is getting passed to nd_jump_link(), which can obtain
      it from current->nameidata
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • A
      new ->follow_link() and ->put_link() calling conventions · 680baacb
      Al Viro committed
      a) instead of storing the symlink body (via nd_set_link()) and returning
      an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
      that opaque pointer (into void * passed by address by caller) and returns
      the symlink body.  Returning ERR_PTR() on error, NULL on jump (procfs magic
      symlinks) and pointer to symlink body for normal symlinks.  Stored pointer
      is ignored in all cases except the last one.
      
      Storing NULL for opaque pointer (or not storing it at all) means no call
      of ->put_link().
      
      b) the body used to be passed to ->put_link() implicitly (via nameidata).
      Now only the opaque pointer is.  In the cases when we used the symlink body
      to free stuff, ->follow_link() now should store it as opaque pointer in addition
      to returning it.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  12. 17 Apr, 2015 1 commit
    • A
      proc: show locks in /proc/pid/fdinfo/X · 6c8c9031
      Andrey Vagin committed
      Let's show locks which are associated with a file descriptor in
      its fdinfo file.
      
      Currently we don't have a reliable way to determine who holds a lock.  We
      can find some information in /proc/locks, but the PID reported there
      can be wrong.  For example, a process takes a lock, then forks a child and
      dies.  In this case /proc/locks contains the parent pid, which can be
      reused by another process.
      
      $ cat /proc/locks
      ...
      6: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF
      ...
      
      $ ps -C rpcbind
        PID TTY          TIME CMD
        332 ?        00:00:00 rpcbind
      
      $ cat /proc/332/fdinfo/4
      pos:	0
      flags:	0100000
      mnt_id:	22
      lock:	1: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF
      
      $ ls -l /proc/332/fd/4
      lr-x------ 1 root root 64 Mar  5 14:43 /proc/332/fd/4 -> /run/rpcbind.lock
      
      $ ls -l /proc/324/fd/
      total 0
      lrwx------ 1 root root 64 Feb 27 14:50 0 -> /dev/pts/0
      lrwx------ 1 root root 64 Feb 27 14:50 1 -> /dev/pts/0
      lrwx------ 1 root root 64 Feb 27 14:49 2 -> /dev/pts/0
      
      You can see that the process with pid 324 doesn't hold the lock.
      
      This information is required for proper dumping and restoring file
      locks.
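
The `lock:` lines shown above can be picked out of an fdinfo file with a small parser. This is a hedged sketch based only on the sample output in this message: `struct fdinfo_lock` and `parse_fdinfo_lock()` are hypothetical names, and the trailing device, inode and range fields are ignored.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for the leading fields of one "lock:" line. */
struct fdinfo_lock {
	char type[16];   /* e.g. FLOCK */
	char mode[16];   /* e.g. ADVISORY */
	char access[16]; /* e.g. WRITE */
	long pid;        /* PID column as printed, e.g. 324 */
};

/* Parse one line of /proc/<pid>/fdinfo/<fd>.
 * Returns 1 and fills *lk if the line is a well-formed "lock:" line,
 * 0 otherwise (for example for "pos:" or "flags:" lines). */
static int parse_fdinfo_lock(const char *line, struct fdinfo_lock *lk)
{
	int idx; /* ordinal before the colon; parsed but not stored */

	if (strncmp(line, "lock:", 5) != 0)
		return 0;
	return sscanf(line + 5, " %d: %15s %15s %15s %ld",
		      &idx, lk->type, lk->mode, lk->access, &lk->pid) == 5;
}
```

A checkpoint tool could call this on each line read from the fdinfo file of a descriptor it already owns. That ties the lock to the open file rather than to the possibly stale PID column.
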
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Acked-by: Jeff Layton <jlayton@poochiereds.net>
      Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 16 Apr, 2015 4 commits
  14. 18 Mar, 2015 1 commit
  15. 23 Feb, 2015 1 commit
  16. 18 Feb, 2015 1 commit
    • W
      vmcore: fix PT_NOTE n_namesz, n_descsz overflow issue · 34b47764
      WANG Chao committed
      When updating PT_NOTE header size (i.e. p_memsz), an overflow issue
      happens with the following bogus note entry:
      
        n_namesz = 0xFFFFFFFF
        n_descsz = 0x0
        n_type   = 0x0
      
      This kind of note entry should be dropped while updating p_memsz.  But
      because n_namesz is 32 bits wide, (n_namesz + 3) & (~3) overflows to
      0x0, so the note entry size looks sane and the entry is kept.
      
      When userspace (e.g. the crash utility) tries to access such a bogus note,
      it could lead to unexpected behavior (e.g. the crash utility hits a
      segmentation fault because it is reading a bogus address).
      
      The source of the bogus note hasn't been identified yet.  At least we could
      drop the bogus note so user space wouldn't be surprised.
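
The wraparound described above is easy to reproduce in userspace. The sketch below is illustrative only and is not the code from this patch: `roundup4_u32()`, `roundup4_u64()` and `note_fits()` are hypothetical helpers showing the 32-bit overflow and one way to avoid it by widening before the addition.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit round-up to a multiple of 4, as in the commit message.
 * The addition wraps for values close to UINT32_MAX. */
static uint32_t roundup4_u32(uint32_t n)
{
	return (n + 3) & ~UINT32_C(3);
}

/* Same round-up, but widened to 64 bits first so the sum cannot wrap. */
static uint64_t roundup4_u64(uint32_t n)
{
	return ((uint64_t)n + 3) & ~UINT64_C(3);
}

/* Hypothetical sanity check: does a note with the given name and
 * descriptor sizes fit in the bytes remaining in the segment? */
static bool note_fits(uint32_t n_namesz, uint32_t n_descsz,
		      uint64_t hdr_size, uint64_t remaining)
{
	uint64_t need = hdr_size + roundup4_u64(n_namesz) +
			roundup4_u64(n_descsz);

	return need <= remaining;
}
```

For the bogus entry in this message, `roundup4_u32(0xFFFFFFFF)` yields 0, which is why the note looked sane, while the widened version yields 0x100000000 and `note_fits()` rejects the entry.
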
      Signed-off-by: WANG Chao <chaowang@redhat.com>
      Cc: Dave Anderson <anderson@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Randy Wright <rwright@hp.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Rashika Kheria <rashika.kheria@gmail.com>
      Cc: Greg Pearson <greg.pearson@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 14 Feb, 2015 1 commit
  18. 13 Feb, 2015 4 commits
  19. 12 Feb, 2015 8 commits