1. 11 3月, 2017 1 次提交
    • A
      selinux: check for address length in selinux_socket_bind() · e2f586bd
      Alexander Potapenko 提交于
      KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
      uninitialized memory in selinux_socket_bind():
      
      ==================================================================
      BUG: KMSAN: use of unitialized memory
      inter: 0
      CPU: 3 PID: 1074 Comm: packet2 Tainted: G    B           4.8.0-rc6+ #1916
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       0000000000000000 ffff8800882ffb08 ffffffff825759c8 ffff8800882ffa48
       ffffffff818bf551 ffffffff85bab870 0000000000000092 ffffffff85bab550
       0000000000000000 0000000000000092 00000000bb0009bb 0000000000000002
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff825759c8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
       [<ffffffff818bdee6>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1008
       [<ffffffff818bf0fb>] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424
       [<ffffffff822dae71>] selinux_socket_bind+0xf41/0x1080 security/selinux/hooks.c:4288
       [<ffffffff8229357c>] security_socket_bind+0x1ec/0x240 security/security.c:1240
       [<ffffffff84265d98>] SYSC_bind+0x358/0x5f0 net/socket.c:1366
       [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
       [<ffffffff8518217c>] entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.o:?
      chained origin: 00000000ba6009bb
       [<ffffffff810bb7a7>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67
       [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
       [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:337
       [<ffffffff818bd2b8>] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:530
       [<ffffffff818bf033>] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380
       [<ffffffff84265b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
       [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
       [<ffffffff8518217c>] return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.o:?
      origin description: ----address@SYSC_bind (origin=00000000b8c00900)
      ==================================================================
      
      (the line numbers are relative to 4.8-rc6, but the bug persists upstream)
      
      , when I run the following program as root:
      
      =======================================================
        #include <string.h>
        #include <sys/socket.h>
        #include <netinet/in.h>
      
        int main(int argc, char *argv[]) {
          struct sockaddr addr;
          int size = 0;
          if (argc > 1) {
            size = atoi(argv[1]);
          }
          memset(&addr, 0, sizeof(addr));
          int fd = socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP);
          bind(fd, &addr, size);
          return 0;
        }
      =======================================================
      
      (for different values of |size| other error reports are printed).
      
      This happens because bind() unconditionally copies |size| bytes of
      |addr| to the kernel, leaving the rest uninitialized. Then
      security_socket_bind() reads the IP address bytes, including the
      uninitialized ones, to determine the port, or e.g. pass them further to
      sel_netnode_find(), which uses them to calculate a hash.
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      [PM: fixed some whitespace damage]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      e2f586bd
  2. 06 3月, 2017 3 次提交
    • J
      security: mark LSM hooks as __ro_after_init · ca97d939
      James Morris 提交于
      Mark all of the registration hooks as __ro_after_init (via the
      __lsm_ro_after_init macro).
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
      Acked-by: NKees Cook <keescook@chromium.org>
      ca97d939
    • S
      selinux: fix kernel BUG on prlimit(..., NULL, NULL) · 84e6885e
      Stephen Smalley 提交于
      commit 79bcf325e6b32b3c ("prlimit,security,selinux: add a security hook
      for prlimit") introduced a security hook for prlimit() and implemented it
      for SELinux.  However, if prlimit() is called with NULL arguments for both
      the new limit and the old limit, then the hook is called with 0 for the
      read/write flags, since the prlimit() will neither read nor write the
      process' limits.  This would in turn lead to calling avc_has_perm() with 0
      for the requested permissions, which triggers a BUG_ON() in
      avc_has_perm_noaudit() since the kernel should never be invoking
      avc_has_perm() with no permissions.  Fix this in the SELinux hook by
      returning immediately if the flags are 0.  Arguably prlimit64() itself
      ought to return immediately if both old_rlim and new_rlim are NULL since
      it is effectively a no-op in that case.
      
      Reported by the lkp-robot based on trinity testing.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Acked-by: NPaul Moore <paul@paul-moore.com>
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      84e6885e
    • S
      prlimit,security,selinux: add a security hook for prlimit · 791ec491
      Stephen Smalley 提交于
      When SELinux was first added to the kernel, a process could only get
      and set its own resource limits via getrlimit(2) and setrlimit(2), so no
      MAC checks were required for those operations, and thus no security hooks
      were defined for them. Later, SELinux introduced a hook for setlimit(2)
      with a check if the hard limit was being changed in order to be able to
      rely on the hard limit value as a safe reset point upon context
      transitions.
      
      Later on, when prlimit(2) was added to the kernel with the ability to get
      or set resource limits (hard or soft) of another process, LSM/SELinux was
      not updated other than to pass the target process to the setrlimit hook.
      This resulted in incomplete control over both getting and setting the
      resource limits of another process.
      
      Add a new security_task_prlimit() hook to the check_prlimit_permission()
      function to provide complete mediation.  The hook is only called when
      acting on another task, and only if the existing DAC/capability checks
      would allow access.  Pass flags down to the hook to indicate whether the
      prlimit(2) call will read, write, or both read and write the resource
      limits of the target process.
      
      The existing security_task_setrlimit() hook is left alone; it continues
      to serve a purpose in supporting the ability to make decisions based on
      the old and/or new resource limit values when setting limits.  This
      is consistent with the DAC/capability logic, where
      check_prlimit_permission() performs generic DAC/capability checks for
      acting on another task, while do_prlimit() performs a capability check
      based on a comparison of the old and new resource limits.  Fix the
      inline documentation for the hook to match the code.
      
      Implement the new hook for SELinux.  For setting resource limits, we
      reuse the existing setrlimit permission.  Note that this does overload
      the setrlimit permission to mean the ability to set the resource limit
      (soft or hard) of another process or the ability to change one's own
      hard limit.  For getting resource limits, a new getrlimit permission
      is defined.  This was not originally defined since getrlimit(2) could
      only be used to obtain a process' own limits.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      791ec491
  3. 02 3月, 2017 3 次提交
  4. 08 2月, 2017 3 次提交
    • S
      selinux: fix off-by-one in setprocattr · 0c461cb7
      Stephen Smalley 提交于
      SELinux tries to support setting/clearing of /proc/pid/attr attributes
      from the shell by ignoring terminating newlines and treating an
      attribute value that begins with a NUL or newline as an attempt to
      clear the attribute.  However, the test for clearing attributes has
      always been wrong; it has an off-by-one error, and this could further
      lead to reading past the end of the allocated buffer since commit
      bb646cdb ("proc_pid_attr_write():
      switch to memdup_user()").  Fix the off-by-one error.
      
      Even with this fix, setting and clearing /proc/pid/attr attributes
      from the shell is not straightforward since the interface does not
      support multiple write() calls (so shells that write the value and
      newline separately will set and then immediately clear the attribute,
      requiring use of echo -n to set the attribute), whereas trying to use
      echo -n "" to clear the attribute causes the shell to skip the
      write() call altogether since POSIX says that a zero-length write
      causes no side effects. Thus, one must use echo -n to set and echo
      without -n to clear, as in the following example:
      $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      unconfined_u:object_r:user_home_t:s0
      $ echo "" > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      
      Note the use of /proc/$$ rather than /proc/self, as otherwise
      the cat command will read its own attribute value, not that of the shell.
      
      There are no users of this facility to my knowledge; possibly we
      should just get rid of it.
      
      UPDATE: Upon further investigation it appears that a local process
      with the process:setfscreate permission can cause a kernel panic as a
      result of this bug.  This patch fixes CVE-2017-2618.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      [PM: added the update about CVE-2017-2618 to the commit description]
      Cc: stable@vger.kernel.org # 3.5: d6ea83ecSigned-off-by: NPaul Moore <paul@paul-moore.com>
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      0c461cb7
    • A
      selinux: allow changing labels for cgroupfs · 1ea0ce40
      Antonio Murdaca 提交于
      This patch allows changing labels for cgroup mounts. Previously, running
      chcon on cgroupfs would throw an "Operation not supported". This patch
      specifically whitelist cgroupfs.
      
      The patch could also allow containers to write only to the systemd cgroup
      for instance, while the other cgroups are kept with cgroup_t label.
      Signed-off-by: NAntonio Murdaca <runcom@redhat.com>
      Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      1ea0ce40
    • S
      selinux: fix off-by-one in setprocattr · a050a570
      Stephen Smalley 提交于
      SELinux tries to support setting/clearing of /proc/pid/attr attributes
      from the shell by ignoring terminating newlines and treating an
      attribute value that begins with a NUL or newline as an attempt to
      clear the attribute.  However, the test for clearing attributes has
      always been wrong; it has an off-by-one error, and this could further
      lead to reading past the end of the allocated buffer since commit
      bb646cdb ("proc_pid_attr_write():
      switch to memdup_user()").  Fix the off-by-one error.
      
      Even with this fix, setting and clearing /proc/pid/attr attributes
      from the shell is not straightforward since the interface does not
      support multiple write() calls (so shells that write the value and
      newline separately will set and then immediately clear the attribute,
      requiring use of echo -n to set the attribute), whereas trying to use
      echo -n "" to clear the attribute causes the shell to skip the
      write() call altogether since POSIX says that a zero-length write
      causes no side effects. Thus, one must use echo -n to set and echo
      without -n to clear, as in the following example:
      $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      unconfined_u:object_r:user_home_t:s0
      $ echo "" > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      
      Note the use of /proc/$$ rather than /proc/self, as otherwise
      the cat command will read its own attribute value, not that of the shell.
      
      There are no users of this facility to my knowledge; possibly we
      should just get rid of it.
      
      UPDATE: Upon further investigation it appears that a local process
      with the process:setfscreate permission can cause a kernel panic as a
      result of this bug.  This patch fixes CVE-2017-2618.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      [PM: added the update about CVE-2017-2618 to the commit description]
      Cc: stable@vger.kernel.org # 3.5: d6ea83ecSigned-off-by: NPaul Moore <paul@paul-moore.com>
      a050a570
  5. 25 1月, 2017 1 次提交
    • K
      Introduce a sysctl that modifies the value of PROT_SOCK. · 4548b683
      Krister Johansen 提交于
      Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
      that denotes the first unprivileged inet port in the namespace.  To
      disable all privileged ports set this to zero.  It also checks for
      overlap with the local port range.  The privileged and local range may
      not overlap.
      
      The use case for this change is to allow containerized processes to bind
      to priviliged ports, but prevent them from ever being allowed to modify
      their container's network configuration.  The latter is accomplished by
      ensuring that the network namespace is not a child of the user
      namespace.  This modification was needed to allow the container manager
      to disable a namespace's priviliged port restrictions without exposing
      control of the network namespace to processes in the user namespace.
      Signed-off-by: NKrister Johansen <kjlx@templeofstupid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4548b683
  6. 24 1月, 2017 1 次提交
  7. 19 1月, 2017 1 次提交
  8. 13 1月, 2017 2 次提交
  9. 09 1月, 2017 6 次提交
    • S
      proc,security: move restriction on writing /proc/pid/attr nodes to proc · b21507e2
      Stephen Smalley 提交于
      Processes can only alter their own security attributes via
      /proc/pid/attr nodes.  This is presently enforced by each individual
      security module and is also imposed by the Linux credentials
      implementation, which only allows a task to alter its own credentials.
      Move the check enforcing this restriction from the individual
      security modules to proc_pid_attr_write() before calling the security hook,
      and drop the unnecessary task argument to the security hook since it can
      only ever be the current task.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      Acked-by: NJohn Johansen <john.johansen@canonical.com>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      b21507e2
    • S
      selinux: clean up cred usage and simplify · be0554c9
      Stephen Smalley 提交于
      SELinux was sometimes using the task "objective" credentials when
      it could/should use the "subjective" credentials.  This was sometimes
      hidden by the fact that we were unnecessarily passing around pointers
      to the current task, making it appear as if the task could be something
      other than current, so eliminate all such passing of current.  Inline
      various permission checking helper functions that can be reduced to a
      single avc_has_perm() call.
      
      Since the credentials infrastructure only allows a task to alter
      its own credentials, we can always assume that current must be the same
      as the target task in selinux_setprocattr after the check. We likely
      should move this check from selinux_setprocattr() to proc_pid_attr_write()
      and drop the task argument to the security hook altogether; it can only
      serve to confuse things.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      be0554c9
    • S
      selinux: allow context mounts on tmpfs, ramfs, devpts within user namespaces · 01593d32
      Stephen Smalley 提交于
      commit aad82892 ("selinux: Add support for
      unprivileged mounts from user namespaces") prohibited any use of context
      mount options within non-init user namespaces.  However, this breaks
      use of context mount options for tmpfs mounts within user namespaces,
      which are being used by Docker/runc.  There is no reason to block such
      usage for tmpfs, ramfs or devpts.  Exempt these filesystem types
      from this restriction.
      
      Before:
      sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
      sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
      mount: tmpfs is write-protected, mounting read-only
      mount: cannot mount tmpfs read-only
      
      After:
      sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
      sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
      sh# ls -Zd /tmp
      unconfined_u:object_r:user_tmp_t:s0:c13 /tmp
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      01593d32
    • S
      selinux: handle ICMPv6 consistently with ICMP · ef37979a
      Stephen Smalley 提交于
      commit 79c8b348f215 ("selinux: support distinctions among all network
      address families") mapped datagram ICMP sockets to the new icmp_socket
      security class, but left ICMPv6 sockets unchanged.  This change fixes
      that oversight to handle both kinds of sockets consistently.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      ef37979a
    • Y
      selinux: add security in-core xattr support for tracefs · a2c7c6fb
      Yongqin Liu 提交于
      Since kernel 4.1 ftrace is supported as a new separate filesystem. It
      gets automatically mounted by the kernel under the old path
      /sys/kernel/debug/tracing. Because it lives now on a separate filesystem
      SELinux needs to be updated to also support setting SELinux labels
      on tracefs inodes.  This is required for compatibility in Android
      when moving to Linux 4.1 or newer.
      Signed-off-by: NYongqin Liu <yongqin.liu@linaro.org>
      Signed-off-by: NWilliam Roberts <william.c.roberts@intel.com>
      Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      a2c7c6fb
    • S
      selinux: support distinctions among all network address families · da69a530
      Stephen Smalley 提交于
      Extend SELinux to support distinctions among all network address families
      implemented by the kernel by defining new socket security classes
      and mapping to them. Otherwise, many sockets are mapped to the generic
      socket class and are indistinguishable in policy.  This has come up
      previously with regard to selectively allowing access to bluetooth sockets,
      and more recently with regard to selectively allowing access to AF_ALG
      sockets.  Guido Trentalancia submitted a patch that took a similar approach
      to add only support for distinguishing AF_ALG sockets, but this generalizes
      his approach to handle all address families implemented by the kernel.
      Socket security classes are also added for ICMP and SCTP sockets.
      Socket security classes were not defined for AF_* values that are reserved
      but unimplemented in the kernel, e.g. AF_NETBEUI, AF_SECURITY, AF_ASH,
      AF_ECONET, AF_SNA, AF_WANPIPE.
      
      Backward compatibility is provided by only enabling the finer-grained
      socket classes if a new policy capability is set in the policy; older
      policies will behave as before.  The legacy redhat1 policy capability
      that was only ever used in testing within Fedora for ptrace_child
      is reclaimed for this purpose; as far as I can tell, this policy
      capability is not enabled in any supported distro policy.
      
      Add a pair of conditional compilation guards to detect when new AF_* values
      are added so that we can update SELinux accordingly rather than having to
      belatedly update it long after new address families are introduced.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      da69a530
  10. 23 11月, 2016 1 次提交
    • A
      selinux: Convert isec->lock into a spinlock · 9287aed2
      Andreas Gruenbacher 提交于
      Convert isec->lock from a mutex into a spinlock.  Instead of holding
      the lock while sleeping in inode_doinit_with_dentry, set
      isec->initialized to LABEL_PENDING and release the lock.  Then, when
      the sid has been determined, re-acquire the lock.  If isec->initialized
      is still set to LABEL_PENDING, set isec->sid; otherwise, the sid has
      been set by another task (LABEL_INITIALIZED) or invalidated
      (LABEL_INVALID) in the meantime.
      
      This fixes a deadlock on gfs2 where
      
       * one task is in inode_doinit_with_dentry -> gfs2_getxattr, holds
         isec->lock, and tries to acquire the inode's glock, and
      
       * another task is in do_xmote -> inode_go_inval ->
         selinux_inode_invalidate_secctx, holds the inode's glock, and
         tries to acquire isec->lock.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      [PM: minor tweaks to keep checkpatch.pl happy]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      9287aed2
  11. 16 11月, 2016 1 次提交
  12. 15 11月, 2016 4 次提交
  13. 20 10月, 2016 1 次提交
  14. 08 10月, 2016 1 次提交
  15. 20 9月, 2016 1 次提交
    • V
      lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE · 43af5de7
      Vivek Goyal 提交于
      Right now LSM_AUDIT_DATA_PATH type contains "struct path" in union "u"
      of common_audit_data. This information is used to print path of file
      at the same time it is also used to get to dentry and inode. And this
      inode information is used to get to superblock and device and print
      device information.
      
      This does not work well for layered filesystems like overlay where dentry
      contained in path is overlay dentry and not the real dentry of underlying
      file system. That means inode retrieved from dentry is also overlay
      inode and not the real inode.
      
      SELinux helpers like file_path_has_perm() are doing checks on inode
      retrieved from file_inode(). This returns the real inode and not the
      overlay inode. That means we are doing check on real inode but for audit
      purposes we are printing details of overlay inode and that can be
      confusing while debugging.
      
      Hence, introduce a new type LSM_AUDIT_DATA_FILE which carries file
      information and inode retrieved is real inode using file_inode(). That
      way right avc denied information is given to user.
      
      For example, following is one example avc before the patch.
      
        type=AVC msg=audit(1473360868.399:214): avc:  denied  { read open } for
          pid=1765 comm="cat"
          path="/root/.../overlay/container1/merged/readfile"
          dev="overlay" ino=21443
          scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
          tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
          tclass=file permissive=0
      
      It looks as follows after the patch.
      
        type=AVC msg=audit(1473360017.388:282): avc:  denied  { read open } for
          pid=2530 comm="cat"
          path="/root/.../overlay/container1/merged/readfile"
          dev="dm-0" ino=2377915
          scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
          tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
          tclass=file permissive=0
      
      Notice that now dev information points to "dm-0" device instead of
      "overlay" device. This makes it clear that check failed on underlying
      inode and not on the overlay inode.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      [PM: slight tweaks to the description to make checkpatch.pl happy]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      43af5de7
  16. 10 8月, 2016 1 次提交
  17. 09 8月, 2016 4 次提交
  18. 21 7月, 2016 1 次提交
  19. 28 6月, 2016 2 次提交
  20. 25 6月, 2016 1 次提交
  21. 24 6月, 2016 1 次提交
    • A
      fs: Treat foreign mounts as nosuid · 380cf5ba
      Andy Lutomirski 提交于
      If a process gets access to a mount from a different user
      namespace, that process should not be able to take advantage of
      setuid files or selinux entrypoints from that filesystem.  Prevent
      this by treating mounts from other mount namespaces and those not
      owned by current_user_ns() or an ancestor as nosuid.
      
      This will make it safer to allow more complex filesystems to be
      mounted in non-root user namespaces.
      
      This does not remove the need for MNT_LOCK_NOSUID.  The setuid,
      setgid, and file capability bits can no longer be abused if code in
      a user namespace were to clear nosuid on an untrusted filesystem,
      but this patch, by itself, is insufficient to protect the system
      from abuse of files that, when execed, would increase MAC privilege.
      
      As a more concrete explanation, any task that can manipulate a
      vfsmount associated with a given user namespace already has
      capabilities in that namespace and all of its descendents.  If they
      can cause a malicious setuid, setgid, or file-caps executable to
      appear in that mount, then that executable will only allow them to
      elevate privileges in exactly the set of namespaces in which they
      are already privileges.
      
      On the other hand, if they can cause a malicious executable to
      appear with a dangerous MAC label, running it could change the
      caller's security context in a way that should not have been
      possible, even inside the namespace in which the task is confined.
      
      As a hardening measure, this would have made CVE-2014-5207 much
      more difficult to exploit.
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NSeth Forshee <seth.forshee@canonical.com>
      Acked-by: NJames Morris <james.l.morris@oracle.com>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      380cf5ba