1. 28 Feb, 2018 1 commit
  2. 20 Oct, 2017 2 commits
  3. 17 Oct, 2017 1 commit
  4. 05 Oct, 2017 2 commits
  5. 04 Oct, 2017 1 commit
    • selinux: Perform both commoncap and selinux xattr checks · 6b240306
      Eric W. Biederman committed
      When selinux is loaded, the relaxed permission checks for writing
      security.capability are not honored, which keeps file capabilities
      from being used in user namespaces.
      
      Stephen Smalley <sds@tycho.nsa.gov> writes:
      > Originally SELinux called the cap functions directly since there was no
      > stacking support in the infrastructure and one had to manually stack a
      > secondary module internally.  inode_setxattr and inode_removexattr
      > however were special cases because the cap functions would check
      > CAP_SYS_ADMIN for any non-capability attributes in the security.*
      > namespace, and we don't want to impose that requirement on setting
      > security.selinux.  Thus, we inlined the capabilities logic into the
      > selinux hook functions and adapted it appropriately.
      
      Now that the permission checks in commoncap have evolved this
      inlining of their contents has become a problem.  So restructure
      selinux_inode_removexattr, and selinux_inode_setxattr to call
      both the corresponding cap_inode_ function and dentry_has_perm
      when the attribute is not a selinux security xattr.   This ensures
      the policies of both commoncap and selinux are enforced.
      
      This results in smack and selinux having the same basic structure
      for setxattr and removexattr: each performs its own special permission
      checks when its own module's xattr is being written, defers to
      commoncap when that is not the case, and finally applies its
      generic module policy to all xattr writes.
      
      This structure is fine when you only consider stacking with the
      commoncap lsm, but it becomes a problem if two lsms that don't want
      the commoncap security checks on their own attributes need to be
      stacked.  This means there will need to be updates in the future as lsm
      stacking is improved, but at least now the structure between smack and
      selinux is common making the code easier to refactor.
      
      This change also has the effect that selinux_inode_setotherxattr becomes
      unnecessary so it is removed.
      
      Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities")
      Fixes: 7bbf0e052b76 ("[PATCH] selinux merge")
      Historical Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Reviewed-by: Serge Hallyn <serge@hallyn.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      6b240306
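The restructured control flow described in this commit can be sketched in user space as follows. This is only an illustration under stated assumptions: the `sketch_*` functions and the three toggle variables are hypothetical stand-ins for cap_inode_setxattr(), dentry_has_perm() and the SELinux relabel checks, not the kernel code itself.

```c
#include <string.h>

#define SKETCH_EPERM  1
#define SKETCH_EACCES 13

/* Toggleable stand-ins (hypothetical) so the control flow can be exercised. */
static int commoncap_allows = 1;      /* models cap_inode_setxattr() */
static int generic_policy_allows = 1; /* models dentry_has_perm() */
static int relabel_allows = 1;        /* models the SELinux-specific relabel checks */

static int sketch_cap_inode_setxattr(const char *name)
{
    (void)name;
    return commoncap_allows ? 0 : -SKETCH_EPERM;
}

static int sketch_dentry_has_perm(const char *name)
{
    (void)name;
    return generic_policy_allows ? 0 : -SKETCH_EACCES;
}

static int sketch_relabel_check(void)
{
    return relabel_allows ? 0 : -SKETCH_EACCES;
}

/* Not the module's own xattr: enforce commoncap first, then module policy.
 * The module's own xattr: only the module-specific checks apply. */
static int sketch_setxattr(const char *name)
{
    int rc;

    if (strcmp(name, "security.selinux") != 0) {
        rc = sketch_cap_inode_setxattr(name);
        if (rc)
            return rc;
        return sketch_dentry_has_perm(name);
    }
    return sketch_relabel_check();
}
```

With `commoncap_allows` set to 0, a write to security.capability is refused while a write to security.selinux still goes through the module's own checks, which is the split the message describes.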
  6. 29 Aug, 2017 1 commit
  7. 23 Aug, 2017 1 commit
  8. 18 Aug, 2017 1 commit
  9. 03 Aug, 2017 1 commit
    • selinux: Generalize support for NNP/nosuid SELinux domain transitions · af63f419
      Stephen Smalley committed
      As systemd ramps up enabling NNP (NoNewPrivileges) for system services,
      it is increasingly breaking SELinux domain transitions for those services
      and their descendants.  systemd enables NNP not only for services whose
      unit files explicitly specify NoNewPrivileges=yes but also for services
      whose unit files specify any of the following options in combination with
      running without CAP_SYS_ADMIN (e.g. specifying User= or a
      CapabilityBoundingSet= without CAP_SYS_ADMIN): SystemCallFilter=,
      SystemCallArchitectures=, RestrictAddressFamilies=, RestrictNamespaces=,
      PrivateDevices=, ProtectKernelTunables=, ProtectKernelModules=,
      MemoryDenyWriteExecute=, or RestrictRealtime= as per the systemd.exec(5)
      man page.
      
      The end result is bad for the security of both SELinux-disabled and
      SELinux-enabled systems.  Packagers have to turn off these
      options in the unit files to preserve SELinux domain transitions.  For
      users who choose to disable SELinux, this means that they miss out on
      at least having the systemd-supported protections.  For users who keep
      SELinux enabled, they may still be missing out on some protections
      because it isn't necessarily guaranteed that the SELinux policy for
      that service provides the same protections in all cases.
      
      commit 7b0d0b40 ("selinux: Permit bounded transitions under
      NO_NEW_PRIVS or NOSUID.") allowed bounded transitions under NNP in
      order to support limited usage for sandboxing programs.  However,
      defining typebounds for all of the affected service domains
      is impractical to implement in policy, since typebounds requires us
      to ensure that each domain is allowed everything all of its descendant
      domains are allowed, and this has to be repeated for the entire chain
      of domain transitions.  There is no way to clone all allow rules from
      descendants to their ancestors in policy currently, and doing so would
      be undesirable even if it were practical, as it requires leaking
      permissions to objects and operations into ancestor domains that could
      weaken their own security in order to allow them to the descendants
      (e.g. if a descendant requires execmem permission, then so do all of
      its ancestors; if a descendant requires execute permission to a file,
      then so do all of its ancestors; if a descendant requires read to a
      symbolic link or temporary file, then so do all of its ancestors...).
      SELinux domains are intentionally not hierarchical / bounded in this
      manner normally, and making them so would undermine their protections
      and least privilege.
      
      We have long had a similar tension with SELinux transitions and nosuid
      mounts, albeit not as severe.  Users often have had to choose between
      retaining nosuid on a mount and allowing SELinux domain transitions on
      files within those mounts.  This likewise leads to unfortunate tradeoffs
      in security.
      
      Decouple NNP/nosuid from SELinux transitions, so that we don't have to
      make a choice between them. Introduce a nnp_nosuid_transition policy
      capability that enables transitions under NNP/nosuid to be based on
      a permission (nnp_transition for NNP; nosuid_transition for nosuid)
      between the old and new contexts in addition to the current support
      for bounded transitions.  Domain transitions can then be allowed in
      policy without requiring the parent to be a strict superset of all of
      its children.
      
      With this change, systemd unit files can be left unmodified from upstream.
      SELinux-disabled and SELinux-enabled users will benefit from retaining any
      of the systemd-provided protections.  SELinux policy will only need to
      be adapted to enable the new policy capability and to allow the
      new permissions between domain pairs as appropriate.
      
      NB: Allowing nnp_transition between two contexts opens up the potential
      for the old context to subvert the new context by installing seccomp
      filters before the execve.  Allowing nosuid_transition between two contexts
      opens up the potential for a context transition to occur on a file from
      an untrusted filesystem (e.g. removable media or remote filesystem).  Use
      with care.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      af63f419
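The decision described in this commit (permission-based transitions when the policy capability is enabled, with bounded transitions still accepted) can be sketched as below. The struct and its field names are hypothetical and only model the inputs named in the message; the real hook works on security contexts and policy lookups.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs to the NNP/nosuid transition decision. */
struct transition_request {
    bool nnp;                        /* task has NoNewPrivileges set */
    bool nosuid;                     /* file is on a nosuid mount */
    bool policycap_enabled;          /* nnp_nosuid_transition policy capability */
    bool nnp_transition_allowed;     /* policy grants nnp_transition old -> new */
    bool nosuid_transition_allowed;  /* policy grants nosuid_transition old -> new */
    bool new_bounded_by_old;         /* new context is bounded by the old one */
};

static bool transition_permitted(const struct transition_request *r)
{
    /* Neither restriction applies: nothing extra to check. */
    if (!r->nnp && !r->nosuid)
        return true;

    /* With the policy capability, every applicable permission must be granted. */
    if (r->policycap_enabled) {
        bool ok = true;

        if (r->nnp && !r->nnp_transition_allowed)
            ok = false;
        if (r->nosuid && !r->nosuid_transition_allowed)
            ok = false;
        if (ok)
            return true;
    }

    /* Otherwise fall back to the existing bounded-transition support. */
    return r->new_bounded_by_old;
}
```

The point of the sketch is that the parent no longer has to be a superset of its children: an explicit permission between the two contexts is enough.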
  10. 02 Aug, 2017 2 commits
  11. 01 Aug, 2017 1 commit
  12. 26 Jul, 2017 1 commit
  13. 21 Jun, 2017 1 commit
  14. 13 Jun, 2017 1 commit
  15. 10 Jun, 2017 1 commit
    • security/selinux: allow security_sb_clone_mnt_opts to enable/disable native labeling behavior · 0b4d3452
      Scott Mayhew committed
      When an NFSv4 client performs a mount operation, it first mounts the
      NFSv4 root and then does path walk to the exported path and performs a
      submount on that, cloning the security mount options from the root's
      superblock to the submount's superblock in the process.
      
      Unless the NFS server has an explicit fsid=0 export with the
      "security_label" option, the NFSv4 root superblock will not have
      SBLABEL_MNT set, and neither will the submount superblock after cloning
      the security mount options.  As a result, setxattr's of security labels
      over NFSv4.2 will fail.  In a similar fashion, NFSv4.2 mounts mounted
      with the context= mount option will not show the correct labels because
      the nfs_server->caps flags of the cloned superblock will still have
      NFS_CAP_SECURITY_LABEL set.
      
      Allowing the NFSv4 client to enable or disable SECURITY_LSM_NATIVE_LABELS
      behavior will ensure that the SBLABEL_MNT flag has the correct value
      when the client traverses from an exported path without the
      "security_label" option to one with the "security_label" option and
      vice versa.  Similarly, checking to see if SECURITY_LSM_NATIVE_LABELS is
      set upon return from security_sb_clone_mnt_opts() and clearing
      NFS_CAP_SECURITY_LABEL if necessary will allow the correct labels to
      be displayed for NFSv4.2 mounts mounted with the context= mount option.
      
      Resolves: https://github.com/SELinuxProject/selinux-kernel/issues/35
      Signed-off-by: Scott Mayhew <smayhew@redhat.com>
      Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
      Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      0b4d3452
  16. 02 Jun, 2017 1 commit
  17. 24 May, 2017 5 commits
  18. 23 May, 2017 5 commits
    • selinux: Remove redundant check for unknown labeling behavior · 270e8573
      Matthias Kaehlcke committed
      The check is already performed in ocontext_read() when the policy is
      loaded. Removing the array also fixes the following warning when
      building with clang:
      
      security/selinux/hooks.c:338:20: error: variable 'labeling_behaviors'
          is not needed and will not be emitted
          [-Werror,-Wunneeded-internal-declaration]
      Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      270e8573
    • selinux: do not check open permission on sockets · ccb54478
      Stephen Smalley committed
      open permission is currently only defined for files in the kernel
      (COMMON_FILE_PERMS rather than COMMON_FILE_SOCK_PERMS). Construction of
      an artificial test case that tries to open a socket via /proc/pid/fd will
      generate a recvfrom avc denial because recvfrom and open happen to map to
      the same permission bit in socket vs file classes.
      
      open of a socket via /proc/pid/fd is not supported by the kernel regardless
      and will ultimately return ENXIO. But we hit the permission check first and
      can thus produce these odd/misleading denials.  Omit the open check when
      operating on a socket.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      ccb54478
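A minimal sketch of the idea, assuming a simplified permission mask: the open bit is added only for non-socket objects, so a socket reached via /proc/pid/fd never triggers a check on a bit that means something else in the socket classes. The enum values and the `open_perms` helper are hypothetical and do not reflect the kernel's actual access vector layout.

```c
#include <stdbool.h>

/* Hypothetical permission bits, for illustration only. */
enum {
    SKETCH_PERM_READ  = 0x1,
    SKETCH_PERM_WRITE = 0x2,
    SKETCH_PERM_OPEN  = 0x4
};

/* Build the set of permissions to check when an object is opened. */
static unsigned int open_perms(bool want_read, bool want_write, bool is_socket)
{
    unsigned int av = 0;

    if (want_read)
        av |= SKETCH_PERM_READ;
    if (want_write)
        av |= SKETCH_PERM_WRITE;
    /* open is only defined for files, so omit it for sockets. */
    if (!is_socket)
        av |= SKETCH_PERM_OPEN;
    return av;
}
```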
    • selinux: add a map permission check for mmap · 3ba4bf5f
      Stephen Smalley committed
      Add a map permission check on mmap so that we can distinguish memory mapped
      access (since it has different implications for revocation). When a file
      is opened and then read or written via syscalls like read(2)/write(2),
      we revalidate access on each read/write operation via
      selinux_file_permission() and therefore can revoke access if the
      process context, the file context, or the policy changes in such a
      manner that access is no longer allowed. When a file is opened and then
      memory mapped via mmap(2) and then subsequently read or written directly
      in memory, we presently have no way to revalidate or revoke access.
      The purpose of a separate map permission check on mmap(2) is to permit
      policy to prohibit memory mapping of specific files for which we need
      to ensure that every access is revalidated, particularly useful for
      scenarios where we expect the file to be relabeled at runtime in order
      to reflect state changes (e.g. cross-domain solution, assured pipeline
      without data copying).
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      3ba4bf5f
    • selinux: only invoke capabilities and selinux for CAP_MAC_ADMIN checks · db59000a
      Stephen Smalley committed
      SELinux uses CAP_MAC_ADMIN to control the ability to get or set a raw,
      uninterpreted security context unknown to the currently loaded security
      policy. When performing these checks, we only want to perform a base
      capabilities check and a SELinux permission check.  If any other
      modules that implement a capable hook are stacked with SELinux, we do
      not want to require them to also have to authorize CAP_MAC_ADMIN,
      since it may have different implications for their security model.
      Rework the CAP_MAC_ADMIN checks within SELinux to only invoke the
      capabilities module and the SELinux permission checking.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      db59000a
    • selinux: Use task_alloc hook rather than task_create hook · a79be238
      Tetsuo Handa committed
      This patch prepares for removing the task_create hook, because the
      task_alloc hook, which can do everything task_create can, has been revived.
      
      Creating a new thread is unlikely to be prohibited by security policy,
      since fork()/execve()/exit() are fundamental to how processes are managed
      in Unix. If a program is known to create a new thread, it is likely that
      permission to create a new thread has been given to that program. Therefore,
      a situation where security_task_create() returns an error most likely means
      that the program was exploited and lost control. Even if SELinux does not
      check permission to create a thread at security_task_create(), it can
      check it later at security_task_alloc(). Since the new thread is not
      yet visible to the rest of the system, nobody can do bad things using
      it. What we waste is limited to some initialization
      steps such as dup_task_struct(), copy_creds() and audit_alloc() in
      copy_process(). We can tolerate this overhead for such an unlikely situation.
      
      Therefore, this patch changes SELinux to use task_alloc hook rather than
      task_create hook so that we can remove task_create hook.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      a79be238
  19. 11 Mar, 2017 1 commit
    • selinux: check for address length in selinux_socket_bind() · e2f586bd
      Alexander Potapenko committed
      KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
      uninitialized memory in selinux_socket_bind():
      
      ==================================================================
      BUG: KMSAN: use of unitialized memory
      inter: 0
      CPU: 3 PID: 1074 Comm: packet2 Tainted: G    B           4.8.0-rc6+ #1916
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       0000000000000000 ffff8800882ffb08 ffffffff825759c8 ffff8800882ffa48
       ffffffff818bf551 ffffffff85bab870 0000000000000092 ffffffff85bab550
       0000000000000000 0000000000000092 00000000bb0009bb 0000000000000002
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff825759c8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
       [<ffffffff818bdee6>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1008
       [<ffffffff818bf0fb>] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424
       [<ffffffff822dae71>] selinux_socket_bind+0xf41/0x1080 security/selinux/hooks.c:4288
       [<ffffffff8229357c>] security_socket_bind+0x1ec/0x240 security/security.c:1240
       [<ffffffff84265d98>] SYSC_bind+0x358/0x5f0 net/socket.c:1366
       [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
       [<ffffffff8518217c>] entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.o:?
      chained origin: 00000000ba6009bb
       [<ffffffff810bb7a7>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67
       [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
       [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:337
       [<ffffffff818bd2b8>] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:530
       [<ffffffff818bf033>] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380
       [<ffffffff84265b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
       [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
       [<ffffffff8518217c>] return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.o:?
      origin description: ----address@SYSC_bind (origin=00000000b8c00900)
      ==================================================================
      
      (The line numbers are relative to 4.8-rc6, but the bug persists upstream.)
      
      The report appears when I run the following program as root:
      
      =======================================================
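        #include <stdlib.h>      /* atoi() */
        #include <string.h>
        #include <sys/socket.h>
        #include <netinet/in.h>
      
        int main(int argc, char *argv[]) {
          struct sockaddr addr;
          int size = 0;   /* address length passed to bind(); 0 by default */
          if (argc > 1) {
            size = atoi(argv[1]);
          }
          memset(&addr, 0, sizeof(addr));
          int fd = socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP);
          /* Deliberately pass a length that may be shorter than the address. */
          bind(fd, &addr, size);
          return 0;
        }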
      =======================================================
      
      (for different values of |size| other error reports are printed).
      
      This happens because bind() unconditionally copies |size| bytes of
      |addr| to the kernel, leaving the rest uninitialized. Then
      security_socket_bind() reads the IP address bytes, including the
      uninitialized ones, to determine the port, or e.g. passes them on to
      sel_netnode_find(), which uses them to calculate a hash.
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      [PM: fixed some whitespace damage]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      e2f586bd
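The kind of length check the commit title refers to can be sketched in user space as below. This is an assumption-laden illustration: `check_bind_addrlen` is a hypothetical helper, and the minimum lengths used here (the full sizes of struct sockaddr_in and struct sockaddr_in6) are chosen for the sketch and may differ from the bounds the kernel patch actually applies.

```c
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Reject address lengths too short to contain the fields that would be read.
 * sk_family is the family the socket was created with. */
static int check_bind_addrlen(int sk_family, socklen_t addrlen)
{
    if (sk_family == PF_INET && addrlen < sizeof(struct sockaddr_in))
        return -EINVAL;
    if (sk_family == PF_INET6 && addrlen < sizeof(struct sockaddr_in6))
        return -EINVAL;
    /* Other families are not inspected in this sketch. */
    return 0;
}
```

With such a check in front of the address parsing, the reproducer's `bind(fd, &addr, 0)` on a PF_INET6 socket would be refused before any uninitialized bytes are interpreted as a port or address.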
  20. 06 Mar, 2017 3 commits
    • security: mark LSM hooks as __ro_after_init · ca97d939
      James Morris committed
      Mark all of the registration hooks as __ro_after_init (via the
      __lsm_ro_after_init macro).
      Signed-off-by: James Morris <james.l.morris@oracle.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Acked-by: Kees Cook <keescook@chromium.org>
      ca97d939
    • selinux: fix kernel BUG on prlimit(..., NULL, NULL) · 84e6885e
      Stephen Smalley committed
      commit 79bcf325e6b32b3c ("prlimit,security,selinux: add a security hook
      for prlimit") introduced a security hook for prlimit() and implemented it
      for SELinux.  However, if prlimit() is called with NULL arguments for both
      the new limit and the old limit, then the hook is called with 0 for the
      read/write flags, since the prlimit() will neither read nor write the
      process' limits.  This would in turn lead to calling avc_has_perm() with 0
      for the requested permissions, which triggers a BUG_ON() in
      avc_has_perm_noaudit() since the kernel should never be invoking
      avc_has_perm() with no permissions.  Fix this in the SELinux hook by
      returning immediately if the flags are 0.  Arguably prlimit64() itself
      ought to return immediately if both old_rlim and new_rlim are NULL since
      it is effectively a no-op in that case.
      
      Reported by the lkp-robot based on trinity testing.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
      84e6885e
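The early return described in this commit can be sketched as follows. The names are hypothetical, the flags are used directly as permission bits for brevity, and the -EINVAL returned by `sketch_has_perm` for an empty request merely stands in for the BUG_ON() mentioned in the message.

```c
#include <errno.h>

/* Stand-in for the access check: an empty request is a caller bug
 * (the kernel uses BUG_ON(); this sketch returns -EINVAL instead). */
static int sketch_has_perm(unsigned int requested, unsigned int allowed)
{
    if (requested == 0)
        return -EINVAL;
    return (requested & ~allowed) ? -EACCES : 0;
}

/* Hook sketch: return immediately when nothing is read or written,
 * so the access check is never reached with an empty request. */
static int sketch_task_prlimit(unsigned int flags, unsigned int allowed)
{
    if (flags == 0)
        return 0;
    return sketch_has_perm(flags, allowed);
}
```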
    • prlimit,security,selinux: add a security hook for prlimit · 791ec491
      Stephen Smalley committed
      When SELinux was first added to the kernel, a process could only get
      and set its own resource limits via getrlimit(2) and setrlimit(2), so no
      MAC checks were required for those operations, and thus no security hooks
      were defined for them. Later, SELinux introduced a hook for setrlimit(2)
      with a check if the hard limit was being changed in order to be able to
      rely on the hard limit value as a safe reset point upon context
      transitions.
      
      Later on, when prlimit(2) was added to the kernel with the ability to get
      or set resource limits (hard or soft) of another process, LSM/SELinux was
      not updated other than to pass the target process to the setrlimit hook.
      This resulted in incomplete control over both getting and setting the
      resource limits of another process.
      
      Add a new security_task_prlimit() hook to the check_prlimit_permission()
      function to provide complete mediation.  The hook is only called when
      acting on another task, and only if the existing DAC/capability checks
      would allow access.  Pass flags down to the hook to indicate whether the
      prlimit(2) call will read, write, or both read and write the resource
      limits of the target process.
      
      The existing security_task_setrlimit() hook is left alone; it continues
      to serve a purpose in supporting the ability to make decisions based on
      the old and/or new resource limit values when setting limits.  This
      is consistent with the DAC/capability logic, where
      check_prlimit_permission() performs generic DAC/capability checks for
      acting on another task, while do_prlimit() performs a capability check
      based on a comparison of the old and new resource limits.  Fix the
      inline documentation for the hook to match the code.
      
      Implement the new hook for SELinux.  For setting resource limits, we
      reuse the existing setrlimit permission.  Note that this does overload
      the setrlimit permission to mean the ability to set the resource limit
      (soft or hard) of another process or the ability to change one's own
      hard limit.  For getting resource limits, a new getrlimit permission
      is defined.  This was not originally defined since getrlimit(2) could
      only be used to obtain a process' own limits.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
      791ec491
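How the read/write flags could be derived from the prlimit(2) arguments and mapped to the two permissions can be sketched as below. All identifiers and bit values are hypothetical; only the relationship (old limit requested means read, new limit supplied means write, write maps to setrlimit, read maps to getrlimit) comes from the message above.

```c
#include <stddef.h>

/* Hypothetical flag and permission values, for illustration only. */
enum { SKETCH_PRLIMIT_READ = 0x1, SKETCH_PRLIMIT_WRITE = 0x2 };
enum { SKETCH_PERM_GETRLIMIT = 0x10, SKETCH_PERM_SETRLIMIT = 0x20 };

/* A non-NULL old_rlim means the limits are read; a non-NULL new_rlim
 * means they are written. */
static unsigned int prlimit_flags(const void *new_rlim, const void *old_rlim)
{
    unsigned int flags = 0;

    if (old_rlim)
        flags |= SKETCH_PRLIMIT_READ;
    if (new_rlim)
        flags |= SKETCH_PRLIMIT_WRITE;
    return flags;
}

/* Map the flags onto the permissions to be checked against the target. */
static unsigned int prlimit_perms(unsigned int flags)
{
    unsigned int av = 0;

    if (flags & SKETCH_PRLIMIT_WRITE)
        av |= SKETCH_PERM_SETRLIMIT;
    if (flags & SKETCH_PRLIMIT_READ)
        av |= SKETCH_PERM_GETRLIMIT;
    return av;
}
```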
  21. 02 Mar, 2017 3 commits
  22. 08 Feb, 2017 3 commits
    • selinux: fix off-by-one in setprocattr · 0c461cb7
      Stephen Smalley committed
      SELinux tries to support setting/clearing of /proc/pid/attr attributes
      from the shell by ignoring terminating newlines and treating an
      attribute value that begins with a NUL or newline as an attempt to
      clear the attribute.  However, the test for clearing attributes has
      always been wrong; it has an off-by-one error, and this could further
      lead to reading past the end of the allocated buffer since commit
      bb646cdb ("proc_pid_attr_write():
      switch to memdup_user()").  Fix the off-by-one error.
      
      Even with this fix, setting and clearing /proc/pid/attr attributes
      from the shell is not straightforward since the interface does not
      support multiple write() calls (so shells that write the value and
      newline separately will set and then immediately clear the attribute,
      requiring use of echo -n to set the attribute), whereas trying to use
      echo -n "" to clear the attribute causes the shell to skip the
      write() call altogether since POSIX says that a zero-length write
      causes no side effects. Thus, one must use echo -n to set and echo
      without -n to clear, as in the following example:
      $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      unconfined_u:object_r:user_home_t:s0
      $ echo "" > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      
      Note the use of /proc/$$ rather than /proc/self, as otherwise
      the cat command will read its own attribute value, not that of the shell.
      
      There are no users of this facility to my knowledge; possibly we
      should just get rid of it.
      
      UPDATE: Upon further investigation it appears that a local process
      with the process:setfscreate permission can cause a kernel panic as a
      result of this bug.  This patch fixes CVE-2017-2618.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      [PM: added the update about CVE-2017-2618 to the commit description]
      Cc: stable@vger.kernel.org # 3.5: d6ea83ec
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
      0c461cb7
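The intended semantics (a value that begins with NUL or newline clears the attribute, and a terminating newline is ignored) can be sketched in user space as below. Both helpers are hypothetical and inspect only bytes inside the buffer; a test that looks one byte too far would read past the end of a one-byte write, which appears to be the failure mode the message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when the written value should clear the attribute: an empty write,
 * or a value whose FIRST byte is NUL or newline. */
static bool is_clear_request(const char *buf, size_t size)
{
    return size == 0 || buf[0] == '\0' || buf[0] == '\n';
}

/* Length of the value with one terminating newline ignored. */
static size_t value_len(const char *buf, size_t size)
{
    if (size > 0 && buf[size - 1] == '\n')
        return size - 1;
    return size;
}
```

For example, the `echo ""` in the shell session above writes a single newline, which `is_clear_request` treats as a clear, while `echo -n ctx` writes the context with no newline to strip.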
    • selinux: allow changing labels for cgroupfs · 1ea0ce40
      Antonio Murdaca committed
      This patch allows changing labels for cgroup mounts. Previously, running
      chcon on cgroupfs would fail with "Operation not supported". This patch
      specifically whitelists cgroupfs.
      
      The patch could also allow containers to write only to the systemd cgroup
      for instance, while the other cgroups are kept with cgroup_t label.
      Signed-off-by: Antonio Murdaca <runcom@redhat.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      1ea0ce40
    • selinux: fix off-by-one in setprocattr · a050a570
      Stephen Smalley committed
      SELinux tries to support setting/clearing of /proc/pid/attr attributes
      from the shell by ignoring terminating newlines and treating an
      attribute value that begins with a NUL or newline as an attempt to
      clear the attribute.  However, the test for clearing attributes has
      always been wrong; it has an off-by-one error, and this could further
      lead to reading past the end of the allocated buffer since commit
      bb646cdb ("proc_pid_attr_write():
      switch to memdup_user()").  Fix the off-by-one error.
      
      Even with this fix, setting and clearing /proc/pid/attr attributes
      from the shell is not straightforward since the interface does not
      support multiple write() calls (so shells that write the value and
      newline separately will set and then immediately clear the attribute,
      requiring use of echo -n to set the attribute), whereas trying to use
      echo -n "" to clear the attribute causes the shell to skip the
      write() call altogether since POSIX says that a zero-length write
      causes no side effects. Thus, one must use echo -n to set and echo
      without -n to clear, as in the following example:
      $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      unconfined_u:object_r:user_home_t:s0
      $ echo "" > /proc/$$/attr/fscreate
      $ cat /proc/$$/attr/fscreate
      
      Note the use of /proc/$$ rather than /proc/self, as otherwise
      the cat command will read its own attribute value, not that of the shell.
      
      There are no users of this facility to my knowledge; possibly we
      should just get rid of it.
      
      UPDATE: Upon further investigation it appears that a local process
      with the process:setfscreate permission can cause a kernel panic as a
      result of this bug.  This patch fixes CVE-2017-2618.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      [PM: added the update about CVE-2017-2618 to the commit description]
      Cc: stable@vger.kernel.org # 3.5: d6ea83ec
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      a050a570
  23. 25 Jan, 2017 1 commit
    • Introduce a sysctl that modifies the value of PROT_SOCK. · 4548b683
      Krister Johansen committed
      Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
      that denotes the first unprivileged inet port in the namespace.  To
      disable all privileged ports set this to zero.  It also checks for
      overlap with the local port range.  The privileged and local range may
      not overlap.
      
      The use case for this change is to allow containerized processes to bind
      to privileged ports, but prevent them from ever being allowed to modify
      their container's network configuration.  The latter is accomplished by
      ensuring that the network namespace is not a child of the user
      namespace.  This modification was needed to allow the container manager
      to disable a namespace's privileged port restrictions without exposing
      control of the network namespace to processes in the user namespace.
      Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4548b683
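The two rules stated in this commit (ports below the sysctl value are privileged, and the privileged range may not overlap the local port range) can be sketched as below. The helper names are hypothetical, and the figures in the usage note (1024 and 32768) are common defaults assumed for illustration rather than values taken from this message.

```c
#include <stdbool.h>

/* A port is privileged when it is below the first unprivileged port.
 * A start of 0 therefore disables privileged ports entirely. */
static bool port_is_privileged(unsigned int port, unsigned int unpriv_start)
{
    return port < unpriv_start;
}

/* The privileged range [0, unpriv_start) must not overlap the local
 * (ephemeral) port range, which begins at local_range_low. */
static bool unpriv_start_is_valid(unsigned int unpriv_start,
                                  unsigned int local_range_low)
{
    return unpriv_start <= local_range_low;
}
```

For instance, with a start of 1024 port 80 is privileged, whereas with a start of 0 it is not; a start of 40000 would be rejected if the local range begins at 32768.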