1. 02 Sep, 2017 1 commit
    • Introduce v3 namespaced file capabilities · 8db6c34f
      Serge E. Hallyn committed
      Root in a non-initial user ns cannot be trusted to write a traditional
      security.capability xattr.  If it were allowed to do so, then any
      unprivileged user on the host could map his own uid to root in a private
      namespace, write the xattr, and execute the file with privilege on the
      host.
      
      However supporting file capabilities in a user namespace is very
      desirable.  Not doing so means that any programs designed to run with
      limited privilege must continue to support other methods of gaining and
      dropping privilege.  For instance a program installer must detect
      whether file capabilities can be assigned, and assign them if so but set
      setuid-root otherwise.  The program in turn must know how to drop
      partial capabilities, and do so only if setuid-root.
      
      This patch introduces v3 of the security.capability xattr.  It builds a
      vfs_ns_cap_data struct by appending a uid_t rootid to struct
      vfs_cap_data.  This is the absolute uid_t (that is, the uid_t in user
      namespace which mounted the filesystem, usually init_user_ns) of the
      root id in whose namespaces the file capabilities may take effect.
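
      The on-disk layout described above can be modeled in userspace as a
      minimal sketch. It assumes little-endian 32-bit fields and the revision
      constants from the kernel's uapi capability header as best recalled;
      the helper names (pack_v2, v2_to_v3, rootid_of) are hypothetical and do
      not come from the patch.

```python
import struct

# Constants believed to match include/uapi/linux/capability.h (assumption).
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_REVISION_2 = 0x02000000
VFS_CAP_REVISION_3 = 0x03000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001


def pack_v2(permitted, inheritable, effective=True):
    """Build a v2 blob: magic_etc plus two (permitted, inheritable) word pairs."""
    magic = VFS_CAP_REVISION_2 | (VFS_CAP_FLAGS_EFFECTIVE if effective else 0)
    return struct.pack(
        "<5I",
        magic,
        permitted & 0xFFFFFFFF, inheritable & 0xFFFFFFFF,  # low 32 bits
        permitted >> 32, inheritable >> 32,                # high 32 bits
    )


def v2_to_v3(v2_blob, rootid):
    """Append a rootid to a v2 blob and bump the revision to v3."""
    magic, *words = struct.unpack("<5I", v2_blob)
    magic = (magic & ~VFS_CAP_REVISION_MASK) | VFS_CAP_REVISION_3
    return struct.pack("<6I", magic, *words, rootid)


def rootid_of(blob):
    """Return the rootid of a v3 blob, or None for any other revision."""
    (magic,) = struct.unpack_from("<I", blob)
    if (magic & VFS_CAP_REVISION_MASK) != VFS_CAP_REVISION_3:
        return None
    return struct.unpack_from("<I", blob, 20)[0]


CAP_SYS_ADMIN = 21
v2 = pack_v2(1 << CAP_SYS_ADMIN, 0)   # 20 bytes
v3 = v2_to_v3(v2, 100001)             # 24 bytes: v2 layout plus rootid
```

      The only structural difference between the two revisions in this model
      is the trailing 4-byte rootid, which is what lets the kernel rewrite a
      v2 write into v3 without touching the capability sets.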
      
      When a task asks to write a v2 security.capability xattr, if it is
      privileged with respect to the userns which mounted the filesystem, then
      nothing should change.  Otherwise, the kernel will transparently rewrite
      the xattr as a v3 with the appropriate rootid.  This is done during the
      execution of setxattr() to catch user-space-initiated capability writes.
      Subsequently, any task executing the file which has the noted kuid as
      its root uid, or which is in a descendent user_ns of such a user_ns,
      will run the file with capabilities.
      
      Similarly when asking to read file capabilities, a v3 capability will
      be presented as v2 if it applies to the caller's namespace.
      
      If a task writes a v3 security.capability, then it can provide a uid for
      the xattr so long as the uid is valid in its own user namespace, and it
      is privileged with CAP_SETFCAP over its namespace.  The kernel will
      translate that rootid to an absolute uid, and write that to disk.  After
      this, a task in the writer's namespace will not be able to use those
      capabilities (unless rootid was 0), but a task in a namespace where the
      given uid is root will.
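
      The rootid translation can be illustrated with a small model of uid
      maps. This is only a sketch: uid maps are represented as lists of
      (ns_start, host_start, count) extents, the function names are
      hypothetical, and the rule for descendant user namespaces is left out.

```python
def to_host_uid(uid_map, ns_uid):
    """Translate a namespace-relative uid to an absolute (host) uid."""
    for ns_start, host_start, count in uid_map:
        if ns_start <= ns_uid < ns_start + count:
            return host_start + (ns_uid - ns_start)
    return None  # uid is not mapped in this namespace


def rootid_to_store(writer_map, ns_rootid):
    """Absolute rootid the kernel would write for a v3 xattr."""
    host_uid = to_host_uid(writer_map, ns_rootid)
    if host_uid is None:
        raise ValueError("rootid is not valid in the writer's namespace")
    return host_uid


def caps_apply(stored_rootid, task_map):
    """True if the task's namespace root is the stored rootid."""
    return to_host_uid(task_map, 0) == stored_rootid


ns_a = [(0, 100000, 1000)]   # like: lxc-usernsexec -m b:0:100000:1000
ns_b = [(0, 100001, 1000)]   # like: lxc-usernsexec -m b:0:100001:1000

stored = rootid_to_store(ns_a, 1)   # the writer in ns_a names uid 1 as rootid
```

      In this model the writer in ns_a cannot use the capabilities itself
      (its root is 100000), while a task in ns_b, whose root is 100001, can.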
      
      Only a single security.capability xattr may exist at a time for a given
      file.  A task may overwrite an existing xattr so long as it is
      privileged over the inode.  Note this is a departure from previous
      semantics, which required privilege to remove a security.capability
      xattr.  This check can be re-added if deemed useful.
      
      This allows a simple setxattr to work, allows tar/untar to work, and
      allows us to tar in one namespace and untar in another while preserving
      the capability, without risking leaking privilege into a parent
      namespace.
      
      Example using tar:
      
       $ cp /bin/sleep sleepx
       $ mkdir b1 b2
       $ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1
       $ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2
       $ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx
       $ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar
       $ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx
         b2/sleepx = cap_sys_admin+ep
       # /opt/ltp/testcases/bin/getv3xattr b2/sleepx
         v3 xattr, rootid is 100001
      
      A patch to linux-test-project adding a new set of tests for this
      functionality is in the nsfscaps branch at github.com/hallyn/ltp
      
      Changelog:
         Nov 02 2016: fix invalid check at refuse_fcap_overwrite()
         Nov 07 2016: convert rootid from and to fs user_ns
         Mar 28 2017 (from ebiederm):
            . commoncap.c: fix typos - s/v4/v3
            . get_vfs_caps_from_disk: clarify the fs_ns root access check
            . nsfscaps: change the code split for cap_inode_setxattr()
         Apr 09 2017:
            . don't return v3 cap for caps owned by current root.
            . return a v2 cap for a true v2 cap in non-init ns
         Apr 18 2017:
            . Change the flow of fscap writing to support s_user_ns writing.
            . Remove refuse_fcap_overwrite().  The value of the previous
              xattr doesn't matter.
         Apr 24 2017:
            . incorporate Eric's incremental diff
            . move cap_convert_nscap to setxattr and simplify its usage
         May 08 2017:
            . fix leaking dentry refcount in cap_inode_getsecurity
      Signed-off-by: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  2. 10 Jun, 2017 1 commit
    • security/selinux: allow security_sb_clone_mnt_opts to enable/disable native labeling behavior · 0b4d3452
      Scott Mayhew committed
      When an NFSv4 client performs a mount operation, it first mounts the
      NFSv4 root and then does path walk to the exported path and performs a
      submount on that, cloning the security mount options from the root's
      superblock to the submount's superblock in the process.
      
      Unless the NFS server has an explicit fsid=0 export with the
      "security_label" option, the NFSv4 root superblock will not have
      SBLABEL_MNT set, and neither will the submount superblock after cloning
      the security mount options.  As a result, setxattr's of security labels
      over NFSv4.2 will fail.  In a similar fashion, NFSv4.2 mounts mounted
      with the context= mount option will not show the correct labels because
      the nfs_server->caps flags of the cloned superblock will still have
      NFS_CAP_SECURITY_LABEL set.
      
      Allowing the NFSv4 client to enable or disable SECURITY_LSM_NATIVE_LABELS
      behavior will ensure that the SBLABEL_MNT flag has the correct value
      when the client traverses from an exported path without the
      "security_label" option to one with the "security_label" option and
      vice versa.  Similarly, checking to see if SECURITY_LSM_NATIVE_LABELS is
      set upon return from security_sb_clone_mnt_opts() and clearing
      NFS_CAP_SECURITY_LABEL if necessary will allow the correct labels to
      be displayed for NFSv4.2 mounts mounted with the context= mount option.
      
      Resolves: https://github.com/SELinuxProject/selinux-kernel/issues/35
      Signed-off-by: Scott Mayhew <smayhew@redhat.com>
      Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
      Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
  3. 09 Jun, 2017 1 commit
  4. 24 May, 2017 3 commits
    • IB/core: Enforce security on management datagrams · 47a2b338
      Daniel Jurgens committed
      Allocate and free a security context when creating and destroying a MAD
      agent.  This context is used for controlling access to PKeys and sending
      and receiving SMPs.
      
      When sending or receiving a MAD, check that the agent has permission to
      access the PKey for the Subnet Prefix of the port.
      
      During MAD and snoop agent registration for SMI QPs, check that the
      calling process has permission to manage the subnet, and
      register a callback with the LSM to be notified of policy changes. When
      notification of a policy change occurs, recheck permission and set a flag
      indicating whether sending and receiving SMPs is allowed.
      
      When sending and receiving MADs, check that the agent has access to the
      SMI if it's on an SMI QP.  Because security policy can change, it's
      possible that permission was granted when creating the agent but no
      longer is.
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Acked-by: Doug Ledford <dledford@redhat.com>
      [PM: remove the LSM hook init code]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
    • selinux lsm IB/core: Implement LSM notification system · 8f408ab6
      Daniel Jurgens committed
      Add a generic notification mechanism in the LSM. Interested consumers
      can register a callback with the LSM and security modules can produce
      events.
      
      Because access to Infiniband QPs is enforced in the setup phase of a
      connection, security should be enforced again if the policy changes.
      Register infiniband devices for policy change notification and check all
      QPs on that device when the notification is received.
      
      Add a call to the notification mechanism from SELinux when the AVC
      cache changes or setenforce is cleared.
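
      The register/notify pattern this commit describes can be sketched in
      userspace as follows. It is an illustration of the pattern only: the
      class, the event name and the callback signature are hypothetical and
      are not the kernel's notifier API.

```python
POLICY_CHANGE = "policy_change"  # hypothetical event identifier


class NotifierChain:
    """Consumers register callbacks; producers broadcast events to them."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)

    def unregister(self, callback):
        self._callbacks.remove(callback)

    def call(self, event, data=None):
        # Iterate over a copy so a callback may unregister itself safely.
        for callback in list(self._callbacks):
            callback(event, data)


class Device:
    """Stand-in for an Infiniband device that rechecks its QPs on change."""

    def __init__(self, name, qps):
        self.name = name
        self.qps = qps
        self.rechecked = []

    def on_event(self, event, data):
        if event == POLICY_CHANGE:
            # Enforce access again for every QP on this device.
            self.rechecked.extend(self.qps)


chain = NotifierChain()
dev = Device("ib0", ["qp1", "qp2"])
chain.register(dev.on_event)
chain.call(POLICY_CHANGE)        # e.g. the security module reloads its policy
```

      The producer never needs to know which consumers exist, which is why a
      security module can emit the event without depending on IB/core.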
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Acked-by: James Morris <james.l.morris@oracle.com>
      Acked-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
    • IB/core: Enforce PKey security on QPs · d291f1a6
      Daniel Jurgens committed
      Add new LSM hooks to allocate and free security contexts and check for
      permission to access a PKey.
      
      Allocate and free a security context when creating and destroying a QP.
      This context is used for controlling access to PKeys.
      
      When a request is made to modify a QP that changes the port, PKey index,
      or alternate path, check that the QP has permission for the PKey in the
      PKey table index on the subnet prefix of the port. If the QP is shared
      make sure all handles to the QP also have access.
      
      Store which port and PKey index a QP is using. After the reset to init
      transition the user can modify the port, PKey index and alternate path
      independently. So port and PKey settings changes can be a merge of the
      previous settings and the new ones.
      
      In order to maintain access control if there is a PKey table or subnet
      prefix change, keep a list of all QPs that are using each PKey index on
      each port. If a change occurs, all QPs using that device and port must
      have access enforced for the new cache settings.
      
      These changes add a transaction to the QP modify process. Association
      with the old port and PKey index must be maintained if the modify fails,
      and must be removed if it succeeds. Association with the new port and
      PKey index must be established prior to the modify and removed if the
      modify fails.
      
      1. When a QP is modified to a particular Port, PKey index or alternate
         path insert that QP into the appropriate lists.
      
      2. Check permission to access the new settings.
      
      3. If step 2 grants access attempt to modify the QP.
      
      4a. If steps 2 and 3 succeed remove any prior associations.
      
      4b. If either fails, remove the new setting associations.
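
      The steps above can be sketched as a small transaction. This is a
      simplified model, not the driver code: a QP is a dict, a setting is a
      (port, pkey_index) tuple, and check_access and hw_modify are
      hypothetical callables standing in for the LSM check and the hardware
      modify.

```python
def modify_qp(qp, new_setting, lists, check_access, hw_modify):
    """Apply new_setting to qp while keeping the per-setting QP lists consistent."""
    old_setting = qp["setting"]

    # Step 1: associate the QP with the new setting before modifying it.
    lists.setdefault(new_setting, set()).add(qp["id"])

    # Steps 2 and 3: check permission, and only then attempt the modify.
    ok = check_access(qp, new_setting) and hw_modify(qp, new_setting)

    if ok:
        # Step 4a: success, so drop the association with the old setting.
        if old_setting is not None and old_setting != new_setting:
            lists.get(old_setting, set()).discard(qp["id"])
        qp["setting"] = new_setting
    elif new_setting != old_setting:
        # Step 4b: failure, so drop the association made in step 1.
        lists[new_setting].discard(qp["id"])
    return ok


lists = {}
qp = {"id": 7, "setting": None}
allow = lambda q, s: True
deny = lambda q, s: False

first = modify_qp(qp, (1, 0), lists, allow, allow)    # succeeds
second = modify_qp(qp, (1, 3), lists, deny, allow)    # denied by the check
```

      Inserting into the new list before the check means a concurrent cache
      change can already see the QP; the old association is only released
      once the modify is known to have succeeded.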
      
      If a PKey table or subnet prefix changes walk the list of QPs and
      check that they have permission. If not send the QP to the error state
      and raise a fatal error event. If it's a shared QP make sure all the
      QPs that share the real_qp have permission as well. If the QP that
      owns a security structure is denied access, the security structure is
      marked as such and the QP is added to an error_list. Once moving
      the QP to error is complete, the security structure mark is cleared.
      
      Maintaining the lists correctly turns QP destroy into a transaction.
      The hardware driver for the device frees the ib_qp structure, so while
      the destroy is in progress the ib_qp pointer in the ib_qp_security
      struct is undefined. When the destroy process begins the ib_qp_security
      structure is marked as destroying. This prevents any action from being
      taken on the QP pointer. After the QP is destroyed successfully, it
      could still be listed on an error_list; wait for it to be processed by
      that flow before cleaning up the structure.
      
      If the destroy fails, the QP's port and PKey settings are reinserted into
      the appropriate lists, the destroying flag is cleared, and access control
      is enforced, in case there were any cache changes during the destroy
      flow.
      
      To keep the security changes isolated a new file is used to hold security
      related functionality.
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Acked-by: Doug Ledford <dledford@redhat.com>
      [PM: merge fixup in ib_verbs.h and uverbs_cmd.c]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
  5. 28 Mar, 2017 1 commit
    • LSM: Revive security_task_alloc() hook and per "struct task_struct" security blob. · e4e55b47
      Tetsuo Handa committed
      We switched from "struct task_struct"->security to "struct cred"->security
      in Linux 2.6.29. But not all LSM modules were happy with that change.
      The TOMOYO LSM module is one example that wants to use a per "struct
      task_struct" security blob, because TOMOYO's security context is defined
      based on "struct task_struct" rather than "struct cred". The AppArmor LSM
      module is another example that wants to use it, because AppArmor is
      currently abusing the cred a little bit to store the change_hat and
      setexeccon info. Although
      security_task_free() hook was revived in Linux 3.4 because Yama LSM module
      wanted to release per "struct task_struct" security blob,
      security_task_alloc() hook and "struct task_struct"->security field were
      not revived. Nowadays, we are getting proposals of lightweight LSM modules
      which want to use per "struct task_struct" security blob.
      
      We are already allowing multiple concurrent LSM modules (up to one fully
      armored module which uses "struct cred"->security field or exclusive hooks
      like security_xfrm_state_pol_flow_match(), plus unlimited number of
      lightweight modules which do not use "struct cred"->security nor exclusive
      hooks) as long as they are built into the kernel. But this patch does not
      implement variable length "struct task_struct"->security field which will
      become needed when multiple LSM modules want to use "struct task_struct"->
      security field. Although it won't be difficult to implement variable length
      "struct task_struct"->security field, let's think about it after we have
      merged this patch.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: John Johansen <john.johansen@canonical.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Tested-by: Djalal Harouni <tixxdz@gmail.com>
      Acked-by: José Bollo <jobol@nonadev.net>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: José Bollo <jobol@nonadev.net>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
  6. 06 Mar, 2017 1 commit
    • prlimit,security,selinux: add a security hook for prlimit · 791ec491
      Stephen Smalley committed
      When SELinux was first added to the kernel, a process could only get
      and set its own resource limits via getrlimit(2) and setrlimit(2), so no
      MAC checks were required for those operations, and thus no security hooks
      were defined for them. Later, SELinux introduced a hook for setrlimit(2)
      with a check if the hard limit was being changed in order to be able to
      rely on the hard limit value as a safe reset point upon context
      transitions.
      
      Later on, when prlimit(2) was added to the kernel with the ability to get
      or set resource limits (hard or soft) of another process, LSM/SELinux was
      not updated other than to pass the target process to the setrlimit hook.
      This resulted in incomplete control over both getting and setting the
      resource limits of another process.
      
      Add a new security_task_prlimit() hook to the check_prlimit_permission()
      function to provide complete mediation.  The hook is only called when
      acting on another task, and only if the existing DAC/capability checks
      would allow access.  Pass flags down to the hook to indicate whether the
      prlimit(2) call will read, write, or both read and write the resource
      limits of the target process.
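
      The mediation order described above can be sketched as a userspace
      model. The flag names and values are assumptions chosen to mirror the
      description, and dac_allows and lsm_hook are hypothetical stand-ins for
      the existing DAC/capability checks and the new security hook.

```python
# Assumed flag values; the commit only says flags indicate read, write, or both.
LSM_PRLIMIT_READ = 1
LSM_PRLIMIT_WRITE = 2


def prlimit_flags(wants_old, wants_new):
    """Flags for a prlimit(2) call that reads and/or writes the limits."""
    flags = 0
    if wants_old:
        flags |= LSM_PRLIMIT_READ
    if wants_new:
        flags |= LSM_PRLIMIT_WRITE
    return flags


def check_prlimit_permission(current_pid, target_pid, flags, dac_allows, lsm_hook):
    """Model of the check order: self, then DAC/capability, then the LSM hook."""
    if current_pid == target_pid:
        return True           # the hook is only called for another task
    if not dac_allows:
        return False          # the hook is not consulted if DAC already denies
    return lsm_hook(flags)


seen = []


def recording_hook(flags):
    seen.append(flags)
    return True


own = check_prlimit_permission(1, 1, prlimit_flags(True, True), True, recording_hook)
other = check_prlimit_permission(1, 2, prlimit_flags(True, False), True, recording_hook)
denied = check_prlimit_permission(1, 2, prlimit_flags(False, True), False, recording_hook)
```

      Only the second call reaches the hook in this model, and it is passed
      the read flag alone because no new limit is supplied.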
      
      The existing security_task_setrlimit() hook is left alone; it continues
      to serve a purpose in supporting the ability to make decisions based on
      the old and/or new resource limit values when setting limits.  This
      is consistent with the DAC/capability logic, where
      check_prlimit_permission() performs generic DAC/capability checks for
      acting on another task, while do_prlimit() performs a capability check
      based on a comparison of the old and new resource limits.  Fix the
      inline documentation for the hook to match the code.
      
      Implement the new hook for SELinux.  For setting resource limits, we
      reuse the existing setrlimit permission.  Note that this does overload
      the setrlimit permission to mean the ability to set the resource limit
      (soft or hard) of another process or the ability to change one's own
      hard limit.  For getting resource limits, a new getrlimit permission
      is defined.  This was not originally defined since getrlimit(2) could
      only be used to obtain a process' own limits.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
  7. 24 Jan, 2017 1 commit
  8. 13 Jan, 2017 1 commit
  9. 09 Jan, 2017 1 commit
  10. 09 Aug, 2016 4 commits
  11. 21 Jul, 2016 1 commit
  12. 23 Apr, 2016 1 commit
    • security: Introduce security_settime64() · 457db29b
      Baolin Wang committed
      security_settime() uses a timespec, which is not year 2038 safe
      on 32bit systems. Thus this patch introduces the security_settime64()
      function with timespec64 type. We also convert the cap_settime() helper
      function to use the 64bit types.
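
      The year 2038 limit mentioned above follows from a signed 32-bit count
      of seconds since the Unix epoch. A short worked check, assuming a
      32-bit time_t as on the affected systems (the helper name is
      hypothetical):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def wrap_int32(seconds):
    """Value a signed 32-bit seconds counter would hold after overflow."""
    return (seconds + 2**31) % 2**32 - 2**31


# Last instant a signed 32-bit seconds counter can represent.
last_32bit = EPOCH + timedelta(seconds=INT32_MAX)

# One second later the counter wraps to the most negative value.
wrapped = wrap_int32(INT32_MAX + 1)
after_wrap = EPOCH + timedelta(seconds=wrapped)

# A signed 64-bit counter holds the same value without wrapping.
fits_in_64bit = INT32_MAX + 1 < 2**63
```

      Here last_32bit is 2038-01-19 03:14:07 UTC, and after_wrap falls back
      to December 1901, which is why the hook is moved to a 64-bit type.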
      
      This patch then moves security_settime() to the header file as an
      inline helper function so that existing users can be iteratively
      converted.
      
      None of the existing hooks uses the timespec argument, and therefore
      the patch does not make any functional changes.
      
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Paul Moore <pmoore@redhat.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
      [jstultz: Reworded commit message]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
  13. 28 Mar, 2016 9 commits
  14. 21 Feb, 2016 4 commits
  15. 19 Feb, 2016 2 commits
  16. 25 Dec, 2015 3 commits
  17. 21 Sep, 2015 1 commit
  18. 12 May, 2015 3 commits
  19. 11 May, 2015 1 commit
    • security: make inode_follow_link RCU-walk aware · bda0be7a
      NeilBrown committed
      inode_follow_link now takes an inode and rcu flag as well as the
      dentry.
      
      inode is used in preference to d_backing_inode(dentry), particularly
      in RCU-walk mode.
      
      selinux_inode_follow_link() gets dentry_has_perm() and
      inode_has_perm() open-coded into it so that it can call
      avc_has_perm_flags() in a way that is safe if LOOKUP_RCU is set.
      
      Calling avc_has_perm_flags() with rcu_read_lock() held means
      that when avc_has_perm_noaudit calls avc_compute_av(), the attempt
      to rcu_read_unlock() before calling security_compute_av() will not
      actually drop the RCU read-lock.
      
      However as security_compute_av() is completely in a read_lock()ed
      region, it should be safe with the RCU read-lock held.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>