提交 · 0302e28dee643932ee7b3c112ebccdbb9f8ec32c · openeuler / raspberrypi-kernel

11 3月, 2017 1 次提交

selinux: check for address length in selinux_socket_bind() · e2f586bd

由 Alexander Potapenko 提交于 3月 06, 2017

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in selinux_socket_bind():

==================================================================
BUG: KMSAN: use of unitialized memory
inter: 0
CPU: 3 PID: 1074 Comm: packet2 Tainted: G    B           4.8.0-rc6+ #1916
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 0000000000000000 ffff8800882ffb08 ffffffff825759c8 ffff8800882ffa48
 ffffffff818bf551 ffffffff85bab870 0000000000000092 ffffffff85bab550
 0000000000000000 0000000000000092 00000000bb0009bb 0000000000000002
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff825759c8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
 [<ffffffff818bdee6>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1008
 [<ffffffff818bf0fb>] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424
 [<ffffffff822dae71>] selinux_socket_bind+0xf41/0x1080 security/selinux/hooks.c:4288
 [<ffffffff8229357c>] security_socket_bind+0x1ec/0x240 security/security.c:1240
 [<ffffffff84265d98>] SYSC_bind+0x358/0x5f0 net/socket.c:1366
 [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
 [<ffffffff8518217c>] entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.o:?
chained origin: 00000000ba6009bb
 [<ffffffff810bb7a7>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67
 [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
 [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:337
 [<ffffffff818bd2b8>] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:530
 [<ffffffff818bf033>] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380
 [<ffffffff84265b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
 [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
 [<ffffffff8518217c>] return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000b8c00900)
==================================================================

(the line numbers are relative to 4.8-rc6, but the bug persists upstream)

, when I run the following program as root:

=======================================================
  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>

  int main(int argc, char *argv[]) {
    struct sockaddr addr;
    int size = 0;
    if (argc > 1) {
      size = atoi(argv[1]);
    }
    memset(&addr, 0, sizeof(addr));
    int fd = socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP);
    bind(fd, &addr, size);
    return 0;
  }
=======================================================

(for different values of |size| other error reports are printed).

This happens because bind() unconditionally copies |size| bytes of
|addr| to the kernel, leaving the rest uninitialized. Then
security_socket_bind() reads the IP address bytes, including the
uninitialized ones, to determine the port, or e.g. pass them further to
sel_netnode_find(), which uses them to calculate a hash.
Signed-off-by: NAlexander Potapenko <glider@google.com>
Acked-by: NEric Dumazet <edumazet@google.com>
[PM: fixed some whitespace damage]
Signed-off-by: NPaul Moore <paul@paul-moore.com>

e2f586bd

06 3月, 2017 3 次提交

security: mark LSM hooks as __ro_after_init · ca97d939

由 James Morris 提交于 2月 15, 2017

Mark all of the registration hooks as __ro_after_init (via the
__lsm_ro_after_init macro).
Signed-off-by: NJames Morris <james.l.morris@oracle.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Acked-by: NKees Cook <keescook@chromium.org>

ca97d939

selinux: fix kernel BUG on prlimit(..., NULL, NULL) · 84e6885e

由 Stephen Smalley 提交于 2月 28, 2017

commit 79bcf325e6b32b3c ("prlimit,security,selinux: add a security hook
for prlimit") introduced a security hook for prlimit() and implemented it
for SELinux. However, if prlimit() is called with NULL arguments for both
the new limit and the old limit, then the hook is called with 0 for the
read/write flags, since the prlimit() will neither read nor write the
process' limits. This would in turn lead to calling avc_has_perm() with 0
for the requested permissions, which triggers a BUG_ON() in
avc_has_perm_noaudit() since the kernel should never be invoking
avc_has_perm() with no permissions. Fix this in the SELinux hook by
returning immediately if the flags are 0. Arguably prlimit64() itself
ought to return immediately if both old_rlim and new_rlim are NULL since
it is effectively a no-op in that case.

Reported by the lkp-robot based on trinity testing.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Acked-by: NPaul Moore <paul@paul-moore.com>
Signed-off-by: NJames Morris <james.l.morris@oracle.com>

84e6885e

prlimit,security,selinux: add a security hook for prlimit · 791ec491

由 Stephen Smalley 提交于 2月 17, 2017

When SELinux was first added to the kernel, a process could only get
and set its own resource limits via getrlimit(2) and setrlimit(2), so no
MAC checks were required for those operations, and thus no security hooks
were defined for them. Later, SELinux introduced a hook for setlimit(2)
with a check if the hard limit was being changed in order to be able to
rely on the hard limit value as a safe reset point upon context
transitions.

Later on, when prlimit(2) was added to the kernel with the ability to get
or set resource limits (hard or soft) of another process, LSM/SELinux was
not updated other than to pass the target process to the setrlimit hook.
This resulted in incomplete control over both getting and setting the
resource limits of another process.

Add a new security_task_prlimit() hook to the check_prlimit_permission()
function to provide complete mediation. The hook is only called when
acting on another task, and only if the existing DAC/capability checks
would allow access. Pass flags down to the hook to indicate whether the
prlimit(2) call will read, write, or both read and write the resource
limits of the target process.

The existing security_task_setrlimit() hook is left alone; it continues
to serve a purpose in supporting the ability to make decisions based on
the old and/or new resource limit values when setting limits. This
is consistent with the DAC/capability logic, where
check_prlimit_permission() performs generic DAC/capability checks for
acting on another task, while do_prlimit() performs a capability check
based on a comparison of the old and new resource limits. Fix the
inline documentation for the hook to match the code.

Implement the new hook for SELinux. For setting resource limits, we
reuse the existing setrlimit permission. Note that this does overload
the setrlimit permission to mean the ability to set the resource limit
(soft or hard) of another process or the ability to change one's own
hard limit. For getting resource limits, a new getrlimit permission
is defined. This was not originally defined since getrlimit(2) could
only be used to obtain a process' own limits.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NJames Morris <james.l.morris@oracle.com>

791ec491

02 3月, 2017 3 次提交

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> · 29930025

由 Ingo Molnar 提交于 2月 08, 2017

We are going to split <linux/sched/task.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

29930025

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> · 3f07c014

由 Ingo Molnar 提交于 2月 08, 2017

We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

3f07c014

selinux: wrap cgroup seclabel support with its own policy capability · 2651225b

由 Stephen Smalley 提交于 2月 28, 2017

commit 1ea0ce40 ("selinux: allow
changing labels for cgroupfs") broke the Android init program,
which looks up security contexts whenever creating directories
and attempts to assign them via setfscreatecon().
When creating subdirectories in cgroup mounts, this would previously
be ignored since cgroup did not support userspace setting of security
contexts.  However, after the commit, SELinux would attempt to honor
the requested context on cgroup directories and fail due to permission
denial.  Avoid breaking existing userspace/policy by wrapping this change
with a conditional on a new cgroup_seclabel policy capability.  This
preserves existing behavior until/unless a new policy explicitly enables
this capability.
Reported-by: NJohn Stultz <john.stultz@linaro.org>
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>
Signed-off-by: NJames Morris <james.l.morris@oracle.com>

2651225b

08 2月, 2017 3 次提交

selinux: fix off-by-one in setprocattr · 0c461cb7

由 Stephen Smalley 提交于 1月 31, 2017

SELinux tries to support setting/clearing of /proc/pid/attr attributes
from the shell by ignoring terminating newlines and treating an
attribute value that begins with a NUL or newline as an attempt to
clear the attribute.  However, the test for clearing attributes has
always been wrong; it has an off-by-one error, and this could further
lead to reading past the end of the allocated buffer since commit
bb646cdb ("proc_pid_attr_write():
switch to memdup_user()").  Fix the off-by-one error.

Even with this fix, setting and clearing /proc/pid/attr attributes
from the shell is not straightforward since the interface does not
support multiple write() calls (so shells that write the value and
newline separately will set and then immediately clear the attribute,
requiring use of echo -n to set the attribute), whereas trying to use
echo -n "" to clear the attribute causes the shell to skip the
write() call altogether since POSIX says that a zero-length write
causes no side effects. Thus, one must use echo -n to set and echo
without -n to clear, as in the following example:
$ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate
unconfined_u:object_r:user_home_t:s0
$ echo "" > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate

Note the use of /proc/$$ rather than /proc/self, as otherwise
the cat command will read its own attribute value, not that of the shell.

There are no users of this facility to my knowledge; possibly we
should just get rid of it.

UPDATE: Upon further investigation it appears that a local process
with the process:setfscreate permission can cause a kernel panic as a
result of this bug.  This patch fixes CVE-2017-2618.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
[PM: added the update about CVE-2017-2618 to the commit description]
Cc: stable@vger.kernel.org # 3.5: d6ea83ecSigned-off-by: NPaul Moore <paul@paul-moore.com>
Signed-off-by: NJames Morris <james.l.morris@oracle.com>

0c461cb7

selinux: allow changing labels for cgroupfs · 1ea0ce40

由 Antonio Murdaca 提交于 2月 02, 2017

This patch allows changing labels for cgroup mounts. Previously, running
chcon on cgroupfs would throw an "Operation not supported". This patch
specifically whitelist cgroupfs.

The patch could also allow containers to write only to the systemd cgroup
for instance, while the other cgroups are kept with cgroup_t label.
Signed-off-by: NAntonio Murdaca <runcom@redhat.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

1ea0ce40

selinux: fix off-by-one in setprocattr · a050a570

由 Stephen Smalley 提交于 1月 31, 2017

SELinux tries to support setting/clearing of /proc/pid/attr attributes
from the shell by ignoring terminating newlines and treating an
attribute value that begins with a NUL or newline as an attempt to
clear the attribute.  However, the test for clearing attributes has
always been wrong; it has an off-by-one error, and this could further
lead to reading past the end of the allocated buffer since commit
bb646cdb ("proc_pid_attr_write():
switch to memdup_user()").  Fix the off-by-one error.

Even with this fix, setting and clearing /proc/pid/attr attributes
from the shell is not straightforward since the interface does not
support multiple write() calls (so shells that write the value and
newline separately will set and then immediately clear the attribute,
requiring use of echo -n to set the attribute), whereas trying to use
echo -n "" to clear the attribute causes the shell to skip the
write() call altogether since POSIX says that a zero-length write
causes no side effects. Thus, one must use echo -n to set and echo
without -n to clear, as in the following example:
$ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate
unconfined_u:object_r:user_home_t:s0
$ echo "" > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate

Note the use of /proc/$$ rather than /proc/self, as otherwise
the cat command will read its own attribute value, not that of the shell.

There are no users of this facility to my knowledge; possibly we
should just get rid of it.

UPDATE: Upon further investigation it appears that a local process
with the process:setfscreate permission can cause a kernel panic as a
result of this bug.  This patch fixes CVE-2017-2618.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
[PM: added the update about CVE-2017-2618 to the commit description]
Cc: stable@vger.kernel.org # 3.5: d6ea83ecSigned-off-by: NPaul Moore <paul@paul-moore.com>

a050a570

25 1月, 2017 1 次提交

Introduce a sysctl that modifies the value of PROT_SOCK. · 4548b683

由 Krister Johansen 提交于 1月 20, 2017

Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
that denotes the first unprivileged inet port in the namespace. To
disable all privileged ports set this to zero. It also checks for
overlap with the local port range. The privileged and local range may
not overlap.

The use case for this change is to allow containerized processes to bind
to priviliged ports, but prevent them from ever being allowed to modify
their container's network configuration. The latter is accomplished by
ensuring that the network namespace is not a child of the user
namespace. This modification was needed to allow the container manager
to disable a namespace's priviliged port restrictions without exposing
control of the network namespace to processes in the user namespace.
Signed-off-by: NKrister Johansen <kjlx@templeofstupid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4548b683

24 1月, 2017 1 次提交

exec: Remove LSM_UNSAFE_PTRACE_CAP · 9227dd2a

由 Eric W. Biederman 提交于 1月 23, 2017

With previous changes every location that tests for
LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the
LSM_UNSAFE_PTRACE_CAP redundant, so remove it.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

9227dd2a

19 1月, 2017 1 次提交

LSM: Add /sys/kernel/security/lsm · d69dece5

由 Casey Schaufler 提交于 1月 18, 2017

I am still tired of having to find indirect ways to determine
what security modules are active on a system. I have added
/sys/kernel/security/lsm, which contains a comma separated
list of the active security modules. No more groping around
in /proc/filesystems or other clever hacks.

Unchanged from previous versions except for being updated
to the latest security next branch.
Signed-off-by: NCasey Schaufler <casey@schaufler-ca.com>
Acked-by: NJohn Johansen <john.johansen@canonical.com>
Acked-by: NPaul Moore <paul@paul-moore.com>
Acked-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NJames Morris <james.l.morris@oracle.com>

d69dece5

13 1月, 2017 2 次提交

security,selinux,smack: kill security_task_wait hook · 3a2f5a59

由 Stephen Smalley 提交于 1月 10, 2017

As reported by yangshukui, a permission denial from security_task_wait()
can lead to a soft lockup in zap_pid_ns_processes() since it only expects
sys_wait4() to return 0 or -ECHILD. Further, security_task_wait() can
in general lead to zombies; in the absence of some way to automatically
reparent a child process upon a denial, the hook is not useful.  Remove
the security hook and its implementations in SELinux and Smack.  Smack
already removed its check from its hook.
Reported-by: Nyangshukui <yangshukui@huawei.com>
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

3a2f5a59

selinux: drop unused socket security classes · b4ba35c7

由 Stephen Smalley 提交于 1月 11, 2017

Several of the extended socket classes introduced by
commit da69a530 ("selinux: support distinctions
among all network address families") are never used because
sockets can never be created with the associated address family.
Remove these unused socket security classes.  The removed classes
are bridge_socket for PF_BRIDGE, ib_socket for PF_IB, and mpls_socket
for PF_MPLS.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

b4ba35c7

09 1月, 2017 6 次提交

proc,security: move restriction on writing /proc/pid/attr nodes to proc · b21507e2

由 Stephen Smalley 提交于 1月 09, 2017

Processes can only alter their own security attributes via
/proc/pid/attr nodes.  This is presently enforced by each individual
security module and is also imposed by the Linux credentials
implementation, which only allows a task to alter its own credentials.
Move the check enforcing this restriction from the individual
security modules to proc_pid_attr_write() before calling the security hook,
and drop the unnecessary task argument to the security hook since it can
only ever be the current task.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
Acked-by: NJohn Johansen <john.johansen@canonical.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

b21507e2

selinux: clean up cred usage and simplify · be0554c9

由 Stephen Smalley 提交于 1月 09, 2017

SELinux was sometimes using the task "objective" credentials when
it could/should use the "subjective" credentials.  This was sometimes
hidden by the fact that we were unnecessarily passing around pointers
to the current task, making it appear as if the task could be something
other than current, so eliminate all such passing of current.  Inline
various permission checking helper functions that can be reduced to a
single avc_has_perm() call.

Since the credentials infrastructure only allows a task to alter
its own credentials, we can always assume that current must be the same
as the target task in selinux_setprocattr after the check. We likely
should move this check from selinux_setprocattr() to proc_pid_attr_write()
and drop the task argument to the security hook altogether; it can only
serve to confuse things.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

be0554c9

selinux: allow context mounts on tmpfs, ramfs, devpts within user namespaces · 01593d32

由 Stephen Smalley 提交于 1月 09, 2017

commit aad82892 ("selinux: Add support for
unprivileged mounts from user namespaces") prohibited any use of context
mount options within non-init user namespaces.  However, this breaks
use of context mount options for tmpfs mounts within user namespaces,
which are being used by Docker/runc.  There is no reason to block such
usage for tmpfs, ramfs or devpts.  Exempt these filesystem types
from this restriction.

Before:
sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
mount: tmpfs is write-protected, mounting read-only
mount: cannot mount tmpfs read-only

After:
sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
sh# ls -Zd /tmp
unconfined_u:object_r:user_tmp_t:s0:c13 /tmp
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

01593d32

selinux: handle ICMPv6 consistently with ICMP · ef37979a

由 Stephen Smalley 提交于 1月 09, 2017

commit 79c8b348f215 ("selinux: support distinctions among all network
address families") mapped datagram ICMP sockets to the new icmp_socket
security class, but left ICMPv6 sockets unchanged. This change fixes
that oversight to handle both kinds of sockets consistently.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

ef37979a

selinux: add security in-core xattr support for tracefs · a2c7c6fb

由 Yongqin Liu 提交于 1月 09, 2017

Since kernel 4.1 ftrace is supported as a new separate filesystem. It
gets automatically mounted by the kernel under the old path
/sys/kernel/debug/tracing. Because it lives now on a separate filesystem
SELinux needs to be updated to also support setting SELinux labels
on tracefs inodes.  This is required for compatibility in Android
when moving to Linux 4.1 or newer.
Signed-off-by: NYongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: NWilliam Roberts <william.c.roberts@intel.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

a2c7c6fb

selinux: support distinctions among all network address families · da69a530

由 Stephen Smalley 提交于 1月 09, 2017

Extend SELinux to support distinctions among all network address families
implemented by the kernel by defining new socket security classes
and mapping to them. Otherwise, many sockets are mapped to the generic
socket class and are indistinguishable in policy. This has come up
previously with regard to selectively allowing access to bluetooth sockets,
and more recently with regard to selectively allowing access to AF_ALG
sockets. Guido Trentalancia submitted a patch that took a similar approach
to add only support for distinguishing AF_ALG sockets, but this generalizes
his approach to handle all address families implemented by the kernel.
Socket security classes are also added for ICMP and SCTP sockets.
Socket security classes were not defined for AF_* values that are reserved
but unimplemented in the kernel, e.g. AF_NETBEUI, AF_SECURITY, AF_ASH,
AF_ECONET, AF_SNA, AF_WANPIPE.

Backward compatibility is provided by only enabling the finer-grained
socket classes if a new policy capability is set in the policy; older
policies will behave as before. The legacy redhat1 policy capability
that was only ever used in testing within Fedora for ptrace_child
is reclaimed for this purpose; as far as I can tell, this policy
capability is not enabled in any supported distro policy.

Add a pair of conditional compilation guards to detect when new AF_* values
are added so that we can update SELinux accordingly rather than having to
belatedly update it long after new address families are introduced.
Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

da69a530

23 11月, 2016 1 次提交

selinux: Convert isec->lock into a spinlock · 9287aed2

由 Andreas Gruenbacher 提交于 11月 15, 2016

Convert isec->lock from a mutex into a spinlock.  Instead of holding
the lock while sleeping in inode_doinit_with_dentry, set
isec->initialized to LABEL_PENDING and release the lock.  Then, when
the sid has been determined, re-acquire the lock.  If isec->initialized
is still set to LABEL_PENDING, set isec->sid; otherwise, the sid has
been set by another task (LABEL_INITIALIZED) or invalidated
(LABEL_INVALID) in the meantime.

This fixes a deadlock on gfs2 where

 * one task is in inode_doinit_with_dentry -> gfs2_getxattr, holds
   isec->lock, and tries to acquire the inode's glock, and

 * another task is in do_xmote -> inode_go_inval ->
   selinux_inode_invalidate_secctx, holds the inode's glock, and
   tries to acquire isec->lock.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
[PM: minor tweaks to keep checkpatch.pl happy]
Signed-off-by: NPaul Moore <paul@paul-moore.com>

9287aed2

16 11月, 2016 1 次提交

posix-timers: Make them configurable · baa73d9e

由 Nicolas Pitre 提交于 11月 11, 2016

Some embedded systems have no use for them.  This removes about
25KB from the kernel binary size when configured out.

Corresponding syscalls are routed to a stub logging the attempt to
use those syscalls which should be enough of a clue if they were
disabled without proper consideration. They are: timer_create,
timer_gettime: timer_getoverrun, timer_settime, timer_delete,
clock_adjtime, setitimer, getitimer, alarm.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep
syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast
majority of use cases with very little code.
Signed-off-by: NNicolas Pitre <nico@linaro.org>
Acked-by: NRichard Cochran <richardcochran@gmail.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJohn Stultz <john.stultz@linaro.org>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: linux-kbuild@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Michal Marek <mmarek@suse.com>
Cc: Edward Cree <ecree@solarflare.com>
Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

baa73d9e

15 11月, 2016 4 次提交

selinux: Clean up initialization of isec->sclass · 13457d07

由 Andreas Gruenbacher 提交于 11月 10, 2016

Now that isec->initialized == LABEL_INITIALIZED implies that
isec->sclass is valid, skip such inodes immediately in
inode_doinit_with_dentry.

For the remaining inodes, initialize isec->sclass at the beginning of
inode_doinit_with_dentry to simplify the code.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

13457d07

proc: Pass file mode to proc_pid_make_inode · db978da8

由 Andreas Gruenbacher 提交于 11月 10, 2016

Pass the file mode of the proc inode to be created to
proc_pid_make_inode.  In proc_pid_make_inode, initialize inode->i_mode
before calling security_task_to_inode.  This allows selinux to set
isec->sclass right away without introducing "half-initialized" inode
security structs.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

db978da8

selinux: Minor cleanups · 42059112

由 Andreas Gruenbacher 提交于 11月 10, 2016

Fix the comment for function __inode_security_revalidate, which returns
an integer.

Use the LABEL_* constants consistently for isec->initialized.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

42059112

SELinux: Use GFP_KERNEL for selinux_parse_opts_str(). · 8931c3bd

由 Tetsuo Handa 提交于 11月 14, 2016

Since selinux_parse_opts_str() is calling match_strdup() which uses
GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is
called by selinux_parse_opts_str().
Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

8931c3bd

20 10月, 2016 1 次提交

mm: Change vm_is_stack_for_task() to vm_is_stack_for_current() · d17af505

由 Andy Lutomirski 提交于 9月 30, 2016

Asking for a non-current task's stack can't be done without races
unless the task is frozen in kernel mode.  As far as I know,
vm_is_stack_for_task() never had a safe non-current use case.

The __unused annotation is because some KSTK_ESP implementations
ignore their parameter, which IMO is further justification for this
patch.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

d17af505

08 10月, 2016 1 次提交

xattr: Add __vfs_{get,set,remove}xattr helpers · 5d6c3191

由 Andreas Gruenbacher 提交于 9月 29, 2016

Right now, various places in the kernel check for the existence of
getxattr, setxattr, and removexattr inode operations and directly call
those operations.  Switch to helper functions and test for the IOP_XATTR
flag instead.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
Acked-by: NJames Morris <james.l.morris@oracle.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5d6c3191

20 9月, 2016 1 次提交

lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE · 43af5de7

由 Vivek Goyal 提交于 9月 09, 2016

Right now LSM_AUDIT_DATA_PATH type contains "struct path" in union "u"
of common_audit_data. This information is used to print path of file
at the same time it is also used to get to dentry and inode. And this
inode information is used to get to superblock and device and print
device information.

This does not work well for layered filesystems like overlay where dentry
contained in path is overlay dentry and not the real dentry of underlying
file system. That means inode retrieved from dentry is also overlay
inode and not the real inode.

SELinux helpers like file_path_has_perm() are doing checks on inode
retrieved from file_inode(). This returns the real inode and not the
overlay inode. That means we are doing check on real inode but for audit
purposes we are printing details of overlay inode and that can be
confusing while debugging.

Hence, introduce a new type LSM_AUDIT_DATA_FILE which carries file
information and inode retrieved is real inode using file_inode(). That
way right avc denied information is given to user.

For example, following is one example avc before the patch.

  type=AVC msg=audit(1473360868.399:214): avc:  denied  { read open } for
    pid=1765 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="overlay" ino=21443
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

It looks as follows after the patch.

  type=AVC msg=audit(1473360017.388:282): avc:  denied  { read open } for
    pid=2530 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="dm-0" ino=2377915
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

Notice that now dev information points to "dm-0" device instead of
"overlay" device. This makes it clear that check failed on underlying
inode and not on the overlay inode.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
[PM: slight tweaks to the description to make checkpatch.pl happy]
Signed-off-by: NPaul Moore <paul@paul-moore.com>

43af5de7

10 8月, 2016 1 次提交

selinux: Implement dentry_create_files_as() hook · a518b0a5

由 Vivek Goyal 提交于 7月 13, 2016

Calculate what would be the label of newly created file and set that
secid in the passed creds.

Context of the task which is actually creating file is retrieved from
set of creds passed in. (old->security).
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

a518b0a5

09 8月, 2016 4 次提交

selinux: Pass security pointer to determine_inode_label() · c957f6df

由 Vivek Goyal 提交于 7月 13, 2016

Right now selinux_determine_inode_label() works on security pointer of
current task. Soon I need this to work on a security pointer retrieved
from a set of creds. So start passing in a pointer and caller can
decide where to fetch security pointer from.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

c957f6df

selinux: Implementation for inode_copy_up_xattr() hook · 19472b69

由 Vivek Goyal 提交于 7月 13, 2016

When a file is copied up in overlay, we have already created file on
upper/ with right label and there is no need to copy up selinux
label/xattr from lower file to upper file. In fact in case of context
mount, we don't want to copy up label as newly created file got its label
from context= option.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

19472b69

selinux: Implementation for inode_copy_up() hook · 56909eb3

由 Vivek Goyal 提交于 7月 13, 2016

A file is being copied up for overlay file system. Prepare a new set of
creds and set create_sid appropriately so that new file is created with
appropriate label.

Overlay inode has right label for both context and non-context mount
cases. In case of non-context mount, overlay inode will have the label
of lower file and in case of context mount, overlay inode will have
the label from context= mount option.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

56909eb3

security: Use IS_ENABLED() instead of checking for built-in or module · 1a93a6ea

由 Javier Martinez Canillas 提交于 8月 08, 2016

The IS_ENABLED() macro checks if a Kconfig symbol has been enabled
either built-in or as a module, use that macro instead of open coding
the same.
Signed-off-by: NJavier Martinez Canillas <javier@osg.samsung.com>
Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

1a93a6ea

21 7月, 2016 1 次提交
- A
  qstr: constify dentry_init_security · 4f3ccd76
  由 Al Viro 提交于 7月 20, 2016
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  4f3ccd76
28 6月, 2016 2 次提交

netlabel: Pass a family parameter to netlbl_skbuff_err(). · a04e71f6

由 Huw Davies 提交于 6月 27, 2016

This makes it possible to route the error to the appropriate
labelling engine.  CALIPSO is far less verbose than CIPSO
when encountering a bogus packet, so there is no need for a
CALIPSO error handler.
Signed-off-by: NHuw Davies <huw@codeweavers.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

a04e71f6

calipso: Allow the lsm to label the skbuff directly. · 2917f57b

由 Huw Davies 提交于 6月 27, 2016

In some cases, the lsm needs to add the label to the skbuff directly.
A NF_INET_LOCAL_OUT IPv6 hook is added to selinux to match the IPv4
behaviour.  This allows selinux to label the skbuffs that it requires.
Signed-off-by: NHuw Davies <huw@codeweavers.com>
Signed-off-by: NPaul Moore <paul@paul-moore.com>

2917f57b

25 6月, 2016 1 次提交

selinux: Add support for unprivileged mounts from user namespaces · aad82892

由 Seth Forshee 提交于 4月 26, 2016

Security labels from unprivileged mounts in user namespaces must
be ignored. Force superblocks from user namespaces whose labeling
behavior is to use xattrs to use mountpoint labeling instead.
For the mountpoint label, default to converting the current task
context into a form suitable for file objects, but also allow the
policy writer to specify a different label through policy
transition rules.

Pieced together from code snippets provided by Stephen Smalley.
Signed-off-by: NSeth Forshee <seth.forshee@canonical.com>
Acked-by: NStephen Smalley <sds@tycho.nsa.gov>
Acked-by: NJames Morris <james.l.morris@oracle.com>
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>

aad82892

24 6月, 2016 1 次提交

fs: Treat foreign mounts as nosuid · 380cf5ba

由 Andy Lutomirski 提交于 6月 23, 2016

If a process gets access to a mount from a different user
namespace, that process should not be able to take advantage of
setuid files or selinux entrypoints from that filesystem.  Prevent
this by treating mounts from other mount namespaces and those not
owned by current_user_ns() or an ancestor as nosuid.

This will make it safer to allow more complex filesystems to be
mounted in non-root user namespaces.

This does not remove the need for MNT_LOCK_NOSUID.  The setuid,
setgid, and file capability bits can no longer be abused if code in
a user namespace were to clear nosuid on an untrusted filesystem,
but this patch, by itself, is insufficient to protect the system
from abuse of files that, when execed, would increase MAC privilege.

As a more concrete explanation, any task that can manipulate a
vfsmount associated with a given user namespace already has
capabilities in that namespace and all of its descendents.  If they
can cause a malicious setuid, setgid, or file-caps executable to
appear in that mount, then that executable will only allow them to
elevate privileges in exactly the set of namespaces in which they
are already privileges.

On the other hand, if they can cause a malicious executable to
appear with a dangerous MAC label, running it could change the
caller's security context in a way that should not have been
possible, even inside the namespace in which the task is confined.

As a hardening measure, this would have made CVE-2014-5207 much
more difficult to exploit.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NSeth Forshee <seth.forshee@canonical.com>
Acked-by: NJames Morris <james.l.morris@oracle.com>
Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>

380cf5ba