Commits · b1d749c5c34112fab5902c43b2a37a0ba1e5f0f1 · openeuler / Kernel

25 May, 2018 · 1 commit

capabilities: Allow privileged user in s_user_ns to set security.* xattrs · b1d749c5

Authored by Eric W. Biederman on Apr 21, 2017

A privileged user in s_user_ns will generally have the ability to
manipulate the backing store and insert security.* xattrs into
the filesystem directly. Therefore the kernel must be prepared to
handle these xattrs from unprivileged mounts, and it makes little
sense for commoncap to prevent writing these xattrs to the
filesystem. The capability and LSM code have already been updated
to appropriately handle xattrs from unprivileged mounts, so it
is safe to loosen this restriction on setting xattrs.

The exception to this logic is that writing xattrs to a mounted
filesystem may also cause the LSM inode_post_setxattr or
inode_setsecurity callbacks to be invoked. SELinux will deny the
xattr update by virtue of applying mountpoint labeling to
unprivileged userns mounts, and Smack will deny the writes for
any user without global CAP_MAC_ADMIN, so loosening the
capability check in commoncap is safe in this respect as well.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


11 Apr, 2018 · 1 commit

commoncap: Handle memory allocation failure. · 1f578172

Authored by Tetsuo Handa on Apr 10, 2018

syzbot is reporting a NULL pointer dereference in xattr_getsecurity() [1],
because cap_inode_getsecurity() returns sizeof(struct vfs_cap_data) when
memory allocation fails. Return -ENOMEM if memory allocation fails.
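
The failure mode described above can be sketched in userspace C. This is only an illustration, not the kernel code: `struct cap_blob`, `get_cap_blob()` and the injectable allocator are hypothetical names introduced here so that the error path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the capability data returned to the caller. */
struct cap_blob {
    unsigned int magic_etc;
    unsigned int data[4];
};

/*
 * Copy a capability blob into a freshly allocated buffer.
 * Returns the number of bytes on success, or -ENOMEM when the
 * allocation fails, so the caller never sees a positive size
 * paired with a buffer pointer that was never set.
 * The allocator is a parameter purely so that failure can be simulated.
 */
long get_cap_blob(const struct cap_blob *src, void **buffer,
                  void *(*alloc)(size_t))
{
    void *copy = alloc(sizeof(*src));

    if (!copy)
        return -ENOMEM; /* the buggy pattern returned sizeof(*src) here */

    memcpy(copy, src, sizeof(*src));
    *buffer = copy;
    return (long)sizeof(*src);
}

/* Allocator that always fails, used to exercise the error path. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

With `failing_alloc` the function reports -ENOMEM and leaves `*buffer` untouched; with `malloc` it returns the blob size and a valid copy.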

[1] https://syzkaller.appspot.com/bug?id=a55ba438506fe68649a5f50d2d82d56b365e0107

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities")
Reported-by: syzbot <syzbot+9369930ca44f29e60e2d@syzkaller.appspotmail.com>
Cc: stable <stable@vger.kernel.org> # 4.14+
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


02 Jan, 2018 · 1 commit

capabilities: fix buffer overread on very short xattr · dc32b5c3

Authored by Eric Biggers on Jan 01, 2018

If userspace attempted to set a "security.capability" xattr shorter than
4 bytes (e.g. 'setfattr -n security.capability -v x file'), then
cap_convert_nscap() read past the end of the buffer containing the xattr
value because it accessed the ->magic_etc field without verifying that
the xattr value is long enough to contain that field.

Fix it by validating the xattr value size first.
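
A minimal sketch of the "validate the size first" pattern, in userspace C. The names `struct xattr_header` and `read_magic()` and the choice of -EINVAL as the error code are assumptions made for illustration; they are not taken from the kernel source.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout only: a 32-bit header word at the start of the value. */
struct xattr_header {
    uint32_t magic_etc;
};

/*
 * Read the header word from an xattr value supplied by userspace.
 * Returns 0 and stores the word in *magic on success, or -EINVAL
 * when the value is too short to contain the header.
 */
int read_magic(const void *value, size_t size, uint32_t *magic)
{
    struct xattr_header hdr;

    if (!value || size < sizeof(hdr))
        return -EINVAL; /* checked before any byte of the value is read */

    memcpy(&hdr, value, sizeof(hdr));
    *magic = hdr.magic_etc;
    return 0;
}
```

A one-byte value such as the `-v x` example above is rejected before the header field is touched.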

This bug was found using syzkaller with KASAN.  The KASAN report was as
follows (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498
    Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852

    CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
     __dump_stack lib/dump_stack.c:17 [inline]
     dump_stack+0xe3/0x195 lib/dump_stack.c:53
     print_address_description+0x73/0x260 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x235/0x350 mm/kasan/report.c:409
     cap_convert_nscap+0x514/0x630 security/commoncap.c:498
     setxattr+0x2bd/0x350 fs/xattr.c:446
     path_setxattr+0x168/0x1b0 fs/xattr.c:472
     SYSC_setxattr fs/xattr.c:487 [inline]
     SyS_setxattr+0x36/0x50 fs/xattr.c:483
     entry_SYSCALL_64_fastpath+0x18/0x85

Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities")
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


20 Oct, 2017 · 10 commits

capabilities: audit log other surprising conditions · dbbbe110

Authored by Richard Guy Briggs on Oct 11, 2017

The existing condition tested for process effective capabilities set by
file attributes but intended to ignore the change if the result was
unsurprisingly an effective full set in the case root is special with a
setuid root executable file and we are root.

Stated again:
- When you execute a setuid root application, it is no surprise and
  expected that it got all capabilities, so we do not want capabilities
  recorded.
        if (pE_grew && !(pE_fullset && (eff_root || real_root) && root_privileged))

Now make sure we cover other cases:
- If something prevented a setuid root app getting all capabilities and
  it wound up with one capability only, then it is a surprise and should
  be logged.  When it is a setuid root file, we only want capabilities
  when the process does not get full capabilities.
        root_privileged && setuid_root && !pE_fullset

- Similarly if a non-setuid program does pick up capabilities due to
  file system based capabilities, then we want to know what capabilities
  were picked up.  When it has file system based capabilities we want
  the capabilities.
        !is_setuid && (has_fcap && pP_gained)

- If it is a non-setuid file and it gets ambient capabilities, we want
  the capabilities.
        !is_setuid && pA_gained

- These last two are combined into one due to the common first parameter.
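
The conditions listed above can be combined into a single predicate. The following is a sketch only: `struct exec_facts` and `should_log_caps()` are hypothetical names, and the boolean fields simply mirror the shorthand used in this message rather than any kernel data structure.

```c
#include <stdbool.h>

/* Flags named after the shorthand used in the commit message. */
struct exec_facts {
    bool pE_grew;         /* effective set grew beyond the ambient set */
    bool pE_fullset;      /* effective set is the full set */
    bool eff_root;        /* effective uid is root */
    bool real_root;       /* real uid is root */
    bool root_privileged; /* root is special (the NOROOT securebit is clear) */
    bool setuid_root;     /* the executable is setuid root */
    bool is_setuid;       /* the exec changed the uid */
    bool has_fcap;        /* the file carries file capabilities */
    bool pP_gained;       /* permitted set gained capabilities */
    bool pA_gained;       /* ambient set gained capabilities */
};

/* Returns true when the capability change is surprising enough to log. */
bool should_log_caps(const struct exec_facts *f)
{
    /* The unsurprising case: root, root is special, and a full effective set. */
    bool expected_root_fullset = f->pE_fullset &&
        (f->eff_root || f->real_root) && f->root_privileged;

    if (f->pE_grew && !expected_root_fullset)
        return true;
    if (f->root_privileged && f->setuid_root && !f->pE_fullset)
        return true;
    if (!f->is_setuid && ((f->has_fcap && f->pP_gained) || f->pA_gained))
        return true;
    return false;
}
```

Under these assumptions a setuid root program that ends up with the full set is not logged, while the same program with a partial set is.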

Related: https://github.com/linux-audit/audit-kernel/issues/16

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: fix logic for effective root or real root · 588fb2c7

Authored by Richard Guy Briggs on Oct 11, 2017

Now that the logic is inverted, it is much easier to see that both real
root and effective root conditions had to be met to avoid printing the
BPRM_FCAPS record with audit syscalls.  This meant that any setuid root
applications would print a full BPRM_FCAPS record when it wasn't
necessary, cluttering the event output, since the SYSCALL and PATH
records indicated the presence of the setuid bit and effective root user
id.

Require only one of effective root or real root to avoid printing the
unnecessary record.

Ref: commit 3fc689e9 ("Add audit_log_bprm_fcaps/AUDIT_BPRM_FCAPS")
See: https://github.com/linux-audit/audit-kernel/issues/16

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: invert logic for clarity · c0d1adef

Authored by Richard Guy Briggs on Oct 11, 2017

The way the logic was presented made it awkward to read and verify.
Invert the logic using De Morgan's law so that it is easier to read and
understand.
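
As a generic illustration of this kind of rewrite (not the actual condition from the patch), the two hypothetical predicates below are equivalent by De Morgan's law; the second avoids the outer negation.

```c
#include <stdbool.h>

/* Before: a negated disjunction that has to be unpicked mentally. */
bool skip_before(bool a, bool b, bool c)
{
    return !(a || !b || !c);
}

/* After De Morgan: the same predicate written as a plain conjunction. */
bool skip_after(bool a, bool b, bool c)
{
    return !a && b && c;
}
```

Checking all eight input combinations is a cheap way to confirm that such a transformation did not change behaviour.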
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: remove a layer of conditional logic · 02ebbaf4

Authored by Richard Guy Briggs on Oct 11, 2017

Remove a layer of conditional logic to make the use of conditions
easier to read and analyse.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: move audit log decision to function · 9fbc2c79

Authored by Richard Guy Briggs on Oct 11, 2017

Move the audit log decision logic to its own function to isolate the
complexity in one place.
Suggested-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: use intuitive names for id changes · 81a6a012

Authored by Richard Guy Briggs on Oct 11, 2017

Introduce a number of inlines to make the use of the negation of
uid_eq() easier to read and analyse.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: use root_priveleged inline to clarify logic · 9304b46c

Authored by Richard Guy Briggs on Oct 11, 2017

Introduce inline root_privileged() to make use of SECURE_NOROOT
easier to read.
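
A userspace sketch of what such a helper expresses: root is "privileged" exactly when the no-root securebit is clear. The macro `SKETCH_SECBIT_NOROOT`, its bit position, and the function `root_is_privileged()` are assumptions for illustration and do not reproduce the kernel helper.

```c
#include <stdbool.h>

/* Assumed bit position for the "root is not special" securebit. */
#define SKETCH_SECBIT_NOROOT (1u << 0)

/* True when uid 0 still receives its traditional special treatment. */
bool root_is_privileged(unsigned int securebits)
{
    return !(securebits & SKETCH_SECBIT_NOROOT);
}
```

Call sites can then read `if (root_is_privileged(bits))` instead of a negated bit test.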
Suggested-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: rename has_cap to has_fcap · fc7eadf7

Authored by Richard Guy Briggs on Oct 11, 2017

Rename has_cap to has_fcap to clarify it applies to file capabilities
since the entire source file is about capabilities.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: intuitive names for cap gain status · 4c7e715f

Authored by Richard Guy Briggs on Oct 11, 2017

Introduce macros cap_gained, cap_grew, cap_full to make the use of the
negation of is_subset() easier to read and analyse.
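
The intent of these helpers can be sketched with plain bitmasks. This is an illustration under stated assumptions: capability sets are modelled as a single `uint64_t`, `CAPSET_FULL` is a hypothetical all-ones mask, and the helpers are ordinary functions rather than the kernel's macros.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t capset_t;            /* one bit per capability (assumption) */
#define CAPSET_FULL ((capset_t)~0ULL) /* hypothetical "all capabilities" mask */

/* True when every bit set in a is also set in b. */
bool is_subset(capset_t a, capset_t b)
{
    return (a & ~b) == 0;
}

/* The new set holds at least one capability the old set lacked. */
bool cap_gained(capset_t new_set, capset_t old_set)
{
    return !is_subset(new_set, old_set);
}

/* The target set holds at least one capability not present in the source set. */
bool cap_grew(capset_t target, capset_t source)
{
    return !is_subset(target, source);
}

/* The set contains every capability. */
bool cap_full(capset_t set)
{
    return is_subset(CAPSET_FULL, set);
}
```

Named this way, a condition such as `cap_gained(new_permitted, old_permitted)` states its purpose directly instead of requiring the reader to invert a subset test.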
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


capabilities: factor out cap_bprm_set_creds privileged root · db1a8922

Authored by Richard Guy Briggs on Oct 11, 2017

Factor out the case of privileged root from the function
cap_bprm_set_creds() to make the latter easier to read and analyse.
Suggested-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Okay-ished-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


19 Oct, 2017 · 1 commit

commoncap: move assignment of fs_ns to avoid null pointer dereference · 76ba89c7

Authored by Colin Ian King on Sep 04, 2017

The pointer fs_ns is assigned from inode->i_sb->s_user_ns before
a null pointer check on inode, hence if inode is actually null we
will get a null pointer dereference on this assignment. Fix this
by only dereferencing inode after the null pointer check on
inode.
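
The ordering fix can be sketched as follows. The three structs are pared-down stand-ins that keep only the fields mentioned in this message, and `inode_fs_ns()` is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>

/* Minimal stand-ins holding only the fields this sketch needs. */
struct user_ns { int level; };
struct super_block { struct user_ns *s_user_ns; };
struct inode { struct super_block *i_sb; };

/*
 * Return the user namespace that owns the inode's filesystem,
 * or NULL when no inode was supplied.
 */
struct user_ns *inode_fs_ns(const struct inode *inode)
{
    struct user_ns *fs_ns; /* deliberately not initialised from inode here */

    if (!inode)
        return NULL;

    fs_ns = inode->i_sb->s_user_ns; /* safe: inode was checked above */
    return fs_ns;
}
```

Initialising `fs_ns` at its declaration would dereference `inode` before the check, which is the pattern the patch removes.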

Detected by CoverityScan CID#1455328 ("Dereference before null check")

Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


24 Sep, 2017 · 1 commit

security: fix description of values returned by cap_inode_need_killpriv · ab5348c9

Authored by Stefan Berger on Jul 26, 2017

cap_inode_need_killpriv returns 1 if security.capability exists and
has a value and inode_killpriv() is required, 0 otherwise. Fix the
description of the return value to reflect this.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>


02 Sep, 2017 · 1 commit

Introduce v3 namespaced file capabilities · 8db6c34f

Authored by Serge E. Hallyn on May 08, 2017

Root in a non-initial user ns cannot be trusted to write a traditional
security.capability xattr.  If it were allowed to do so, then any
unprivileged user on the host could map his own uid to root in a private
namespace, write the xattr, and execute the file with privilege on the
host.

However supporting file capabilities in a user namespace is very
desirable.  Not doing so means that any programs designed to run with
limited privilege must continue to support other methods of gaining and
dropping privilege.  For instance a program installer must detect
whether file capabilities can be assigned, and assign them if so but set
setuid-root otherwise.  The program in turn must know how to drop
partial capabilities, and do so only if setuid-root.

This patch introduces v3 of the security.capability xattr.  It builds a
vfs_ns_cap_data struct by appending a uid_t rootid to struct
vfs_cap_data.  This is the absolute uid_t (that is, the uid_t in user
namespace which mounted the filesystem, usually init_user_ns) of the
root id in whose namespaces the file capabilities may take effect.
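
The layout change described above can be pictured with two structs. This is a sketch, not the kernel's definition: `cap_data_v2` and `cap_data_v3` are hypothetical names standing in for vfs_cap_data and vfs_ns_cap_data, plain `uint32_t` replaces the kernel's little-endian types, and the count of two 32-bit words per mask is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

#define CAP_U32_WORDS 2 /* assumed: two 32-bit words per capability mask */

/* v2-style blob: a header word followed by permitted/inheritable pairs. */
struct cap_data_v2 {
    uint32_t magic_etc;
    struct {
        uint32_t permitted;
        uint32_t inheritable;
    } data[CAP_U32_WORDS];
};

/* v3-style blob: the same fields with a root uid appended at the end. */
struct cap_data_v3 {
    uint32_t magic_etc;
    struct {
        uint32_t permitted;
        uint32_t inheritable;
    } data[CAP_U32_WORDS];
    uint32_t rootid; /* uid of the root user in whose namespaces the caps apply */
};

/* Number of bytes v3 adds on top of v2. */
size_t v3_extra_bytes(void)
{
    return sizeof(struct cap_data_v3) - sizeof(struct cap_data_v2);
}
```

Because the new field is appended, a reader that understands only the v2 prefix still finds the header and masks at the same offsets.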

When a task asks to write a v2 security.capability xattr, if it is
privileged with respect to the userns which mounted the filesystem, then
nothing should change.  Otherwise, the kernel will transparently rewrite
the xattr as a v3 with the appropriate rootid.  This is done during the
execution of setxattr() to catch user-space-initiated capability writes.
Subsequently, any task executing the file which has the noted kuid as
its root uid, or which is in a descendant user_ns of such a user_ns,
will run the file with capabilities.

Similarly when asking to read file capabilities, a v3 capability will
be presented as v2 if it applies to the caller's namespace.

If a task writes a v3 security.capability, then it can provide a uid for
the xattr so long as the uid is valid in its own user namespace, and it
is privileged with CAP_SETFCAP over its namespace.  The kernel will
translate that rootid to an absolute uid, and write that to disk.  After
this, a task in the writer's namespace will not be able to use those
capabilities (unless rootid was 0), but a task in a namespace where the
given uid is root will.

Only a single security.capability xattr may exist at a time for a given
file.  A task may overwrite an existing xattr so long as it is
privileged over the inode.  Note this is a departure from previous
semantics, which required privilege to remove a security.capability
xattr.  This check can be re-added if deemed useful.

This allows a simple setxattr to work, allows tar/untar to work, and
allows us to tar in one namespace and untar in another while preserving
the capability, without risking leaking privilege into a parent
namespace.

Example using tar:

 $ cp /bin/sleep sleepx
 $ mkdir b1 b2
 $ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1
 $ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2
 $ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx
 $ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar
 $ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx
   b2/sleepx = cap_sys_admin+ep
 # /opt/ltp/testcases/bin/getv3xattr b2/sleepx
   v3 xattr, rootid is 100001

A patch to linux-test-project adding a new set of tests for this
functionality is in the nsfscaps branch at github.com/hallyn/ltp

Changelog:
   Nov 02 2016: fix invalid check at refuse_fcap_overwrite()
   Nov 07 2016: convert rootid from and to fs user_ns
   Mar 28 2017 (from ebiederm):
      . commoncap.c: fix typos - s/v4/v3
      . get_vfs_caps_from_disk: clarify the fs_ns root access check
      . nsfscaps: change the code split for cap_inode_setxattr()
   Apr 09 2017:
      . don't return v3 cap for caps owned by current root.
      . return a v2 cap for a true v2 cap in non-init ns
   Apr 18 2017:
      . Change the flow of fscap writing to support s_user_ns writing.
      . Remove refuse_fcap_overwrite().  The value of the previous
        xattr doesn't matter.
   Apr 24 2017:
      . incorporate Eric's incremental diff
      . move cap_convert_nscap to setxattr and simplify its usage
   May 08 2017:
      . fix leaking dentry refcount in cap_inode_getsecurity
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


02 Aug, 2017 · 2 commits

commoncap: Move cap_elevated calculation into bprm_set_creds · ee67ae7e

Authored by Kees Cook on Jul 18, 2017

Instead of a separate function, open-code the cap_elevated test, which
lets us entirely remove bprm->cap_effective (to use the local "effective"
variable instead), and more accurately examine euid/egid changes via the
existing local "is_setid".

The following LTP tests were run to validate the changes:

	# ./runltp -f syscalls -s cap
	# ./runltp -f securebits
	# ./runltp -f cap_bounds
	# ./runltp -f filecaps

All kernel selftests for capabilities and exec continue to pass as well.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>


commoncap: Refactor to remove bprm_secureexec hook · 46d98eb4

Authored by Kees Cook on Jul 18, 2017

The commoncap implementation of the bprm_secureexec hook is the only LSM
that depends on the final call to its bprm_set_creds hook (since it may
be called for multiple files, it ignores bprm->called_set_creds). As a
result, it cannot safely _clear_ bprm->secureexec since other LSMs may
have set it. Instead, remove the bprm_secureexec hook by introducing a
new flag to bprm specific to commoncap: cap_elevated. This is similar to
cap_effective, but that is used for a specific subset of elevated
privileges, and exists solely to track state from bprm_set_creds to
bprm_secureexec. As such, it will be removed in the next patch.

Here, set the new bprm->cap_elevated flag when setuid/setgid has happened
from bprm_fill_uid() or fscapabilities have been prepared. This temporarily
moves the bprm_secureexec hook to a static inline. The helper will be
removed in the next patch; this makes the step easier to review and bisect,
since this does not introduce any changes to inputs nor outputs to the
"elevated privileges" calculation.

The new flag is merged with the bprm->secureexec flag in setup_new_exec()
since this marks the end of any further prepare_binprm() calls.

Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Serge Hallyn <serge@hallyn.com>


20 Jul, 2017 · 1 commit

security: Use user_namespace::level to avoid redundant iterations in cap_capable() · 64db4c7f

Authored by Kirill Tkhai on May 02, 2017

When ns->level is not larger than cred->user_ns->level,
ns can't be a descendant of cred->user_ns, and
there is no sense in searching its parents.

So, break the cycle earlier and skip needless iterations.
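
The early exit can be sketched with a simplified namespace tree. This is not cap_capable() itself: `struct user_ns` is reduced to a parent pointer and a depth, `ns_is_same_or_descendant()` is a hypothetical helper, and the `steps` counter exists only to make the saved iterations visible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified namespace node: parent link plus depth below the root. */
struct user_ns {
    struct user_ns *parent; /* NULL for the initial namespace */
    int level;              /* 0 for the initial namespace */
};

/*
 * Walk from ns towards the root looking for target. The walk stops as
 * soon as ns is no deeper than target, because from that point on it
 * can no longer reach target by following parent links.
 * *steps reports how many parent links were followed.
 */
bool ns_is_same_or_descendant(const struct user_ns *ns,
                              const struct user_ns *target, int *steps)
{
    *steps = 0;
    for (;;) {
        if (ns == target)
            return true;
        if (ns->level <= target->level)
            return false; /* early exit: no need to climb to the root */
        ns = ns->parent;
        (*steps)++;
    }
}
```

Without the level comparison, a failed lookup would climb all the way to the initial namespace before giving up.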

v2: Change comment as suggested by Andy Lutomirski.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


06 Mar, 2017 · 1 commit

security: mark LSM hooks as __ro_after_init · ca97d939

Authored by James Morris on Feb 15, 2017

Mark all of the registration hooks as __ro_after_init (via the
__lsm_ro_after_init macro).
Signed-off-by: James Morris <james.l.morris@oracle.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>


24 Jan, 2017 · 3 commits

exec: Remove LSM_UNSAFE_PTRACE_CAP · 9227dd2a

Authored by Eric W. Biederman on Jan 23, 2017

With previous changes every location that tests for
LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the
LSM_UNSAFE_PTRACE_CAP redundant, so remove it.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>


exec: Test the ptracer's saved cred to see if the tracee can gain caps · 20523132

Authored by Eric W. Biederman on Jan 23, 2017

Now that we have user namespaces and non-global capabilities, verify
the tracer has capabilities in the relevant user namespace instead
of in the current_user_ns().

As the test for setting LSM_UNSAFE_PTRACE_CAP is currently
ptracer_capable(p, current_user_ns()) and the new task credentials are
in current_user_ns() this change does not have any user visible change
and simply moves the test to where it is used, making the code easier
to read.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>


exec: Don't reset euid and egid when the tracee has CAP_SETUID · 70169420

Authored by Eric W. Biederman on Nov 17, 2016

Don't reset euid and egid when the tracee has CAP_SETUID in
its user namespace. I punted on relaxing this permission check
long ago but now that I have read this code closely it is clear
it is safe to test against CAP_SETUID in the user namespace.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>


19 Jan, 2017 · 1 commit

LSM: Add /sys/kernel/security/lsm · d69dece5

Authored by Casey Schaufler on Jan 18, 2017

I am still tired of having to find indirect ways to determine
what security modules are active on a system. I have added
/sys/kernel/security/lsm, which contains a comma separated
list of the active security modules. No more groping around
in /proc/filesystems or other clever hacks.

Unchanged from previous versions except for being updated
to the latest security next branch.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>


08 Oct, 2016 · 1 commit

xattr: Add __vfs_{get,set,remove}xattr helpers · 5d6c3191

Authored by Andreas Gruenbacher on Sep 29, 2016

Right now, various places in the kernel check for the existence of
getxattr, setxattr, and removexattr inode operations and directly call
those operations.  Switch to helper functions and test for the IOP_XATTR
flag instead.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


24 Jun, 2016 · 2 commits

fs: Treat foreign mounts as nosuid · 380cf5ba

Authored by Andy Lutomirski on Jun 23, 2016

If a process gets access to a mount from a different user
namespace, that process should not be able to take advantage of
setuid files or selinux entrypoints from that filesystem.  Prevent
this by treating mounts from other mount namespaces and those not
owned by current_user_ns() or an ancestor as nosuid.

This will make it safer to allow more complex filesystems to be
mounted in non-root user namespaces.

This does not remove the need for MNT_LOCK_NOSUID.  The setuid,
setgid, and file capability bits can no longer be abused if code in
a user namespace were to clear nosuid on an untrusted filesystem,
but this patch, by itself, is insufficient to protect the system
from abuse of files that, when execed, would increase MAC privilege.

As a more concrete explanation, any task that can manipulate a
vfsmount associated with a given user namespace already has
capabilities in that namespace and all of its descendants.  If they
can cause a malicious setuid, setgid, or file-caps executable to
appear in that mount, then that executable will only allow them to
elevate privileges in exactly the set of namespaces in which they
are already privileged.

On the other hand, if they can cause a malicious executable to
appear with a dangerous MAC label, running it could change the
caller's security context in a way that should not have been
possible, even inside the namespace in which the task is confined.

As a hardening measure, this would have made CVE-2014-5207 much
more difficult to exploit.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


fs: Limit file caps to the user namespace of the super block · d07b846f

Authored by Seth Forshee on Sep 23, 2015

Capability sets attached to files must be ignored except in the
user namespaces where the mounter is privileged, i.e. s_user_ns
and its descendants. Otherwise a vector exists for gaining
privileges in namespaces where a user is not already privileged.

Add a new helper function, current_in_userns(), to test whether a user
namespace is the same as or a descendant of another namespace.
Use this helper to determine whether a file's capability set
should be applied to the caps constructed during exec.
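
A sketch of the rule, assuming a simplified namespace tree: file capabilities are honoured only when the task's namespace is the filesystem's s_user_ns or one of its descendants. `struct user_ns`, `in_userns()` and `usable_file_caps()` are illustrative names with simplified types, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified namespace node: parent link plus depth below the root. */
struct user_ns {
    struct user_ns *parent; /* NULL for the initial namespace */
    int level;              /* 0 for the initial namespace */
};

/* True when child is ancestor itself or one of its descendants. */
bool in_userns(const struct user_ns *ancestor, const struct user_ns *child)
{
    const struct user_ns *ns = child;

    /* Climb until we are no deeper than the candidate ancestor. */
    while (ns->level > ancestor->level)
        ns = ns->parent;
    return ns == ancestor;
}

/*
 * Capabilities attached to a file count only for tasks inside the
 * namespace subtree of the user namespace that mounted the filesystem.
 */
uint64_t usable_file_caps(uint64_t file_caps, const struct user_ns *fs_ns,
                          const struct user_ns *task_ns)
{
    return in_userns(fs_ns, task_ns) ? file_caps : 0;
}
```

A task in a sibling or parent namespace of the mounter therefore gets an empty set from the file, which closes the escalation vector described above.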

--EWB Replaced in_userns with the simpler current_in_userns.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


23 Apr, 2016 · 1 commit

security: Introduce security_settime64() · 457db29b

Authored by Baolin Wang on Apr 08, 2016

security_settime() uses a timespec, which is not year 2038 safe
on 32bit systems. Thus this patch introduces the security_settime64()
function with timespec64 type. We also convert the cap_settime() helper
function to use the 64bit types.

This patch then moves security_settime() to the header file as an
inline helper function so that existing users can be iteratively
converted.

None of the existing hooks is using the timespec argument and therefore
the patch is not making any functional changes.

Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Kees Cook <keescook@chromium.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
[jstultz: Reworded commit message]
Signed-off-by: John Stultz <john.stultz@linaro.org>


11 Apr, 2016 · 1 commit

->getxattr(): pass dentry and inode as separate arguments · ce23e640

Authored by Al Viro on Apr 11, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

21 Jan, 2016 · 1 commit

ptrace: use fsuid, fsgid, effective creds for fs access checks · caaee623

Authored by Jann Horn on Jan 20, 2016

By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

 /proc/$pid/stat - uses the check for determining whether pointers
     should be visible, useful for bypassing ASLR
 /proc/$pid/maps - also useful for bypassing ASLR
 /proc/$pid/cwd - useful for gaining access to restricted
     directories that contain files with lax permissions, e.g. in
     this scenario:
     lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
     drwx------ root root /root
     drwxr-xr-x root root /root/foobar
     -rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


05 Sep, 2015 · 2 commits

capabilities: add a securebit to disable PR_CAP_AMBIENT_RAISE · 746bf6d6

Authored by Andy Lutomirski on Sep 04, 2015

Per Andrew Morgan's request, add a securebit to allow admins to disable
PR_CAP_AMBIENT_RAISE.  This securebit will prevent processes from adding
capabilities to their ambient set.

For simplicity, this disables PR_CAP_AMBIENT_RAISE entirely rather than
just disabling setting previously cleared bits.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


capabilities: ambient capabilities · 58319057

Authored by Andy Lutomirski on Sep 04, 2015

Credit where credit is due: this idea comes from Christoph Lameter with
a lot of valuable input from Serge Hallyn.  This patch is heavily based
on Christoph's patch.

===== The status quo =====

On Linux, there are a number of capabilities defined by the kernel.  To
perform various privileged tasks, processes can wield capabilities that
they hold.

Each task has four capability masks: effective (pE), permitted (pP),
inheritable (pI), and a bounding set (X).  When the kernel checks for a
capability, it checks pE.  The other capability masks serve to modify
what capabilities can be in pE.

Any task can remove capabilities from pE, pP, or pI at any time.  If a
task has a capability in pP, it can add that capability to pE and/or pI.
If a task has CAP_SETPCAP, then it can add any capability to pI, and it
can remove capabilities from X.

Tasks are not the only things that can have capabilities; files can also
have capabilities.  A file can have no capability information at all [1].
If a file has capability information, then it has a permitted mask (fP)
and an inheritable mask (fI) as well as a single effective bit (fE) [2].
File capabilities modify the capabilities of tasks that execve(2) them.

A task that successfully calls execve has its capabilities modified for
the file ultimately being executed (i.e.  the binary itself if that
binary is ELF or for the interpreter if the binary is a script.) [3] In
the capability evolution rules, for each mask Z, pZ represents the old
value and pZ' represents the new value.  The rules are:

  pP' = (X & fP) | (pI & fI)
  pI' = pI
  pE' = (fE ? pP' : 0)
  X is unchanged
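
A minimal sketch of the rules above in C, for readers who want to try them on concrete masks.  The struct and function names (old_task_caps, old_file_caps, exec_transform_old) are hypothetical and are not the kernel's data structures; each mask is assumed to fit in a 64-bit integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the kernel's struct cred. */
struct old_task_caps {
	uint64_t pP, pI, pE, X;
};

struct old_file_caps {
	uint64_t fP, fI;
	bool fE;	/* a single bit, not a mask */
};

/* Apply the pre-patch execve rules to one task/file pair. */
struct old_task_caps exec_transform_old(struct old_task_caps t,
					struct old_file_caps f)
{
	struct old_task_caps n;

	n.X  = t.X;				/* X is unchanged */
	n.pP = (t.X & f.fP) | (t.pI & f.fI);	/* pP' */
	n.pI = t.pI;				/* pI' */
	n.pE = f.fE ? n.pP : 0;			/* pE' */
	return n;
}
```

With fP and fI both empty and fE false, the result has pP' == 0 and pE' == 0 regardless of what the task held before the exec.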

For setuid binaries, fP, fI, and fE are modified by a moderately
complicated set of rules that emulate POSIX behavior.  Similarly, if
euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
(primarily, fP and fI usually end up being the full set).  For nonroot
users executing binaries with neither setuid nor file caps, fI and fP
are empty and fE is false.

As an extra complication, if you execute a process as nonroot and fE is
set, then the "secure exec" rules are in effect: AT_SECURE gets set,
LD_PRELOAD doesn't work, etc.

This is rather messy.  We've learned that making any changes is
dangerous, though: if a new kernel version allows an unprivileged
program to change its security state in a way that persists across
execution of a setuid program or a program with file caps, this
persistent state is surprisingly likely to allow setuid or file-capped
programs to be exploited for privilege escalation.

===== The problem =====

Capability inheritance is basically useless.

If you aren't root and you execute an ordinary binary, fI is zero, so
your capabilities have no effect whatsoever on pP'.  This means that you
can't usefully execute a helper process or a shell command with elevated
capabilities if you aren't root.

On current kernels, you can sort of work around this by setting fI to
the full set for most or all non-setuid executable files.  This causes
pP' = pI for nonroot, and inheritance works.  No one does this because
it's a PITA and it isn't even supported on most filesystems.

If you try this, you'll discover that every nonroot program ends up with
secure exec rules, breaking many things.

This is a problem that has bitten many people who have tried to use
capabilities for anything useful.

===== The proposed change =====

This patch adds a fifth capability mask called the ambient mask (pA).
pA does what most people expect pI to do.

pA obeys the invariant that no bit can ever be set in pA if it is not
set in both pP and pI.  Dropping a bit from pP or pI drops that bit from
pA.  This ensures that existing programs that try to drop capabilities
still do so, with a complication.  Because capability inheritance is so
broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
then calling execve effectively drops capabilities.  Therefore,
setresuid from root to nonroot conditionally clears pA unless
SECBIT_NO_SETUID_FIXUP is set.  Processes that don't like this can
re-add bits to pA afterwards.
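
The invariant can be illustrated with a small userspace model.  This is a sketch only: the struct amb_task type and the helper names are hypothetical, the -1 return value stands in for whatever error the kernel reports, and cap is assumed to be in the range 0..63.

```c
#include <stdint.h>

/* Hypothetical model of the three masks involved in the invariant. */
struct amb_task {
	uint64_t pP, pI, pA;
};

/* Raise a bit in pA only if it is already set in both pP and pI. */
int amb_raise(struct amb_task *t, int cap)
{
	uint64_t bit = 1ULL << cap;

	if (!(t->pP & bit) || !(t->pI & bit))
		return -1;	/* would break the invariant */
	t->pA |= bit;
	return 0;
}

/* Dropping a bit from pP also drops it from pA. */
void amb_drop_permitted(struct amb_task *t, int cap)
{
	t->pP &= ~(1ULL << cap);
	t->pA &= t->pP & t->pI;	/* re-establish pA as a subset of pP & pI */
}
```

A program that drops a capability from pP through this path cannot leave it behind in pA, which is the property the paragraph above relies on.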

The capability evolution rules are changed:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (X & fP) | (pI & fI) | pA'
  pI' = pI
  pE' = (fE ? pP' : pA')
  X is unchanged

If you are nonroot but you have a capability, you can add it to pA.  If
you do so, your children get that capability in pA, pP, and pE.  For
example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
automatically bind low-numbered ports.  Hallelujah!
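
The changed rules can be sketched the same way.  The types and names below (amb_exec_task, amb_exec_file, exec_transform_ambient) are hypothetical, and the two boolean fields are a simplification of the "file caps or setuid or setgid" condition; EX_CAP_NET_BIND_SERVICE mirrors the value of CAP_NET_BIND_SERVICE in <linux/capability.h>.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_CAP_NET_BIND_SERVICE 10	/* same value as CAP_NET_BIND_SERVICE */

/* Hypothetical types for illustration; not the kernel's struct cred. */
struct amb_exec_task {
	uint64_t pP, pI, pE, pA, X;
};

struct amb_exec_file {
	uint64_t fP, fI;
	bool fE;
	bool has_file_caps;		/* file carries capability information */
	bool is_setuid_or_setgid;	/* setuid or setgid applies on exec */
};

/* Apply the execve rules with the ambient mask included. */
struct amb_exec_task exec_transform_ambient(struct amb_exec_task t,
					    struct amb_exec_file f)
{
	struct amb_exec_task n;

	n.X  = t.X;
	n.pA = (f.has_file_caps || f.is_setuid_or_setgid) ? 0 : t.pA;
	n.pP = (t.X & f.fP) | (t.pI & f.fI) | n.pA;
	n.pI = t.pI;
	n.pE = f.fE ? n.pP : n.pA;
	return n;
}
```

Starting from pP = pI = pA = 1 << EX_CAP_NET_BIND_SERVICE and executing an ordinary binary leaves that bit set in pA', pP', and pE'.  Marking the file setuid clears pA', and with it pP' and pE'.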

Unprivileged users can create user namespaces, map themselves to a
nonzero uid, and create both privileged (relative to their namespace)
and unprivileged process trees.  This is currently more or less
impossible.  Hallelujah!

You cannot use pA to try to subvert a setuid, setgid, or file-capped
program: if you execute any such program, pA gets cleared and the
resulting evolution rules are unchanged by this patch.

Users with nonzero pA are unlikely to unintentionally leak that
capability.  If they run programs that try to drop privileges, dropping
privileges will still work.

It's worth noting that the degree of paranoia in this patch could
possibly be reduced without causing serious problems.  Specifically, if
we allowed pA to persist across executing non-pA-aware setuid binaries
and across setresuid, then, naively, the only capabilities that could
leak as a result would be the capabilities in pA, and any attacker
*already* has those capabilities.  This would make me nervous, though --
setuid binaries that tried to privilege-separate might fail to do so,
and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
unexpected side effects.  (Whether these unexpected side effects would
be exploitable is an open question.) I've therefore taken the more
paranoid route.  We can revisit this later.

An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
ambient capabilities.  I think that this would be annoying and would
make granting otherwise unprivileged users minor ambient capabilities
(CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
it is with this patch.

===== Footnotes =====

[1] Files that are missing the "security.capability" xattr or that have
unrecognized values for that xattr end up with has_cap set to false.
The code that does that appears to be complicated for no good reason.

[2] The libcap capability mask parsers and formatters are dangerously
misleading and the documentation is flat-out wrong.  fE is *not* a mask;
it's a single bit.  This has probably confused every single person who
has tried to use file capabilities.

[3] Linux very confusingly processes both the script and the interpreter
if applicable, for reasons that elude me.  The results from thinking
about a script's file capabilities and/or setuid bits are mostly
discarded.

Preliminary userspace code is here, but it needs updating:
https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2

Here is a test program that can be used to verify the functionality
(from Christoph):

/*
 * Test program for the ambient capabilities. This program spawns a shell
 * that allows running processes with a defined set of capabilities.
 *
 * (C) 2015 Christoph Lameter <cl@linux.com>
 * Released under: GPL v3 or later.
 *
 *
 * Compile using:
 *
 *	gcc -o ambient_test ambient_test.c -lcap-ng
 *
 * This program must have the following capabilities to run properly:
 * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
 *
 * A command to equip the binary with the right caps is:
 *
 *	setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
 *
 *
 * To get a shell with additional caps that can be inherited by other processes:
 *
 *	./ambient_test /bin/bash
 *
 *
 * Verifying that it works:
 *
 * From the bash spawned by ambient_test run
 *
 *	cat /proc/$$/status
 *
 * and have a look at the capabilities.
 */

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>	/* execv() */
#include <cap-ng.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Definitions from the kernel header files. These are going to be removed
 * when the /usr/include files have these defined.
 */
#define PR_CAP_AMBIENT 47
#define PR_CAP_AMBIENT_IS_SET 1
#define PR_CAP_AMBIENT_RAISE 2
#define PR_CAP_AMBIENT_LOWER 3
#define PR_CAP_AMBIENT_CLEAR_ALL 4

static void set_ambient_cap(int cap)
{
	int rc;

	capng_get_caps_process();
	rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
	if (rc) {
		printf("Cannot add inheritable cap\n");
		exit(2);
	}
	capng_apply(CAPNG_SELECT_CAPS);

	/* Note the two 0s at the end. Kernel checks for these */
	if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
		perror("Cannot set cap");
		exit(1);
	}
}

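int main(int argc, char **argv)
{
	/* The program to exec must be given as the first argument */
	if (argc < 2) {
		fprintf(stderr, "Usage: %s <program> [args...]\n", argv[0]);
		exit(1);
	}

	set_ambient_cap(CAP_NET_RAW);
	set_ambient_cap(CAP_NET_ADMIN);
	set_ambient_cap(CAP_SYS_NICE);
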
	printf("Ambient_test forking shell\n");
	if (execv(argv[1], argv + 1))
		perror("Cannot exec");

	return 0;
}
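
The verification step can also be done programmatically.  The helper below is a sketch that parses one line of /proc/<pid>/status; the function name parse_cap_field is hypothetical, and the assumption is that capability lines have the form "Name:<tab><hex mask>", as in the CapInh/CapPrm/CapEff/CapBnd lines (the ambient mask is reported as CapAmb).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse a status line such as "CapAmb:\t0000000000803000" for the given
 * field name.  Returns 1 and stores the mask on a match, 0 otherwise.
 */
int parse_cap_field(const char *line, const char *field, uint64_t *mask)
{
	size_t n = strlen(field);

	if (strncmp(line, field, n) != 0 || line[n] != ':')
		return 0;
	/* %x conversions skip the leading tab before the hex digits */
	return sscanf(line + n + 1, "%" SCNx64, mask) == 1;
}
```

Feeding it each line read with fgets() from /proc/self/status in the spawned shell's children should yield a CapAmb mask of 0x803000 if all three capabilities were raised: CAP_NET_ADMIN is bit 12, CAP_NET_RAW is bit 13, and CAP_SYS_NICE is bit 23.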

Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

58319057

12 May, 2015 1 commit

LSM: Switch to lists of hooks · b1d9e6b0

Committed by Casey Schaufler on May 02, 2015

Instead of using a vector of security operations
with explicit, special case stacking of the capability
and yama hooks use lists of hooks with capability and
yama hooks included as appropriate.

The security_operations structure is no longer required.
Instead, there is a union of the function pointers that
allows all the hooks lists to use a common mechanism for
list management while retaining typing. Each module
supplies an array describing the hooks it provides instead
of a sparsely populated security_operations structure.
The description includes the element that gets put on
the hook list, avoiding the issues surrounding individual
element allocation.

The method for registering security modules is changed to
reflect the information available. The method for removing
a module, currently only used by SELinux, has also changed.
It should be generic now; however, if there are potential
race conditions based on the ordering of hook removal, they need
to be addressed by the calling module.

The security hooks are called from the lists and the first
failure is returned.
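
The "first failure is returned" behaviour can be sketched in userspace C as follows.  This is an illustration only: the hook signature, the hook_entry type, the sample hooks, and call_hooks() are all hypothetical and do not reproduce the kernel's list or hook definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook signature: 0 means allow, negative errno means deny. */
typedef int (*hook_fn)(int arg);

/* One element on a singly linked hook list. */
struct hook_entry {
	hook_fn fn;
	struct hook_entry *next;
};

/* Sample hooks standing in for different security modules. */
int hook_allow(int arg)      { (void)arg; return 0; }
int hook_deny_perm(int arg)  { (void)arg; return -EPERM; }
int hook_deny_acces(int arg) { (void)arg; return -EACCES; }

/* Walk the list in order and stop at the first hook that fails. */
int call_hooks(const struct hook_entry *head, int arg)
{
	const struct hook_entry *h;

	for (h = head; h != NULL; h = h->next) {
		int rc = h->fn(arg);

		if (rc != 0)
			return rc;
	}
	return 0;
}
```

With a list of hook_allow, hook_deny_perm, hook_deny_acces in that order, call_hooks() returns -EPERM and the third hook is never called.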

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>

b1d9e6b0

16 Apr, 2015 1 commit

VFS: security/: d_backing_inode() annotations · c6f493d6

Committed by David Howells on Mar 17, 2015

Most of the ->d_inode uses there refer to the same inode IO would
go to, i.e. d_backing_inode().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c6f493d6

26 Jan, 2015 1 commit

file->f_path.dentry is pinned down for as long as the file is open... · f4a4a8b1

Committed by Al Viro on Dec 28, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f4a4a8b1

20 Nov, 2014 1 commit

kill f_dentry uses · b583043e

Committed by Al Viro on Oct 31, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b583043e

24 Jul, 2014 2 commits

CAPABILITIES: remove undefined caps from all processes · 7d8b6c63

Committed by Eric Paris on Jul 23, 2014

This is effectively a revert of 7b9a7ec5
plus fixing it a different way...

We found, when trying to run an application from an application which
had dropped privs, that the kernel does security checks on undefined
capability bits.  This was ESPECIALLY difficult to debug as those
undefined bits are hidden from /proc/$PID/status.

Consider a root application which drops all capabilities from ALL 4
capability sets.  We assume, since the application is going to set
eff/perm/inh from an array that it will clear not only the defined caps
less than CAP_LAST_CAP, but also the higher 28ish bits which are
undefined future capabilities.

The BSET gets cleared differently.  Instead it is cleared one bit at a
time.  The problem here is that in security/commoncap.c::cap_task_prctl()
we actually check the validity of a capability being read.  So any task
which attempts to 'read all things set in bset' followed by 'unset all
things set in bset' will not even attempt to unset the undefined bits
higher than CAP_LAST_CAP.

So the 'parent' will look something like:
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	ffffffc000000000
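
The CapBnd value above is exactly the set of bits higher than CAP_LAST_CAP.  The sketch below reproduces it, assuming CAP_LAST_CAP is 37, which is the value the mask implies (bits 38 through 63 are set); the function name undefined_cap_bits is hypothetical.

```c
#include <stdint.h>

/*
 * Return the mask of bits strictly above last_cap, i.e. the bits that
 * do not correspond to any defined capability.  last_cap must be < 63.
 */
uint64_t undefined_cap_bits(int last_cap)
{
	uint64_t defined = (1ULL << (last_cap + 1)) - 1;

	return ~defined;
}
```

undefined_cap_bits(37) evaluates to 0xffffffc000000000, matching the CapBnd line: every defined capability has been dropped, and only the undefined bits remain.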

All of this 'should' be fine, given that these are undefined bits that
aren't supposed to have anything to do with permissions.  But they do...

So let's now consider a task which cleared the eff/perm/inh completely
and cleared all of the valid caps in the bset (but not the invalid caps
it couldn't read out of the kernel).  We know that this is exactly what
the libcap-ng library does and what the go capabilities library does.
They both leave you in that above situation if you try to clear all of
your capabilities from all 4 sets.  If that root task calls execve()
the child task will pick up all caps not blocked by the bset.  The bset
however does not block bits higher than CAP_LAST_CAP.  So now the child
task has bits in eff which are not in the parent.  These are
'meaningless' undefined bits, but still bits which the parent doesn't
have.

The problem is now in cred_cap_issubset() (or any operation which does a
subset test) as the child, while a subset for valid cap bits, is not a
subset for invalid cap bits!  So now we set during commit creds that
the child is not dumpable, given it is 'more priv' than its parent.  It
also means the parent cannot ptrace the child and other stupidity.
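
The failing subset test can be reproduced with plain 64-bit masks.  This is a sketch under stated assumptions: caps_is_subset() and mask_defined() are hypothetical helper names modelled on the description above, and CAP_LAST_CAP is taken to be 37, as implied by the CapBnd value shown earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every bit set in a is also set in set. */
bool caps_is_subset(uint64_t a, uint64_t set)
{
	return (a & ~set) == 0;
}

/* Keep only the bits for defined capabilities (0..last_cap, last_cap < 63). */
uint64_t mask_defined(uint64_t caps, int last_cap)
{
	return caps & ((1ULL << (last_cap + 1)) - 1);
}
```

With a parent effective set of 0 and a child effective set of 0xffffffc000000000, caps_is_subset(child, parent) is false even though the child holds no defined capability.  After mask_defined(child, 37) the child set is 0 and the subset test passes.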

The solution here:
1) stop hiding capability bits in status
	This makes debugging easier!

2) stop giving any task undefined capability bits.  It's simple: if you
don't put those invalid bits in CAP_FULL_SET you won't get them in init
and you won't get them in any other task either.
	This fixes the cap_issubset() tests and resulting fallout (which
	made the init task in a docker container untraceable among other
	things)

3) mask out undefined bits when sys_capset() is called as it might use
~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
	This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.

4) mask out undefined bits when we read a file capability off of disk as
again likely all bits are set in the xattr for forward/backward
compatibility.
	This lets 'setcap all+pe /bin/bash; /bin/bash' run

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Dan Walsh <dwalsh@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>

7d8b6c63

commoncap: don't alloc the credential unless needed in cap_task_prctl · 6d6f3328

Committed by Tetsuo Handa on Jul 22, 2014

In function cap_task_prctl(), we would allocate a credential
unconditionally and then check if we support the requested function.
If not, we would release this credential with abort_creds() using the
RCU method. But on some archs such as powerpc, sys_prctl is heavily
used to get/set the floating point exception mode. So the unnecessary
allocating/releasing of credentials not only introduces runtime overhead
but also causes OOM due to the RCU implementation.

This patch removes abort_creds() from cap_task_prctl() by calling
prepare_creds() only when we need to modify it.

Reported-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>

6d6f3328

31 Aug, 2013 2 commits

capabilities: allow nice if we are privileged · f54fb863

Committed by Serge Hallyn on Jul 23, 2013

We allow task A to change B's nice level if it has a superset of
B's privileges, or if it has CAP_SYS_NICE.  Also allow it if A has
CAP_SYS_NICE with respect to B - meaning it is root in the same
namespace, or it created B's namespace.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

f54fb863

userns: Allow PR_CAPBSET_DROP in a user namespace. · 160da84d

Committed by Eric W. Biederman on Jul 02, 2013

As the capabilities and capability bounding set are per user namespace
properties it is safe to allow changing them with just CAP_SETPCAP
permission in the user namespace.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

160da84d

openeuler / Kernel synced successfully more than 1 year ago
