1. 19 Nov, 2009 1 commit
    • ima: replace GFP_KERNEL with GFP_NOFS · c09c59e6
      Committed by Mimi Zohar
      While running fsstress tests on the NFSv4 mounted ext3 and ext4
      filesystem, the following call trace was generated on the nfs
      server machine.
      
      Replace GFP_KERNEL with GFP_NOFS in ima_iint_insert() to avoid a
      potential deadlock.
      
           =================================
          [ INFO: inconsistent lock state ]
          2.6.31-31.el6.x86_64 #1
          ---------------------------------
          inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
          kswapd2/75 [HC0[0]:SC0[0]:HE1:SE1] takes:
           (jbd2_handle){+.+.?.}, at: [<ffffffff811edd5e>] jbd2_journal_start+0xfe/0x13f
          {RECLAIM_FS-ON-W} state was registered at:
            [<ffffffff81091e40>] mark_held_locks+0x65/0x99
            [<ffffffff81091f31>] lockdep_trace_alloc+0xbd/0xf5
            [<ffffffff81126fdd>] kmem_cache_alloc+0x40/0x185
            [<ffffffff812344d7>] ima_iint_insert+0x3d/0xf1
            [<ffffffff812345b0>] ima_inode_alloc+0x25/0x44
            [<ffffffff811484ac>] inode_init_always+0xec/0x271
            [<ffffffff81148682>] alloc_inode+0x51/0xa1
            [<ffffffff81148700>] new_inode+0x2e/0x94
            [<ffffffff811b2f08>] ext4_new_inode+0xb8/0xdc9
            [<ffffffff811be611>] ext4_create+0xcf/0x175
            [<ffffffff8113e2cd>] vfs_create+0x82/0xb8
            [<ffffffff8113f337>] do_filp_open+0x32c/0x9ee
            [<ffffffff811309b9>] do_sys_open+0x6c/0x12c
            [<ffffffff81130adc>] sys_open+0x2e/0x44
            [<ffffffff81011e42>] system_call_fastpath+0x16/0x1b
            [<ffffffffffffffff>] 0xffffffffffffffff
          irq event stamp: 90371
          hardirqs last  enabled at (90371): [<ffffffff8112708d>]
          kmem_cache_alloc+0xf0/0x185
          hardirqs last disabled at (90370): [<ffffffff81127026>]
          kmem_cache_alloc+0x89/0x185
          softirqs last  enabled at (89492): [<ffffffff81068ecf>]
          __do_softirq+0x1bf/0x1eb
          softirqs last disabled at (89477): [<ffffffff8101312c>] call_softirq+0x1c/0x30
      
          other info that might help us debug this:
          2 locks held by kswapd2/75:
           #0:  (shrinker_rwsem){++++..}, at: [<ffffffff810f98ba>] shrink_slab+0x44/0x177
           #1:  (&type->s_umount_key#25){++++..}, at: [<ffffffff811450ba>]
      Reported-by: Muni P. Beerakam <mbeeraka@in.ibm.com>
      Reported-by: Amit K. Arora <amitarora@in.ibm.com>
      Cc: stable@kernel.org
      Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  2. 19 Oct, 2009 1 commit
    • inet: rename some inet_sock fields · c720c7e8
      Committed by Eric Dumazet
      In order to have better cache layouts of struct sock (separate zones
      for rx/tx paths), we need this preliminary patch.
      
      Goal is to transfer fields used at lookup time in the first
      read-mostly cache line (inside struct sock_common) and move sk_refcnt
      to a separate cache line (only written by rx path).
      
      This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
      sport and id fields. This allows a future patch to define these
      fields as macros, like sk_refcnt, without name clashes.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 16 Oct, 2009 1 commit
    • KEYS: get_instantiation_keyring() should inc the keyring refcount in all cases · 21279cfa
      Committed by David Howells
      The destination keyring specified to request_key() and co. is made available to
      the process that instantiates the key (the slave process started by
      /sbin/request-key typically).  This is passed in the request_key_auth struct as
      the dest_keyring member.
      
      keyctl_instantiate_key() and keyctl_negate_key() call get_instantiation_keyring()
      to get the keyring to attach the newly constructed key to at the end of
      instantiation.  This may be given a specific keyring into which a link will be
      made later, or it may be asked to find the keyring passed to request_key().  In
      the former case, it returns a keyring with the refcount incremented by
      lookup_user_key(); in the latter case, it returns the keyring from the
      request_key_auth struct - and does _not_ increment the refcount.
      
      The latter case will eventually result in an oops when the keyring prematurely
      runs out of references and gets destroyed.  The effect may take some time to
      show up as the key is destroyed lazily.
      
      To fix this, the keyring returned by get_instantiation_keyring() must always
      have its refcount incremented, no matter where it comes from.
      
      This can be tested by setting /etc/request-key.conf to:
      
      #OP	TYPE	DESCRIPTION	CALLOUT INFO	PROGRAM ARG1 ARG2 ARG3 ...
      #======	=======	===============	===============	===============================
      create  *	test:*		*		|/bin/false %u %g %d %{user:_display}
      negate	*	*		*		/bin/keyctl negate %k 10 @u
      
      and then doing:
      
      	keyctl add user _display aaaaaaaa @u
      	while keyctl request2 user test:x test:x @u &&
      	      keyctl list @u;
      	do
      		keyctl request2 user test:x test:x @u;
      		sleep 31;
      		keyctl list @u;
      	done
      
      which will oops eventually.  Changing the negate line to have @u rather than
      %S at the end is important as that forces the latter case by passing a special
      keyring ID rather than an actual keyring ID.
      Reported-by: Alexander Zangerl <az@bond.edu.au>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Alexander Zangerl <az@bond.edu.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
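The refcount rule from the commit above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel code: `struct toy_keyring` and every `toy_*` function are hypothetical stand-ins, and the only behaviour modelled is that both lookup paths hand back a counted reference.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kernel keyring: only the reference count matters here. */
struct toy_keyring {
    int usage;
};

/* Take one reference and return the same pointer (the usual "get" idiom). */
static struct toy_keyring *toy_keyring_get(struct toy_keyring *k)
{
    k->usage++;
    return k;
}

/* Drop one reference; a real object would be destroyed when usage hits 0. */
static void toy_keyring_put(struct toy_keyring *k)
{
    assert(k->usage > 0);
    k->usage--;
}

/*
 * Hypothetical analogue of get_instantiation_keyring().
 * 'named' models a keyring chosen explicitly by the caller; 'from_auth'
 * models the dest_keyring stored in the request_key_auth struct.
 * Both paths return a reference the caller owns.
 */
static struct toy_keyring *toy_get_instantiation_keyring(
        struct toy_keyring *named, struct toy_keyring *from_auth)
{
    if (named)
        return toy_keyring_get(named);      /* former case */
    if (from_auth)
        return toy_keyring_get(from_auth);  /* latter case: now counted too */
    return NULL;
}
```

With this shape the caller can drop its reference unconditionally when instantiation finishes, whichever path supplied the keyring, and the count never falls below what the other holders expect.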
  4. 02 Oct, 2009 1 commit
  5. 24 Sep, 2009 6 commits
    • sysctl: remove "struct file *" argument of ->proc_handler · 8d65af78
      Committed by Alexey Dobriyan
      It's unused.
      
      It isn't needed -- read or write flag is already passed and sysctl
      shouldn't care about the rest.
      
      It _was_ used in two places at arch/frv for some reason.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • do_wait() wakeup optimization: change __wake_up_parent() to use filtered wakeup · 0b7570e7
      Committed by Oleg Nesterov
      Ratan Nalumasu reported that a process with many threads performs
      unnecessary wakeups.  Every waiting thread in the process wakes up to loop
      through the children and see that the only ones it cares about are still
      not ready.
      
      Now that we have struct wait_opts we can change do_wait/__wake_up_parent
      to use filtered wakeups.
      
      We can make child_wait_callback() more clever later; right now it only
      checks eligible_child().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ratan Nalumasu <rnalumasu@gmail.com>
      Cc: Vitaly Mayatskikh <vmayatsk@redhat.com>
      Acked-by: James Morris <jmorris@namei.org>
      Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cgroups: let ss->can_attach and ss->attach do whole threadgroups at a time · be367d09
      Committed by Ben Blum
      Alter the ss->can_attach and ss->attach functions to be able to deal with
      a whole threadgroup at a time, for use in cgroup_attach_proc.  (This is a
      pre-patch to cgroup-procs-writable.patch.)
      
      Currently, the new mode of the attach function can only tell the subsystem
      about the old cgroup of the threadgroup leader.  No subsystem currently
      needs that information for each thread that's being moved, but if one were
      to be added (for example, one that counts tasks within a group) this part
      would need to be reworked to tell the subsystem the right information.
      
      [hidave.darkstar@gmail.com: fix build]
      Signed-off-by: Ben Blum <bblum@google.com>
      Signed-off-by: Paul Menage <menage@google.com>
      Acked-by: Li Zefan <lizf@cn.fujitsu.com>
      Reviewed-by: Matt Helsley <matthltc@us.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Dave Young <hidave.darkstar@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lsm: Use a compressed IPv6 string format in audit events · d8116591
      Committed by Paul Moore
      Currently the audit subsystem prints uncompressed IPv6 addresses, which not
      only differs from common usage but also results in needlessly large audit
      strings.  This patch fixes this by converting audit to always print
      compressed IPv6 addresses.
      
      Old message example:
      
       audit(1253576792.161:30): avc:  denied  { ingress } for
        saddr=0000:0000:0000:0000:0000:0000:0000:0001 src=5000
        daddr=0000:0000:0000:0000:0000:0000:0000:0001 dest=35502 netif=lo
        scontext=system_u:object_r:unlabeled_t:s15:c0.c1023
        tcontext=system_u:object_r:lo_netif_t:s0-s15:c0.c1023 tclass=netif
      
      New message example:
      
       audit(1253576792.161:30): avc:  denied  { ingress } for
        saddr=::1 src=5000 daddr=::1 dest=35502 netif=lo
        scontext=system_u:object_r:unlabeled_t:s15:c0.c1023
        tcontext=system_u:object_r:lo_netif_t:s0-s15:c0.c1023 tclass=netif
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
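The difference between the two message formats above can be reproduced in userspace. The sketch below is only an analogue of the idea, assuming a POSIX system: it uses the standard `inet_pton()` and `inet_ntop()` calls rather than anything from the kernel's audit code, and `ipv6_compress()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Parse a textual IPv6 address in any valid form and write it back in the
 * compressed form produced by inet_ntop().
 * 'out' should hold at least INET6_ADDRSTRLEN bytes.
 * Returns 0 on success, -1 if the input cannot be parsed or does not fit.
 */
static int ipv6_compress(const char *in, char *out, size_t outlen)
{
    struct in6_addr addr;

    assert(in != NULL && out != NULL);

    if (inet_pton(AF_INET6, in, &addr) != 1)
        return -1;
    if (inet_ntop(AF_INET6, &addr, out, (socklen_t)outlen) == NULL)
        return -1;
    return 0;
}
```

Passing the `saddr` value from the old message example through this helper yields the `::1` form shown in the new message example.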
    • SELinux: do not destroy the avc_cache_nodep · 5224ee08
      Committed by Eric Paris
      The security_ops reset done when SELinux is disabled at run time is done
      after the avc cache is freed and after the kmem_cache for the avc is also
      freed.  This means that between the time the selinux disable code destroys
      the avc_node_cachep another process could make a security request and could
      try to allocate from the cache.  We are just going to leave the cachep around,
      like we always have.
      
      SELinux:  Disabled at runtime.
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffff81122537>] kmem_cache_alloc+0x9a/0x185
      PGD 0
      Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      last sysfs file:
      CPU 1
      Modules linked in:
      Pid: 12, comm: khelper Not tainted 2.6.31-tip-05525-g0eeacc6-dirty #14819
      System Product Name
      RIP: 0010:[<ffffffff81122537>]  [<ffffffff81122537>]
      kmem_cache_alloc+0x9a/0x185
      RSP: 0018:ffff88003f9258b0  EFLAGS: 00010086
      RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000078c0129e
      RDX: 0000000000000000 RSI: ffffffff8130b626 RDI: ffffffff81122528
      RBP: ffff88003f925900 R08: 0000000078c0129e R09: 0000000000000001
      R10: 0000000000000000 R11: 0000000078c0129e R12: 0000000000000246
      R13: 0000000000008020 R14: ffff88003f8586d8 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff880002b00000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: ffffffff827bd420 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process khelper (pid: 12, threadinfo ffff88003f924000, task
      ffff88003f928000)
      Stack:
       0000000000000246 0000802000000246 ffffffff8130b626 0000000000000001
      <0> 0000000078c0129e 0000000000000000 ffff88003f925a70 0000000000000002
      <0> 0000000000000001 0000000000000001 ffff88003f925960 ffffffff8130b626
      Call Trace:
       [<ffffffff8130b626>] ? avc_alloc_node+0x36/0x273
       [<ffffffff8130b626>] avc_alloc_node+0x36/0x273
       [<ffffffff8130b545>] ? avc_latest_notif_update+0x7d/0x9e
       [<ffffffff8130b8b4>] avc_insert+0x51/0x18d
       [<ffffffff8130bcce>] avc_has_perm_noaudit+0x9d/0x128
       [<ffffffff8130bf20>] avc_has_perm+0x45/0x88
       [<ffffffff8130f99d>] current_has_perm+0x52/0x6d
       [<ffffffff8130fbb2>] selinux_task_create+0x2f/0x45
       [<ffffffff81303bf7>] security_task_create+0x29/0x3f
       [<ffffffff8105c6ba>] copy_process+0x82/0xdf0
       [<ffffffff81091578>] ? register_lock_class+0x2f/0x36c
       [<ffffffff81091a13>] ? mark_lock+0x2e/0x1e1
       [<ffffffff8105d596>] do_fork+0x16e/0x382
       [<ffffffff81091578>] ? register_lock_class+0x2f/0x36c
       [<ffffffff810d9166>] ? probe_workqueue_execution+0x57/0xf9
       [<ffffffff81091a13>] ? mark_lock+0x2e/0x1e1
       [<ffffffff810d9166>] ? probe_workqueue_execution+0x57/0xf9
       [<ffffffff8100cdb2>] kernel_thread+0x82/0xe0
       [<ffffffff81078b1f>] ? ____call_usermodehelper+0x0/0x139
       [<ffffffff8100ce10>] ? child_rip+0x0/0x20
       [<ffffffff81078aea>] ? __call_usermodehelper+0x65/0x9a
       [<ffffffff8107a5c7>] run_workqueue+0x171/0x27e
       [<ffffffff8107a573>] ? run_workqueue+0x11d/0x27e
       [<ffffffff81078a85>] ? __call_usermodehelper+0x0/0x9a
       [<ffffffff8107a7bc>] worker_thread+0xe8/0x10f
       [<ffffffff810808e2>] ? autoremove_wake_function+0x0/0x63
       [<ffffffff8107a6d4>] ? worker_thread+0x0/0x10f
       [<ffffffff8108042e>] kthread+0x91/0x99
       [<ffffffff8100ce1a>] child_rip+0xa/0x20
       [<ffffffff8100c754>] ? restore_args+0x0/0x30
       [<ffffffff8108039d>] ? kthread+0x0/0x99
       [<ffffffff8100ce10>] ? child_rip+0x0/0x20
      Code: 0f 85 99 00 00 00 9c 58 66 66 90 66 90 49 89 c4 fa 66 66 90 66 66 90
      e8 83 34 fb ff e8 d7 e9 26 00 48 98 49 8b 94 c6 10 01 00 00 <48> 8b 1a 44
      8b 7a 18 48 85 db 74 0f 8b 42 14 48 8b 04 c3 ff 42
      RIP  [<ffffffff81122537>] kmem_cache_alloc+0x9a/0x185
       RSP <ffff88003f9258b0>
      CR2: 0000000000000000
      ---[ end trace 42f41a982344e606 ]---
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • KEYS: Have the garbage collector set its timer for live expired keys · 606531c3
      Committed by David Howells
      The key garbage collector sets a timer to start a new collection cycle at the
      point the earliest key to expire should be considered garbage.  However, it
      currently only does this if the key it is considering hasn't yet expired.
      
      If the key being considered has expired, but hasn't yet reached the collection
      time, then it is ignored, and won't be collected until some other key provokes a
      round of collection.
      
      Make the garbage collector set the timer for the earliest key that hasn't yet
      passed its collection time, rather than the earliest key that hasn't yet
      expired.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
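The timer rule from the commit above can be sketched in a few lines of userspace C. This is an illustrative model under stated assumptions, not the kernel implementation: keys are reduced to an array of expiry times, an expiry of 0 is taken to mean "never expires", the collection time is assumed to be expiry plus a fixed `gc_delay`, and `next_gc_time()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Return the earliest collection time that is still in the future,
 * or 0 if no key needs a timer.
 *
 * A key is considered even if it has already expired, as long as its
 * collection time (expiry + gc_delay) has not yet passed.
 */
static time_t next_gc_time(const time_t *expiry, size_t n,
                           time_t now, time_t gc_delay)
{
    time_t next = 0;

    assert(gc_delay >= 0);

    for (size_t i = 0; i < n; i++) {
        time_t collect_at;

        if (expiry[i] == 0)
            continue;               /* assumed: 0 means no expiry */

        collect_at = expiry[i] + gc_delay;
        if (collect_at <= now)
            continue;               /* already collectable, no timer needed */

        if (next == 0 || collect_at < next)
            next = collect_at;
    }
    return next;
}
```

For example, with expiry times 90 and 500, `now` at 100 and a delay of 300, the first key has expired but is not yet collectable, so the timer is set for 390. Filtering on "not yet expired" instead would skip that key and pick 800.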
  6. 23 Sep, 2009 2 commits
  7. 15 Sep, 2009 2 commits
    • KEYS: Fix garbage collector · c08ef808
      Committed by David Howells
      Fix a number of problems with the new key garbage collector:
      
       (1) A rogue semicolon in keyring_gc() was causing the initial count of dead
           keys to be miscalculated.
      
       (2) A missing return in keyring_gc() meant that under certain circumstances,
           the keyring semaphore would be unlocked twice.
      
       (3) The key serial tree iterator (key_garbage_collector()) part of the garbage
           collector has been modified to:
      
           (a) Complete each scan of the keyrings before setting the new timer.
      
           (b) Only set the new timer for keys that have yet to expire.  This means
               that the new timer is now calculated correctly, and the gc doesn't
               get into a loop continually scanning for keys that have expired, and
               preventing other things from happening, like RCU cleaning up the old
               keyring contents.
      
           (c) Perform an extra scan if any keys were garbage collected in this one
               as a key might become garbage during a scan, and (b) could mean we
               don't set the timer again.
      
       (4) Made key_schedule_gc() take the time at which to do a collection run,
           rather than the time at which the key expires.  This means the collection
           of dead keys (key type unregistered) can happen immediately.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • KEYS: Unlock tasklist when exiting early from keyctl_session_to_parent · 5c84342a
      Committed by Marc Dionne
      When we exit early from keyctl_session_to_parent because of permissions or
      because the session keyring is the same as the parent, we need to unlock the
      tasklist.
      
      The missing unlock causes the system to hang completely when using
      keyctl(KEYCTL_SESSION_TO_PARENT) with a keyring shared with the parent.
      Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
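The class of bug fixed by the commit above, an early return that skips the unlock, is commonly avoided by routing every exit through one unlock label. The sketch below shows that pattern in plain C. It is not the kernel function: `struct toy_lock`, `toy_tasklist_lock` and `toy_session_to_parent()` are hypothetical, and the return values are assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Toy lock: acquiring it twice without a release models the reported hang. */
struct toy_lock {
    int held;
};

static struct toy_lock toy_tasklist_lock;

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/*
 * Hypothetical stand-in for keyctl_session_to_parent(): the permission
 * failure and the "same keyring as the parent" shortcut both leave
 * through out_unlock, so the lock is always released.
 */
static int toy_session_to_parent(int permitted, int same_keyring)
{
    int ret;

    toy_lock_acquire(&toy_tasklist_lock);

    ret = -EPERM;
    if (!permitted)
        goto out_unlock;

    ret = 0;
    if (same_keyring)
        goto out_unlock;        /* nothing to replace */

    /* ... the actual keyring replacement would be prepared here ... */

out_unlock:
    toy_lock_release(&toy_tasklist_lock);
    return ret;
}
```

Because the toy lock asserts on a double acquire, a missing release on either early path would trip on the very next call, which mirrors how the original bug showed up as a complete hang.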
  8. 14 Sep, 2009 3 commits
  9. 10 Sep, 2009 2 commits
    • sysfs: Add labeling support for sysfs · ddd29ec6
      Committed by David P. Quigley
      This patch adds a setxattr handler to the file, directory, and symlink
      inode_operations structures for sysfs. The patch uses hooks introduced in the
      previous patch to handle the getting and setting of security information for
      the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
      sysfs_dirent structure has been replaced by a structure which contains the
      iattr, secdata and secdata length to allow the changes to persist in the event
      that the inode representing the sysfs_dirent is evicted. Because sysfs only
      stores this information when a change is made all the optional data is moved
      into one dynamically allocated field.
      
      This patch addresses an issue where SELinux was denying virtd access to the PCI
      configuration entries in sysfs. The lack of setxattr handlers for sysfs
      required that a single label be assigned to all entries in sysfs. Granting virtd
      access to every entry in sysfs is not an acceptable solution so fine grained
      labeling of sysfs is required such that individual entries can be labeled
      appropriately.
      
      [sds:  Fixed compile-time warnings, coding style, and setting of inode security init flags.]
      Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
      Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
      Signed-off-by: James Morris <jmorris@namei.org>
    • LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information. · 1ee65e37
      Committed by David P. Quigley
      This patch introduces three new hooks. The inode_getsecctx hook is used to get
      all relevant information from an LSM about an inode. The inode_setsecctx hook is
      used to set both the in-core and on-disk state for the inode based on a context
      derived from inode_getsecctx. The final hook, inode_notifysecctx, will notify the
      LSM of a change for the in-core state of the inode in question. These hooks are
      for use in the labeled NFS code and address concerns of how to set security
      on an inode in a multi-xattr LSM. For historical reasons Stephen Smalley's
      explanation of the reason for these hooks is pasted below.
      
      Quote Stephen Smalley
      
      inode_setsecctx:  Change the security context of an inode.  Updates the
      in core security context managed by the security module and invokes the
      fs code as needed (via __vfs_setxattr_noperm) to update any backing
      xattrs that represent the context.  Example usage:  NFS server invokes
      this hook to change the security context in its incore inode and on the
      backing file system to a value provided by the client on a SETATTR
      operation.
      
      inode_notifysecctx:  Notify the security module of what the security
      context of an inode should be.  Initializes the incore security context
      managed by the security module for this inode.  Example usage:  NFS
      client invokes this hook to initialize the security context in its
      incore inode to the value provided by the server for the file when the
      server returned the file's attributes to the client.
      Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  10. 07 Sep, 2009 1 commit
  11. 02 Sep, 2009 9 commits
    • KEYS: Add a keyctl to install a process's session keyring on its parent [try #6] · ee18d64c
      Committed by David Howells
      Add a keyctl to install a process's session keyring onto its parent.  This
      replaces the parent's session keyring.  Because the COW credential code does
      not permit one process to change another process's credentials directly, the
      change is deferred until userspace next starts executing again.  Normally this
      will be after a wait*() syscall.
      
      To support this, three new security hooks have been provided:
      cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
      the blank security creds and key_session_to_parent() - which asks the LSM if
      the process may replace its parent's session keyring.
      
      The replacement may only happen if the process has the same ownership details
      as its parent, and the process has LINK permission on the session keyring, and
      the session keyring is owned by the process, and the LSM permits it.
      
      Note that this requires alteration to each architecture's notify_resume path.
      This has been done for all arches barring blackfin, m68k* and xtensa, all of
      which need assembly alteration to support TIF_NOTIFY_RESUME.  This allows the
      replacement to be performed at the point the parent process resumes userspace
      execution.
      
      This allows the userspace AFS pioctl emulation to fully emulate newpag() and
      the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
      alter the parent process's PAG membership.  However, since kAFS doesn't use
      PAGs per se, but rather dumps the keys into the session keyring, the session
      keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
      the newpag flag.
      
      This can be tested with the following program:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <keyutils.h>
      
      	#define KEYCTL_SESSION_TO_PARENT	18
      
      	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)
      
      	int main(int argc, char **argv)
      	{
      		key_serial_t keyring, key;
      		long ret;
      
      		keyring = keyctl_join_session_keyring(argv[1]);
      		OSERROR(keyring, "keyctl_join_session_keyring");
      
      		key = add_key("user", "a", "b", 1, keyring);
      		OSERROR(key, "add_key");
      
      		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
      		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");
      
      		return 0;
      	}
      
      Compiled and linked with -lkeyutils, you should see something like:
      
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: _ses
      	355907932 --alswrv   4043    -1   \_ keyring: _uid.4043
      	[dhowells@andromeda ~]$ /tmp/newpag
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: _ses
      	1055658746 --alswrv   4043  4043   \_ user: a
      	[dhowells@andromeda ~]$ /tmp/newpag hello
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: hello
      	340417692 --alswrv   4043  4043   \_ user: a
      
      Where the test program creates a new session keyring, sticks a user key named
      'a' into it and then installs it on its parent.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • KEYS: Do some whitespace cleanups [try #6] · 7b1b9164
      Committed by David Howells
      Do some whitespace cleanups in the key management code.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • KEYS: Make /proc/keys use keyid not numread as file position [try #6] · ad73a717
      Committed by Serge E. Hallyn
      Make the file position maintained by /proc/keys represent the ID of the key
      just read rather than the number of keys read.  This should make it faster to
      perform a lookup as we don't have to scan the key ID tree from the beginning to
      find the current position.
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
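Why a key ID makes a cheaper file position than a count can be seen in a small model. The sketch below assumes the keys are available as a sorted array of IDs, which is only a stand-in for the kernel's key ID tree, and `resume_after_id()` is a hypothetical name. Given the ID of the key just read, the next entry is found by a logarithmic search instead of by walking past that many entries from the start.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first ID strictly greater than last_id in the
 * sorted array ids[0..n), or n if there is none.
 *
 * This is a plain upper-bound binary search. It also behaves sensibly when
 * the key with last_id has been removed since the previous read, because it
 * does not need to find last_id itself.
 */
static size_t resume_after_id(const int *ids, size_t n, int last_id)
{
    size_t lo = 0, hi = n;

    assert(ids != NULL || n == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (ids[mid] <= last_id)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
```

A count-based position has no such shortcut: resuming at "entry number k" means stepping over k entries every time the read continues.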
    • KEYS: Add garbage collection for dead, revoked and expired keys. [try #6] · 5d135440
      Committed by David Howells
      Add garbage collection for dead, revoked and expired keys.  This involved
      erasing all links to such keys from keyrings that point to them.  At that
      point, the key will be deleted in the normal manner.
      
      Keyrings from which garbage collection occurs are shrunk and their quota
      consumption reduced as appropriate.
      
      Dead keys (for which the key type has been removed) will be garbage collected
      immediately.
      
      Revoked and expired keys will hang around for a number of seconds, as set in
      /proc/sys/kernel/keys/gc_delay before being automatically removed.  The default
      is 5 minutes.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • KEYS: Flag dead keys to induce EKEYREVOKED [try #6] · f041ae2f
      Committed by David Howells
      Set the KEY_FLAG_DEAD flag on keys for which the type has been removed.  This
      causes the key_permission() function to return EKEYREVOKED in response to
      various commands.  It does not, however, prevent unlinking or clearing of
      keyrings from detaching the key.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • KEYS: Allow keyctl_revoke() on keys that have SETATTR but not WRITE perm [try #6] · 0c2c9a3f
      Committed by David Howells
      Allow keyctl_revoke() to operate on keys that have SETATTR but not WRITE
      permission, rather than only on keys that have WRITE permission.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • KEYS: Deal with dead-type keys appropriately [try #6] · 5593122e
      Committed by David Howells
      Allow keys for which the key type has been removed to be unlinked.  Currently
      dead-type keys can only be disposed of by completely clearing the keyrings
      that point to them.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • CRED: Add some configurable debugging [try #6] · e0e81739
      Committed by David Howells
      Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
      for credential management.  The additional code keeps track of the number of
      pointers from task_structs to any given cred struct, and checks to see that
      this number never exceeds the usage count of the cred struct (which includes
      all references, not just those from task_structs).
      
      Furthermore, if SELinux is enabled, the code also checks that the security
      pointer in the cred struct is never seen to be invalid.
      
      This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
      kernel thread on seeing cred->security be a NULL pointer (it appears that the
      credential struct has been previously released):
      
      	http://www.kerneloops.org/oops.php?number=252883
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
    • x86, intel_txt: clean up the impact on generic code, unbreak non-x86 · 69575d38
      Committed by Shane Wang
      Move tboot.h from asm to linux to fix the build errors of intel_txt
      patch on non-X86 platforms. Remove the tboot code from generic code
      init/main.c and kernel/cpu.c.
      Signed-off-by: Shane Wang <shane.wang@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  12. 01 Sep, 2009 2 commits
    • selinux: Support for the new TUN LSM hooks · ed6d76e4
      Committed by Paul Moore
      Add support for the new TUN LSM hooks: security_tun_dev_create(),
      security_tun_dev_post_create() and security_tun_dev_attach().  This includes
      the addition of a new object class, tun_socket, which represents the sockets
      associated with TUN devices.  The _tun_dev_create() and _tun_dev_post_create()
      hooks are fairly similar to the standard socket functions but _tun_dev_attach()
      is a bit special.  The _tun_dev_attach() is unique because it involves a
      domain attaching to an existing TUN device and its associated tun_socket
      object, an operation which does not exist with standard sockets and most
      closely resembles a relabel operation.
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Acked-by: Eric Paris <eparis@parisplace.org>
      Signed-off-by: James Morris <jmorris@namei.org>
    • lsm: Add hooks to the TUN driver · 2b980dbd
      Committed by Paul Moore
      The TUN driver lacks any LSM hooks which makes it difficult for LSM modules,
      such as SELinux, to enforce access controls on network traffic generated by
      TUN users; this is particularly problematic for virtualization apps such as
      QEMU and KVM.  This patch adds three new LSM hooks designed to control the
      creation and attachment of TUN devices, the hooks are:
      
       * security_tun_dev_create()
         Provides access control for the creation of new TUN devices
      
       * security_tun_dev_post_create()
         Provides the ability to create the necessary socket LSM state for newly
         created TUN devices
      
       * security_tun_dev_attach()
         Provides access control for attaching to existing, persistent TUN devices
         and the ability to update the TUN device's socket LSM state as necessary
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Acked-by: Eric Paris <eparis@parisplace.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: James Morris <jmorris@namei.org>
      2b980dbd
  13. 27 Aug, 2009 1 commit
    • E
      IMA: iint put in ima_counts_get and put · 53a7197a
      Eric Paris committed
      ima_counts_get() calls ima_iint_find_insert_get() which takes a reference
      to the iint in question, but does not put that reference at the end of the
      function.  This can lead to a nasty memory leak.  Easy enough to reproduce:
      
      #include <sys/mman.h>
      #include <stdio.h>
      
      int main (void)
      {
      	int i;
      	void *ptr;
      
      	for (i=0; i < 100000; i++) {
      		ptr = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
      			   MAP_SHARED|MAP_ANONYMOUS, -1, 0);
      		if (ptr == MAP_FAILED)
      			return 2;
      		munmap(ptr, 4096);
      	}
      
      	return 0;
      }
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      53a7197a
  14. 24 Aug, 2009 1 commit
  15. 21 Aug, 2009 1 commit
  16. 19 Aug, 2009 2 commits
  17. 17 Aug, 2009 4 commits
    • E
      Security/SELinux: seperate lsm specific mmap_min_addr · 788084ab
      Eric Paris committed
      Currently SELinux enforcement of controls on the ability to map low memory
      is determined by the mmap_min_addr tunable.  This patch causes SELinux to
      ignore the tunable and instead use a separate Kconfig option specific to how
      much space the LSM should protect.
      
      The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
      permissions will always protect the amount of low memory designated by
      CONFIG_LSM_MMAP_MIN_ADDR.
      
      This allows users who need to disable the mmap_min_addr controls (usual reason
      being they run WINE as a non-root user) to do so and still have SELinux
      controls preventing confined domains (like a web server) from being able to
      map some area of low memory.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      788084ab
    • E
      SELinux: call cap_file_mmap in selinux_file_mmap · 8cf948e7
      Eric Paris committed
      Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook.  This
      means there is no DAC check on the ability to mmap low addresses in the
      memory space.  This function adds the DAC check for CAP_SYS_RAWIO while
      maintaining the selinux check on mmap_zero.  This means that processes
      which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
      NOT need the SELinux sys_rawio capability.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      8cf948e7
    • E
      Capabilities: move cap_file_mmap to commoncap.c · 9c0d9010
      Eric Paris committed
      Currently we duplicate the mmap_min_addr test in cap_file_mmap and in
      security_file_mmap if !CONFIG_SECURITY.  This patch moves cap_file_mmap
      into commoncap.c and then calls that function directly from
      security_file_mmap when CONFIG_SECURITY is not defined, as is already
      done for all of the other capability checks.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
      9c0d9010
    • T
      SELinux: Convert avc_audit to use lsm_audit.h · 2bf49690
      Thomas Liu committed
      Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
      for better maintainability.
      
       - changed selinux to use common_audit_data instead of
          avc_audit_data
       - eliminated code in avc.c and used code from lsm_audit.h instead.
      
      Had to add a LSM_AUDIT_NO_AUDIT to lsm_audit.h so that avc_audit
      can call common_lsm_audit and do the pre and post callbacks without
      doing the actual dump.  This makes it so that the patched version
      behaves the same way as the unpatched version.
      
      Also added a denied field to the selinux_audit_data private space,
      once again to make it so that the patched version behaves like the
      unpatched.
      
      I've tested and confirmed that AVCs look the same before and after
      this patch.
      Signed-off-by: Thomas Liu <tliu@redhat.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: James Morris <jmorris@namei.org>
      2bf49690