1. 07 Jan, 2011 1 commit
    • fs: rcu-walk for path lookup · 31e6b01f
      Nick Piggin committed
      Perform common cases of path lookups without any stores or locking in the
      ancestor dentry elements. This is called rcu-walk, as opposed to the current
      algorithm which is a refcount based walk, or ref-walk.
      
      This results in far fewer atomic operations on every path element,
      significantly improving path lookup performance. It also avoids cacheline
      bouncing on common dentries, significantly improving scalability.
      
      The overall design is like this:
      * LOOKUP_RCU is set in nd->flags, which distinguishes rcu-walk from ref-walk.
      * Take the RCU lock for the entire path walk, starting with the acquisition
        of the starting path (e.g. root/cwd/fd-path). So now dentry refcounts are
        not required for dentry persistence.
      * synchronize_rcu is called when unregistering a filesystem, so we can
        access d_ops and i_ops during rcu-walk.
      * Similarly take the vfsmount lock for the entire path walk. So now mnt
        refcounts are not required for persistence. Also we are free to perform mount
        lookups, and to assume dentry mount points and mount roots are stable up and
        down the path.
      * Have a per-dentry seqlock to protect the dentry name, parent, and inode,
        so we can load this tuple atomically, and also check whether any of its
        members have changed.
      * Dentry lookups (based on parent, candidate string tuple) recheck the parent
        sequence after the child is found in case anything changed in the parent
        during the path walk.
      * inode is also RCU protected so we can load d_inode and use the inode for
        limited things.
      * i_mode, i_uid, i_gid can be tested for exec permissions during path walk.
      * i_op can be loaded.
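
The per-dentry sequence check in the design list above can be sketched in userspace C. This is only an illustrative model, not the kernel code: `toy_dentry`, `toy_snapshot` and every `toy_*` function are hypothetical names, and the plain field accesses are only safe because the demonstration is single-threaded (a concurrent version would need atomic or otherwise race-free field accesses).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct toy_dentry {
    atomic_uint seq;              /* even: stable, odd: writer in progress */
    const char *name;
    struct toy_dentry *parent;
    int inode_nr;
};

/* The (name, parent, inode) tuple that a reader wants to load as a unit. */
struct toy_snapshot {
    const char *name;
    struct toy_dentry *parent;
    int inode_nr;
};

/* Writer: make the count odd, update the fields, make it even again. */
void toy_rename(struct toy_dentry *d, const char *name,
                struct toy_dentry *parent)
{
    atomic_fetch_add(&d->seq, 1);
    d->name = name;
    d->parent = parent;
    atomic_fetch_add(&d->seq, 1);
}

/* Reader: wait until no writer is active and remember the count. */
unsigned toy_read_begin(struct toy_dentry *d)
{
    unsigned seq;

    do {
        seq = atomic_load(&d->seq);
    } while (seq & 1);
    return seq;
}

/* Reader: true if the dentry changed since toy_read_begin() returned seq. */
bool toy_read_retry(struct toy_dentry *d, unsigned seq)
{
    return atomic_load(&d->seq) != seq;
}

/* Load the tuple without taking any lock, retrying if a writer interfered. */
struct toy_snapshot toy_load(struct toy_dentry *d, unsigned *seqp)
{
    struct toy_snapshot snap;
    unsigned seq;

    assert(d != NULL && seqp != NULL);
    do {
        seq = toy_read_begin(d);
        snap.name = d->name;
        snap.parent = d->parent;
        snap.inode_nr = d->inode_nr;
    } while (toy_read_retry(d, seq));
    *seqp = seq;
    return snap;
}
```

A caller keeps the sequence value returned through `seqp` and passes it to `toy_read_retry()` later, which mirrors the idea of rechecking a parent's sequence after the child has been found.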
      
      When we reach the destination dentry, we lock it, recheck the lookup sequence,
      and increment its refcount and mountpoint refcount. RCU and vfsmount locks
      are then dropped. This is termed "dropping rcu-walk". If the dentry refcount
      does not match, we cannot drop rcu-walk gracefully at the current point in the
      lookup, so we instead return -ECHILD (for want of a better errno). This signals
      the path walking code to redo the entire lookup with a ref-walk.
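
The -ECHILD fallback can be pictured as a retry wrapper around the walker. The following is a minimal sketch under stated assumptions: `toy_path_walk`, `toy_lookup`, `toy_walk_stats` and the `TOY_LOOKUP_RCU` value are hypothetical, and the `raced` parameter simply simulates a failed recheck rather than detecting a real one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag; shares only its name with the kernel's LOOKUP_RCU. */
#define TOY_LOOKUP_RCU 0x1u

/* Counts how often each mode was attempted, for illustration only. */
struct toy_walk_stats {
    int rcu_attempts;
    int ref_attempts;
};

/*
 * Simulated walker. In rcu mode it fails with -ECHILD when 'raced' is
 * true, standing in for a recheck that did not match. In ref mode it
 * always succeeds in this model.
 */
int toy_path_walk(unsigned flags, bool raced, struct toy_walk_stats *st)
{
    assert(st != NULL);
    if (flags & TOY_LOOKUP_RCU) {
        st->rcu_attempts++;
        return raced ? -ECHILD : 0;
    }
    st->ref_attempts++;
    return 0;
}

/* Try the lockless walk first; redo the whole lookup on -ECHILD. */
int toy_lookup(bool raced, struct toy_walk_stats *st)
{
    int err = toy_path_walk(TOY_LOOKUP_RCU, raced, st);

    if (err == -ECHILD)
        err = toy_path_walk(0, raced, st);
    return err;
}
```

In the common case only the rcu attempt runs; the second, refcount-based attempt is the cost paid when the lockless walk cannot be completed.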
      
      Aside from the final dentry, there are other situations that may be encountered
      where we cannot continue rcu-walk. In that case, we drop rcu-walk (i.e. take
      a reference on the last good dentry) and continue with a ref-walk. Again, if
      we cannot drop rcu-walk gracefully, we return -ECHILD and do the whole lookup
      using ref-walk. But it is very important that we can continue with ref-walk
      in most cases, particularly to avoid the overhead of double lookups, and to
      gain the scalability advantages on common path elements (like cwd and root).
      
      The cases where rcu-walk cannot continue are:
      * NULL dentry (ie. any uncached path element)
      * parent with d_inode->i_op->permission or ACLs
      * dentries with d_revalidate
      * Following links
      
      In future patches, permission checks and d_revalidate become rcu-walk aware. It
      may be possible eventually to make following links rcu-walk aware.
      
      Uncached path elements will always require dropping to ref-walk mode, at the
      very least because i_mutex needs to be grabbed, and objects allocated.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
  2. 06 Jan, 2011 1 commit
  3. 16 Nov, 2010 1 commit
  4. 27 Oct, 2010 1 commit
  5. 21 Oct, 2010 3 commits
  6. 02 Aug, 2010 1 commit
  7. 28 Jul, 2010 1 commit
  8. 16 Jul, 2010 1 commit
  9. 17 May, 2010 1 commit
  10. 12 Apr, 2010 13 commits
  11. 03 Mar, 2010 1 commit
    • Security: Add __init to register_security to disable load a security module on runtime · c1e992b9
      wzt.wzt@gmail.com committed
      The LSM framework does not allow a security module to be loaded at runtime; it must be loaded at boot time.
      However, in security/security.c:
      int register_security(struct security_operations *ops)
      {
              ...
              if (security_ops != &default_security_ops)
                      return -EAGAIN;
              ...
      }
      If security_ops == &default_security_ops, register_security() accepts a new security module. If SELinux is
      enabled, other security modules cannot register. But if SELinux is disabled at boot time, security_ops is set to
      default_security_ops, and LSM then allows any kernel module to call register_security() and register an
      untrusted security module. For example:

      Disable SELinux at boot time (selinux=0), then load the following module:
      
      #include <linux/kernel.h>
      #include <linux/module.h>
      #include <linux/init.h>
      #include <linux/version.h>
      #include <linux/string.h>
      #include <linux/list.h>
      #include <linux/security.h>
      
      MODULE_LICENSE("GPL");
      MODULE_AUTHOR("wzt");
      
      extern int register_security(struct security_operations *ops);
      int (*new_register_security)(struct security_operations *ops);
      
      int rootkit_bprm_check_security(struct linux_binprm *bprm)
      {
              return 0;
      }
      
      struct security_operations rootkit_ops = {
                      .bprm_check_security = rootkit_bprm_check_security,
      };
      
      static int rootkit_init(void)
      {
              printk("Load LSM rootkit module.\n");
      
              /* cat /proc/kallsyms | grep register_security */
              new_register_security = 0xc0756689;
              if (new_register_security(&rootkit_ops)) {
                      printk("Can't register rootkit module.\n");
                      return 0;
              }
              printk("Register rootkit module ok.\n");
      
              return 0;
      }
      
      static void rootkit_exit(void)
      {
              printk("Unload LSM rootkit module.\n");
      }
      
      module_init(rootkit_init);
      module_exit(rootkit_exit);
      Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  12. 24 Feb, 2010 1 commit
    • Security: add static to security_ops and default_security_ops variable · 189b3b1c
      wzt.wzt@gmail.com committed
      Enhance the security framework to support resetting the active security
      module. This eliminates the need for direct use of the security_ops and
      default_security_ops variables outside of security.c, so make security_ops
      and default_security_ops static. Also remove the secondary_ops variable as
      a cleanup since there is no use for that. secondary_ops was originally used by
      SELinux to call the "secondary" security module (capability or dummy),
      but that was replaced by direct calls to capability and the only
      remaining use is to save and restore the original security ops pointer
      value if SELinux is disabled by early userspace based on /etc/selinux/config.
      Further, if we support this directly in the security framework, then we can
      just use &default_security_ops for this purpose since that is now available.
      Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: James Morris <jmorris@namei.org>
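
The idea of keeping the active and default operations pointers file-local, with a reset helper in place of direct variable access, might look roughly like the following userspace sketch. All `toy_*` names are hypothetical and the struct is reduced to a name field; the real `struct security_operations` carries the hook function pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct security_operations. */
struct toy_security_ops {
    const char *name;
};

/* Both pointers are file-local, so other files cannot touch them directly. */
static struct toy_security_ops toy_default_ops = { .name = "default" };
static struct toy_security_ops *toy_active_ops = &toy_default_ops;

/* Registration succeeds only while the default operations are active. */
int toy_register_security(struct toy_security_ops *ops)
{
    assert(ops != NULL);
    if (toy_active_ops != &toy_default_ops)
        return -EAGAIN;
    toy_active_ops = ops;
    return 0;
}

/* Reset helper: callers no longer need to save and restore a pointer. */
void toy_reset_security_ops(void)
{
    toy_active_ops = &toy_default_ops;
}

/* Query helper used by the example below. */
bool toy_is_default(void)
{
    return toy_active_ops == &toy_default_ops;
}
```

With this shape, a module that disables itself only calls the reset helper; it does not need its own saved copy of the original pointer, which is the role the commit message describes for secondary_ops.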
  13. 07 Feb, 2010 1 commit
  14. 04 Feb, 2010 1 commit
  15. 15 Jan, 2010 1 commit
  16. 08 Dec, 2009 1 commit
  17. 10 Nov, 2009 1 commit
    • security: report the module name to security_module_request · dd8dbf2e
      Eric Paris committed
      For SELinux to do better filtering in userspace we send the name of the
      module along with the AVC denial when a program is denied module_request.
      
      Example output:
      
      type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null)
      type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc:  denied  { module_request } for  pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
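
As an illustration of the userspace filtering this enables, a tool could pull the module name out of the `kmod="..."` field of an AVC record. The helper below is a hypothetical sketch (`toy_extract_kmod` is not part of any audit library) and assumes the field is formatted exactly as in the example output above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of the kmod="..." field from an AVC line into 'out'.
 * Returns 0 on success, or -1 if the field is missing, unterminated,
 * or does not fit into 'outlen' bytes including the trailing NUL.
 */
int toy_extract_kmod(const char *line, char *out, size_t outlen)
{
    static const char key[] = "kmod=\"";
    const char *start;
    const char *end;
    size_t len;

    assert(line != NULL && out != NULL);

    start = strstr(line, key);
    if (start == NULL)
        return -1;
    start += sizeof(key) - 1;       /* skip past kmod=" */

    end = strchr(start, '"');       /* closing quote of the value */
    if (end == NULL)
        return -1;

    len = (size_t)(end - start);
    if (len + 1 > outlen)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

Applied to the AVC line shown in the example output, the helper would yield `net-pf-10`, which a policy tool could then match against an allow list.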
  18. 09 Nov, 2009 1 commit
  19. 25 Oct, 2009 1 commit
  20. 12 Oct, 2009 2 commits
  21. 10 Sep, 2009 1 commit
    • LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information. · 1ee65e37
      David P. Quigley committed
      This patch introduces three new hooks. The inode_getsecctx hook is used to get
      all relevant information from an LSM about an inode. The inode_setsecctx hook is
      used to set both the in-core and on-disk state for the inode based on a context
      derived from inode_getsecctx. The final hook, inode_notifysecctx, notifies the
      LSM of a change to the in-core state of the inode in question. These hooks are
      for use in the labeled NFS code and address concerns about how to set security
      on an inode in a multi-xattr LSM. For historical reasons, Stephen Smalley's
      explanation of the reason for these hooks is pasted below.
      
      Quote Stephen Smalley
      
      inode_setsecctx:  Change the security context of an inode.  Updates the
      in core security context managed by the security module and invokes the
      fs code as needed (via __vfs_setxattr_noperm) to update any backing
      xattrs that represent the context.  Example usage:  NFS server invokes
      this hook to change the security context in its incore inode and on the
      backing file system to a value provided by the client on a SETATTR
      operation.
      
      inode_notifysecctx:  Notify the security module of what the security
      context of an inode should be.  Initializes the incore security context
      managed by the security module for this inode.  Example usage:  NFS
      client invokes this hook to initialize the security context in its
      incore inode to the value provided by the server for the file when the
      server returned the file's attributes to the client.
      Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
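
The difference between the three hooks comes down to which copy of the context each one touches. The toy model below is only a sketch of that distinction: `toy_inode`, its two buffers and the `toy_*` functions are hypothetical, and the real hooks take kernel inode or dentry pointers plus a context buffer and length rather than C strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CTX_MAX 64

/* Hypothetical inode model with separate in-core and on-disk contexts. */
struct toy_inode {
    char incore_ctx[TOY_CTX_MAX];    /* state held by the security module */
    char ondisk_xattr[TOY_CTX_MAX];  /* state stored in the backing xattr */
};

/* notify: initialize only the in-core context (NFS client case). */
int toy_inode_notifysecctx(struct toy_inode *inode, const char *ctx)
{
    size_t len;

    assert(inode != NULL && ctx != NULL);
    len = strlen(ctx);
    if (len >= TOY_CTX_MAX)
        return -1;
    memcpy(inode->incore_ctx, ctx, len + 1);
    return 0;
}

/* set: update the in-core context and the backing xattr (NFS server case). */
int toy_inode_setsecctx(struct toy_inode *inode, const char *ctx)
{
    size_t len;

    assert(inode != NULL && ctx != NULL);
    len = strlen(ctx);
    if (len >= TOY_CTX_MAX)
        return -1;
    memcpy(inode->incore_ctx, ctx, len + 1);
    memcpy(inode->ondisk_xattr, ctx, len + 1);
    return 0;
}

/* get: report the context the security module currently holds. */
const char *toy_inode_getsecctx(const struct toy_inode *inode)
{
    assert(inode != NULL);
    return inode->incore_ctx;
}
```

In this model a client-side notify leaves the on-disk copy untouched, while a server-side set keeps both copies in step, matching the two example usages quoted above.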
  22. 02 Sep, 2009 1 commit
    • KEYS: Add a keyctl to install a process's session keyring on its parent [try #6] · ee18d64c
      David Howells committed
      Add a keyctl to install a process's session keyring onto its parent.  This
      replaces the parent's session keyring.  Because the COW credential code does
      not permit one process to change another process's credentials directly, the
      change is deferred until userspace next starts executing again.  Normally this
      will be after a wait*() syscall.
      
      To support this, three new security hooks have been provided:
      cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
      the blank security creds and key_session_to_parent() - which asks the LSM if
      the process may replace its parent's session keyring.
      
      The replacement may only happen if the process has the same ownership details
      as its parent, and the process has LINK permission on the session keyring, and
      the session keyring is owned by the process, and the LSM permits it.
      
      Note that this requires alteration to each architecture's notify_resume path.
      This has been done for all arches barring blackfin, m68k* and xtensa, all of
      which need assembly alteration to support TIF_NOTIFY_RESUME.  This allows the
      replacement to be performed at the point the parent process resumes userspace
      execution.
      
      This allows the userspace AFS pioctl emulation to fully emulate newpag() and
      the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
      alter the parent process's PAG membership.  However, since kAFS doesn't use
      PAGs per se, but rather dumps the keys into the session keyring, the session
      keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
      the newpag flag.
      
      This can be tested with the following program:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <keyutils.h>
      
      	#define KEYCTL_SESSION_TO_PARENT	18
      
      	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)
      
      	int main(int argc, char **argv)
      	{
      		key_serial_t keyring, key;
      		long ret;
      
      		keyring = keyctl_join_session_keyring(argv[1]);
      		OSERROR(keyring, "keyctl_join_session_keyring");
      
      		key = add_key("user", "a", "b", 1, keyring);
      		OSERROR(key, "add_key");
      
      		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
      		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");
      
      		return 0;
      	}
      
      Compiled and linked with -lkeyutils, you should see something like:
      
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: _ses
      	355907932 --alswrv   4043    -1   \_ keyring: _uid.4043
      	[dhowells@andromeda ~]$ /tmp/newpag
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: _ses
      	1055658746 --alswrv   4043  4043   \_ user: a
      	[dhowells@andromeda ~]$ /tmp/newpag hello
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: hello
      	340417692 --alswrv   4043  4043   \_ user: a
      
      Where the test program creates a new session keyring, sticks a user key named
      'a' into it and then installs it on its parent.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  23. 01 Sep, 2009 1 commit
    • lsm: Add hooks to the TUN driver · 2b980dbd
      Paul Moore committed
      The TUN driver lacks any LSM hooks, which makes it difficult for LSM modules,
      such as SELinux, to enforce access controls on network traffic generated by
      TUN users; this is particularly problematic for virtualization apps such as
      QEMU and KVM.  This patch adds three new LSM hooks designed to control the
      creation and attachment of TUN devices. The hooks are:
      
       * security_tun_dev_create()
         Provides access control for the creation of new TUN devices
      
       * security_tun_dev_post_create()
         Provides the ability to create the necessary socket LSM state for newly
         created TUN devices
      
       * security_tun_dev_attach()
         Provides access control for attaching to existing, persistent TUN devices
         and the ability to update the TUN device's socket LSM state as necessary
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Acked-by: Eric Paris <eparis@parisplace.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: James Morris <jmorris@namei.org>
  24. 14 Aug, 2009 1 commit
  25. 24 Jun, 2009 1 commit