1. 07 Jan, 2014 1 commit
    • T
      SELinux: Fix memory leak upon loading policy · 8ed81460
      Tetsuo Handa committed
      Hello.
      
      I got the leak below with linux-3.10.0-54.0.1.el7.x86_64.
      
      [  681.903890] kmemleak: 5538 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      
      Below is a patch, but I don't know whether we need special handling for undoing
      the ebitmap_set_bit() call.
      ----------
      >>From fe97527a90fe95e2239dfbaa7558f0ed559c0992 Mon Sep 17 00:00:00 2001
      From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Date: Mon, 6 Jan 2014 16:30:21 +0900
      Subject: [PATCH] SELinux: Fix memory leak upon loading policy
      
      Commit 2463c26d "SELinux: put name based create rules in a hashtable" did not
      check the return value of hashtab_insert() in filename_trans_read(). It leaks
      memory if hashtab_insert() returns an error.
      
        unreferenced object 0xffff88005c9160d0 (size 8):
          comm "systemd", pid 1, jiffies 4294688674 (age 235.265s)
          hex dump (first 8 bytes):
            57 0b 00 00 6b 6b 6b a5                          W...kkk.
          backtrace:
            [<ffffffff816604ae>] kmemleak_alloc+0x4e/0xb0
            [<ffffffff811cba5e>] kmem_cache_alloc_trace+0x12e/0x360
            [<ffffffff812aec5d>] policydb_read+0xd1d/0xf70
            [<ffffffff812b345c>] security_load_policy+0x6c/0x500
            [<ffffffff812a623c>] sel_write_load+0xac/0x750
            [<ffffffff811eb680>] vfs_write+0xc0/0x1f0
            [<ffffffff811ec08c>] SyS_write+0x4c/0xa0
            [<ffffffff81690419>] system_call_fastpath+0x16/0x1b
            [<ffffffffffffffff>] 0xffffffffffffffff
      
      However, we should not return an EEXIST error to the caller, or systemd will
      show the message below and the boot sequence will freeze.
      
        systemd[1]: Failed to load SELinux policy. Freezing.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Eric Paris <eparis@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      8ed81460
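The leak pattern described in the commit above can be illustrated with a minimal user-space sketch. This is not the kernel code: `toy_table`, `toy_insert()`, `load_entry()` and `toy_destroy()` are hypothetical names invented for this example, and the only behaviour taken from the commit message is "free what was allocated when the insert fails, and do not report EEXIST to the caller".

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a hash table; not the kernel's hashtab API. */
struct toy_table {
	char *keys[TOY_SLOTS];
	int *vals[TOY_SLOTS];
	int used;
};

/* Takes ownership of key and val only on success (return value 0). */
static int toy_insert(struct toy_table *t, char *key, int *val)
{
	for (int i = 0; i < t->used; i++)
		if (strcmp(t->keys[i], key) == 0)
			return -EEXIST;
	if (t->used == TOY_SLOTS)
		return -ENOMEM;
	t->keys[t->used] = key;
	t->vals[t->used] = val;
	t->used++;
	return 0;
}

/* Insert a copy of name/value; free the copies if the insert fails. */
static int load_entry(struct toy_table *t, const char *name, int value)
{
	size_t len;
	char *key;
	int *val;
	int rc;

	assert(t != NULL && name != NULL);
	len = strlen(name) + 1;
	key = malloc(len);
	val = malloc(sizeof(*val));
	if (!key || !val) {
		free(key);
		free(val);
		return -ENOMEM;
	}
	memcpy(key, name, len);
	*val = value;

	rc = toy_insert(t, key, val);
	if (rc) {
		/* The table did not take ownership, so release both copies. */
		free(key);
		free(val);
		/* A duplicate is tolerated here instead of failing the load. */
		if (rc == -EEXIST)
			rc = 0;
	}
	return rc;
}

/* Release everything the table owns. */
static void toy_destroy(struct toy_table *t)
{
	for (int i = 0; i < t->used; i++) {
		free(t->keys[i]);
		free(t->vals[i]);
	}
	t->used = 0;
}
```

Calling `load_entry()` twice with the same name leaves one entry in the table and frees the second pair of allocations, which is the behaviour the commit message asks for.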
  2. 17 Dec, 2013 2 commits
  3. 14 Dec, 2013 1 commit
    • P
      selinux: revert 102aefdd · 4d546f81
      Paul Moore committed
      Revert "selinux: consider filesystem subtype in policies"
      
      This reverts commit 102aefdd.
      
      Explanation from Eric Paris:
      
      	SELinux policy can specify if it should use a filesystem's
      	xattrs or not.  In current policy we have a specification that
      	fuse should not use xattrs but fuse.glusterfs should use
      	xattrs.  This patch has a bug in which non-glusterfs
      	filesystems would match the rule saying fuse.glusterfs should
      	use xattrs.  If both fuse and the particular filesystem in
      	question are not written to handle xattr calls during the mount
      	command, they will deadlock.
      
      	I have fixed the bug to do proper matching, however I believe a
      	revert is still the correct solution.  The reason I believe
      	that is because the code still does not work.  The s_subtype is
      	not set until after the SELinux hook which attempts to match on
      	the ".gluster" portion of the rule.  So we cannot match on the
      	rule in question.  The code is useless.
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      4d546f81
  4. 12 Dec, 2013 1 commit
  5. 11 Dec, 2013 1 commit
  6. 10 Dec, 2013 1 commit
  7. 05 Dec, 2013 4 commits
    • P
      selinux: pull address family directly from the request_sock struct · 0b1f24e6
      Paul Moore committed
      We don't need to inspect the packet to determine if the packet is an
      IPv4 packet arriving on an IPv6 socket when we can query the
      request_sock directly.
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      0b1f24e6
    • P
      selinux: ensure that the cached NetLabel secattr matches the desired SID · 050d032b
      Paul Moore committed
      In selinux_netlbl_skbuff_setsid() we leverage a cached NetLabel
      secattr whenever possible.  However, we never check to ensure that
      the desired SID matches the cached NetLabel secattr.  This patch
      checks the SID against the secattr before use and only uses the
      cached secattr when the SID values match.
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      050d032b
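The "only use the cache when the SID matches" rule from the commit above can be sketched in a few lines of user-space C. Everything here is an assumption made for illustration: `struct cached_attr`, `attr_from_sid()` and `attr_for_sid()` are hypothetical and do not correspond to the NetLabel or SELinux data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached attribute, tagged with the SID it was built for. */
struct cached_attr {
	int valid;
	uint32_t sid;
	uint32_t attr;
};

/* Hypothetical derivation of an attribute from a SID (placeholder logic). */
static uint32_t attr_from_sid(uint32_t sid)
{
	return sid * 2u + 1u;
}

/* Use the cached value only when it was built for the requested SID. */
static uint32_t attr_for_sid(const struct cached_attr *cache, uint32_t sid)
{
	assert(cache != NULL);
	if (cache->valid && cache->sid == sid)
		return cache->attr;
	/* Mismatch or empty cache: fall back to a fresh derivation. */
	return attr_from_sid(sid);
}
```

The design point is that the cache entry carries the key it was built from, so a stale entry is detected by comparison rather than trusted blindly.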
    • P
      selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute() · 7f721643
      Paul Moore committed
      In selinux_ip_postroute() we perform access checks based on the
      packet's security label.  For locally generated traffic we get the
      packet's security label from the associated socket; this works in all
      cases except for TCP SYN-ACK packets.  In the case of SYN-ACK packets
      the correct security label is stored in the connection's request_sock,
      not the server's socket.  Unfortunately, at the point in time when
      selinux_ip_postroute() is called we can't query the request_sock
      directly; we need to recreate the label using the same logic that
      originally labeled the associated request_sock.
      
      See the inline comments for more explanation.
      Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
      Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      7f721643
    • P
      selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output() · da2ea0d0
      Paul Moore committed
      In selinux_ip_output() we always label packets based on the parent
      socket.  While this approach works in almost all cases, it doesn't
      work in the case of TCP SYN-ACK packets when the correct label is not
      the label of the parent socket, but rather the label of the larval
      socket represented by the request_sock struct.
      
      Unfortunately, since the request_sock isn't queued on the parent
      socket until *after* the SYN-ACK packet is sent, we can't look up the
      request_sock to determine the correct label for the packet; at this
      point in time the best we can do is simply pass/NF_ACCEPT the packet.
      It must be said that simply passing the packet without any explicit
      labeling action, while far from ideal, is not terrible as the SYN-ACK
      packet will inherit any IP option based labeling from the initial
      connection request so the label *should* be correct and all our
      access controls remain in place so we shouldn't have to worry about
      information leaks.
      Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
      Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      da2ea0d0
  8. 26 Nov, 2013 1 commit
  9. 20 Nov, 2013 2 commits
  10. 05 Oct, 2013 3 commits
  11. 27 Sep, 2013 2 commits
    • P
      selinux: correct locking in selinux_netlbl_socket_connect() · 42d64e1a
      Paul Moore committed
      The SELinux/NetLabel glue code has a locking bug that affects systems
      with NetLabel enabled, see the kernel error message below.  This patch
      corrects this problem by converting the bottom half socket lock to a
      more conventional, and correct for this call-path, lock_sock() call.
      
       ===============================
       [ INFO: suspicious RCU usage. ]
       3.11.0-rc3+ #19 Not tainted
       -------------------------------
       net/ipv4/cipso_ipv4.c:1928 suspicious rcu_dereference_protected() usage!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 1, debug_locks = 0
       2 locks held by ping/731:
        #0:  (slock-AF_INET/1){+.-...}, at: [...] selinux_netlbl_socket_connect
        #1:  (rcu_read_lock){.+.+..}, at: [<...>] netlbl_conn_setattr
      
       stack backtrace:
       CPU: 1 PID: 731 Comm: ping Not tainted 3.11.0-rc3+ #19
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        0000000000000001 ffff88006f659d28 ffffffff81726b6a ffff88003732c500
        ffff88006f659d58 ffffffff810e4457 ffff88006b845a00 0000000000000000
        000000000000000c ffff880075aa2f50 ffff88006f659d90 ffffffff8169bec7
       Call Trace:
        [<ffffffff81726b6a>] dump_stack+0x54/0x74
        [<ffffffff810e4457>] lockdep_rcu_suspicious+0xe7/0x120
        [<ffffffff8169bec7>] cipso_v4_sock_setattr+0x187/0x1a0
        [<ffffffff8170f317>] netlbl_conn_setattr+0x187/0x190
        [<ffffffff8170f195>] ? netlbl_conn_setattr+0x5/0x190
        [<ffffffff8131ac9e>] selinux_netlbl_socket_connect+0xae/0xc0
        [<ffffffff81303025>] selinux_socket_connect+0x135/0x170
        [<ffffffff8119d127>] ? might_fault+0x57/0xb0
        [<ffffffff812fb146>] security_socket_connect+0x16/0x20
        [<ffffffff815d3ad3>] SYSC_connect+0x73/0x130
        [<ffffffff81739a85>] ? sysret_check+0x22/0x5d
        [<ffffffff810e5e2d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
        [<ffffffff81373d4e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
        [<ffffffff815d52be>] SyS_connect+0xe/0x10
        [<ffffffff81739a59>] system_call_fastpath+0x16/0x1b
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      42d64e1a
    • D
      7d1db4b2
  12. 29 Aug, 2013 2 commits
    • E
      Revert "SELinux: do not handle seclabel as a special flag" · 0b4bdb35
      Eric Paris committed
      This reverts commit 308ab70c.
      
      It breaks my FC6 test box.  /dev/pts is not mounted.  dmesg says
      
      SELinux: mount invalid.  Same superblock, different security settings
      for (dev devpts, type devpts)
      
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      0b4bdb35
    • A
      selinux: consider filesystem subtype in policies · 102aefdd
      Anand Avati committed
      Not considering the filesystem subtype has the following limitation. Support
      for SELinux in FUSE depends on the particular userspace
      filesystem, which is identified by the subtype. For example, GlusterFS,
      a FUSE-based filesystem, supports SELinux (by mounting and processing
      FUSE requests in different threads, avoiding the mount-time
      deadlock), whereas other FUSE-based filesystems (identified by a
      different subtype) have the mount-time deadlock.
      
      Considering the subtype of the filesystem in the SELinux policies
      allows us to specify a filesystem subtype in the following way:
      
      fs_use_xattr fuse.glusterfs gen_context(system_u:object_r:fs_t,s0);
      
      This way not all FUSE filesystems are put in the same bucket and
      subjected to the limitations of the other subtypes.
      Signed-off-by: Anand Avati <avati@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      102aefdd
  13. 01 Aug, 2013 1 commit
  14. 26 Jul, 2013 18 commits
    • C
      Add SELinux policy capability for always checking packet and peer classes. · 2be4d74f
      Chris PeBenito committed
      Currently the packet class in SELinux is not checked if there are no
      SECMARK rules in the security or mangle netfilter tables.  Some systems
      prefer that packets are always checked, for example, to protect the system
      should the netfilter rules fail to load or if the netfilter rules
      were maliciously flushed.
      
      Add the always_check_network policy capability which, when enabled, treats
      SECMARK as enabled, even if there are no netfilter SECMARK rules and
      treats peer labeling as enabled, even if there is no Netlabel or
      labeled IPSEC configuration.
      
      Includes definition of "redhat1" SELinux policy capability, which
      exists in the SELinux userspace library, to keep ordering correct.
      
      The SELinux userspace portion of this was merged last year, but this kernel
      change fell on the floor.
      Signed-off-by: Chris PeBenito <cpebenito@tresys.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      2be4d74f
    • P
      selinux: fix problems in netnode when BUG() is compiled out · b04eea88
      Paul Moore committed
      When the BUG() macro is disabled at compile time it can cause some
      problems in the SELinux netnode code: invalid return codes and
      uninitialized variables.  This patch fixes this by making sure we take
      some corrective action after the BUG() macro.
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      b04eea88
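The class of problem described in the commit above can be sketched outside the kernel. The example below simulates a build in which `BUG()` expands to nothing; `addr_len_for_family()` and the `FAMILY_*` values are hypothetical and are not taken from the netnode code. It only shows why a path that "cannot happen" still needs a defined return code and initialized outputs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simulate a configuration where BUG() is compiled out. */
#define BUG() do { } while (0)

/* Hypothetical family identifiers, for illustration only. */
enum family { FAMILY_V4 = 4, FAMILY_V6 = 6 };

/* Report the address length for a family via *len; 0 on success. */
static int addr_len_for_family(int family, size_t *len)
{
	assert(len != NULL);
	switch (family) {
	case FAMILY_V4:
		*len = 4;
		return 0;
	case FAMILY_V6:
		*len = 16;
		return 0;
	default:
		BUG();
		/*
		 * With BUG() compiled out, execution continues here, so
		 * take corrective action: give the output a defined value
		 * and return a real error code.
		 */
		*len = 0;
		return -EINVAL;
	}
}
```

Without the two statements after `BUG()`, the default branch would fall off the end of the function with `*len` untouched, which matches the "invalid return codes and uninitialized variables" the commit message mentions.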
    • E
      SELinux: use a helper function to determine seclabel · b43e725d
      Eric Paris committed
      Use a helper to determine if a superblock should have the seclabel flag
      rather than doing it in the function.  I'm going to use this in the
      security server as well.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      b43e725d
    • E
      SELinux: pass a superblock to security_fs_use · a64c54cf
      Eric Paris committed
      Rather than passing pointers to memory locations, strings, and other
      stuff, just give up on the separation and give security_fs_use the
      superblock.  It just makes the code easier to read (even if not easier to
      reuse on some other OS).
      Signed-off-by: Eric Paris <eparis@redhat.com>
      a64c54cf
    • E
      SELinux: do not handle seclabel as a special flag · 308ab70c
      Eric Paris committed
      Instead of having special code around the 'non-mount' seclabel mount option
      just handle it like the mount options.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      308ab70c
    • E
      SELinux: change sbsec->behavior to short · f936c6e5
      Eric Paris committed
      We only have 6 options, so char is good enough, but use a short as that
      packs nicely.  This shrinks the superblock_security_struct just a little
      bit.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      f936c6e5
    • E
      SELinux: renumber the superblock options · cfca0303
      Eric Paris committed
      Just to make it clear that we have mount time options and flags,
      separate them.  Since I decided to move the non-mount options above
      0x10, we need a short instead of a char.  (x86 padding says
      this takes up no additional space as we have a 3-byte hole in the
      structure.)
      Signed-off-by: Eric Paris <eparis@redhat.com>
      cfca0303
    • E
      SELinux: do all flags twiddling in one place · eadcabc6
      Eric Paris committed
      Currently we set the initialized and seclabel flags in one place, do some
      unrelated printk, and then unset the seclabel flag.  Eww.  Instead do the flag
      twiddling in one place in the code, not separated by an unrelated printk.  Also
      don't set and unset the seclabel flag.  Only set it if we need to.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      eadcabc6
    • E
      SELinux: rename SE_SBLABELSUPP to SBLABEL_MNT · 12f348b9
      Eric Paris committed
      Just a flag rename as we prepare to make it not so special.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      12f348b9
    • E
      SELinux: use define for number of bits in the mnt flags mask · af8e50cc
      Eric Paris committed
      We had this random hard coded value of '8' in the code (I put it there)
      for the number of bits to check for mount options.  This is stupid.  Instead
      use the #define we already have which tells us the number of mount
      options.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      af8e50cc
    • E
      SELinux: make it harder to get the number of mnt opts wrong · d355987f
      Eric Paris committed
      Instead of just hard coding a value, use the enum to our benefit.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      d355987f
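The two commits above replace a hard-coded bit count with a value derived from the option list. A minimal sketch of that idea follows; the enum members, `OPT_COUNT` and `count_set_opts()` are hypothetical names chosen for this example and are not the identifiers used in the SELinux source.

```c
#include <assert.h>

/* Hypothetical option list; the names are illustrative only. */
enum mnt_opt {
	OPT_FIRST,
	OPT_SECOND,
	OPT_THIRD,
	OPT_COUNT	/* must stay last: equals the number of options above */
};

/* The flag word is an unsigned int, so every option needs its own bit. */
static_assert(OPT_COUNT <= 32, "options must fit in an unsigned int");

/* Count the option bits set in flags, looking only at valid option bits. */
static int count_set_opts(unsigned int flags)
{
	int n = 0;

	/* Loop bound comes from the enum, not from a magic number like 8. */
	for (int i = 0; i < OPT_COUNT; i++)
		if (flags & (1u << i))
			n++;
	return n;
}
```

Adding a new option before `OPT_COUNT` automatically widens the loop, so the count cannot silently drift out of sync with the list.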
    • E
      SELinux: remove crazy contortions around proc · 40d3d0b8
      Eric Paris committed
      We check if the fsname is proc and if so set the proc superblock security
      struct flag.  We then check if the flag is set and use the string 'proc'
      for the fsname instead of just using the fsname.  What's the point?  It's
      always proc...  Get rid of the useless conditional.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      40d3d0b8
    • E
      SELinux: fix selinuxfs policy file on big endian systems · b138004e
      Eric Paris committed
      The /sys/fs/selinux/policy file is not valid on big endian systems like
      ppc64 or s390.  Let's see why:
      
      static int hashtab_cnt(void *key, void *data, void *ptr)
      {
      	int *cnt = ptr;
      	*cnt = *cnt + 1;
      
      	return 0;
      }
      
      static int range_write(struct policydb *p, void *fp)
      {
      	size_t nel;
      [...]
      	/* count the number of entries in the hashtab */
      	nel = 0;
      	rc = hashtab_map(p->range_tr, hashtab_cnt, &nel);
      	if (rc)
      		return rc;
      	buf[0] = cpu_to_le32(nel);
      	rc = put_entry(buf, sizeof(u32), 1, fp);
      
      So size_t is 64 bits.  But then we pass a pointer to it to
      hashtab_cnt.  hashtab_cnt thinks it is a 32 bit int and only deals with
      the first 4 bytes.  On x86_64, which is little endian, those first 4
      bytes are the least significant, so this works out fine.  On ppc64/s390
      those first 4 bytes of memory are the high order bits.  So at the end of
      the call to hashtab_map nel has a HUGE number.  But the least
      significant 32 bits are all 0's.
      
      We then pass that 64 bit number to cpu_to_le32() which happily truncates
      it to a 32 bit number and does endian swapping.  But the low 32 bits are
      all 0's.  So no matter how many entries are in the hashtab, big endian
      systems always say there are 0 entries because I screwed up the
      counting.
      
      The fix is easy.  Use a 32 bit int, as the hashtab_cnt expects, for nel.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      b138004e
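The byte-order argument in the commit above can be checked with a portable simulation. Rather than actually aliasing a `size_t` through an `int *`, the sketch below models the 8 bytes of the counter explicitly in a chosen byte order; `store()`, `load()` and `miscounted()` are hypothetical helpers written for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Write the low n bytes of v into buf in the given byte order. */
static void store(uint8_t *buf, int n, uint64_t v, int big_endian)
{
	assert(n >= 1 && n <= 8);
	for (int i = 0; i < n; i++) {
		int shift = big_endian ? 8 * (n - 1 - i) : 8 * i;
		buf[i] = (uint8_t)(v >> shift);
	}
}

/* Read an n-byte unsigned value from buf in the given byte order. */
static uint64_t load(const uint8_t *buf, int n, int big_endian)
{
	uint64_t v = 0;

	assert(n >= 1 && n <= 8);
	for (int i = 0; i < n; i++) {
		int shift = big_endian ? 8 * (n - 1 - i) : 8 * i;
		v |= (uint64_t)buf[i] << shift;
	}
	return v;
}

/*
 * Model a 64-bit counter whose first 4 bytes are incremented once per
 * entry as if they were a 32-bit int, then truncate the 64-bit value
 * to 32 bits, as the message says cpu_to_le32() does.
 */
static uint32_t miscounted(uint32_t entries, int big_endian)
{
	uint8_t mem[8];

	store(mem, 8, 0, big_endian);			/* nel = 0 */
	for (uint32_t i = 0; i < entries; i++) {
		uint32_t cnt = (uint32_t)load(mem, 4, big_endian);
		store(mem, 4, (uint64_t)cnt + 1, big_endian);	/* *cnt = *cnt + 1 */
	}
	return (uint32_t)load(mem, 8, big_endian);	/* keep low 32 bits */
}
```

With a little-endian layout the increments land in the low half and the count survives truncation; with a big-endian layout they land in the high half and the truncated result is zero, which is the symptom the commit describes.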
    • S
      SELinux: Enable setting security contexts on rootfs inodes. · 5c73fceb
      Stephen Smalley committed
      rootfs (ramfs) can support setting of security contexts
      by userspace due to the vfs fallback behavior of calling
      the security module to set the in-core inode state
      for security.* attributes when the filesystem does not
      provide an xattr handler.  No xattr handler required
      as the inodes are pinned in memory and have no backing
      store.
      
      This is useful in allowing early userspace to label individual
      files within a rootfs while still providing a policy-defined
      default via genfs.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      5c73fceb
    • W
      SELinux: Increase ebitmap_node size for 64-bit configuration · a767f680
      Waiman Long committed
      Currently, the ebitmap_node structure has a fixed size of 32 bytes. On
      a 32-bit system, the overhead is 8 bytes, leaving 24 bytes for being
      used as bitmaps. The overhead ratio is 1/4.
      
      On a 64-bit system, the overhead is 16 bytes. Therefore, only 16 bytes
      are left for bitmap purpose and the overhead ratio is 1/2. With a
      3.8.2 kernel, a boot-up operation will cause the ebitmap_get_bit()
      function to be called about 9 million times. The average number of
      ebitmap_node traversals is about 3.7.
      
      This patch increases the size of the ebitmap_node structure to 64
      bytes for 64-bit system to keep the overhead ratio at 1/4. This may
      also improve performance a little bit by making node to node traversal
      less frequent (< 2) as more bits are available in each node.
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      a767f680
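The overhead figures quoted in the commit above can be reproduced with simple arithmetic. The helpers below are hypothetical and take the node size and overhead (both in bytes) straight from the commit message; they do not inspect the real `ebitmap_node` structure.

```c
#include <assert.h>

/* Overhead as a percentage of the node size (both in bytes). */
static unsigned overhead_percent(unsigned node_size, unsigned overhead)
{
	assert(node_size > 0 && overhead <= node_size);
	return 100u * overhead / node_size;
}

/* Number of bitmap bits left in a node after subtracting the overhead. */
static unsigned bitmap_bits(unsigned node_size, unsigned overhead)
{
	assert(overhead <= node_size);
	return (node_size - overhead) * 8u;
}
```

Using the numbers from the message: a 32-byte node with 8 bytes of overhead gives 25% overhead and 192 bitmap bits, a 32-byte node with 16 bytes of overhead gives 50% and 128 bits, and a 64-byte node with 16 bytes of overhead returns to 25% with 384 bits.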
    • W
      SELinux: Reduce overhead of mls_level_isvalid() function call · fee71142
      Waiman Long committed
      While running the high_systime workload of the AIM7 benchmark on
      a 2-socket 12-core Westmere x86-64 machine running 3.10-rc4 kernel
      (with HT on), it was found that a pretty sizable amount of time was
      spent in the SELinux code. Below was the perf trace of the "perf
      record -a -s" of a test run at 1500 users:
      
        5.04%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
        1.96%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
        1.95%            ls  [kernel.kallsyms]     [k] find_next_bit
      
      The ebitmap_get_bit() was the hottest function in the perf-report
      output.  Both the ebitmap_get_bit() and find_next_bit() functions
      were, in fact, called by mls_level_isvalid(). As a result, the
      mls_level_isvalid() call consumed 8.95% of the total CPU time of
      all the 24 virtual CPUs which is quite a lot. The majority of the
      mls_level_isvalid() function invocations come from the socket creation
      system call.
      
      Looking at the mls_level_isvalid() function, it checks whether
      all the bits set in one ebitmap structure are also set in
      another one, and whether the highest set bit is no bigger than the one
      specified by the given policydb data structure. It does this in
      a bit-by-bit manner. So if the ebitmap structure has many bits set,
      the iteration loop will be done many times.
      
      The current code can be rewritten to use a similar algorithm as the
      ebitmap_contains() function with an additional check for the
      highest set bit. The ebitmap_contains() function was extended to
      cover an optional additional check for the highest set bit, and the
      mls_level_isvalid() function was modified to call ebitmap_contains().
      
      With that change, the perf trace showed that the used CPU time dropped
      to just 0.08% (ebitmap_contains + mls_level_isvalid) of the
      total, which is about 100X less than before.
      
        0.07%            ls  [kernel.kallsyms]     [k] ebitmap_contains
        0.05%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
        0.01%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
        0.01%            ls  [kernel.kallsyms]     [k] find_next_bit
      
      The remaining ebitmap_get_bit() and find_next_bit() function calls
      are made by other kernel routines as the new mls_level_isvalid()
      function will not call them anymore.
      
      This patch also improves the high_systime AIM7 benchmark result,
      though the improvement is not as impressive as is suggested by the
      reduction in CPU time spent in the ebitmap functions. The table below
      shows the performance change on the 2-socket x86-64 system (with HT
      on) mentioned above.
      
      +--------------+---------------+----------------+-----------------+
      |   Workload   | mean % change | mean % change  | mean % change   |
      |              | 10-100 users  | 200-1000 users | 1100-2000 users |
      +--------------+---------------+----------------+-----------------+
      | high_systime |     +0.1%     |     +0.9%      |     +2.6%       |
      +--------------+---------------+----------------+-----------------+
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      fee71142
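The difference between a bit-by-bit check and a word-at-a-time containment check, as described in the commit above, can be sketched on flat arrays of 64-bit words. This is an assumption-laden illustration: the kernel's ebitmap is a linked list of nodes, whereas `contains_bitwise()`, `contains_wordwise()` and `allowed_mask()` below are hypothetical functions over plain arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64

/* Slow variant: test every bit position individually. */
static int contains_bitwise(const uint64_t *outer, const uint64_t *inner,
			    size_t nwords, size_t last_bit)
{
	for (size_t bit = 0; bit < nwords * WORD_BITS; bit++) {
		uint64_t mask = UINT64_C(1) << (bit % WORD_BITS);

		if (!(inner[bit / WORD_BITS] & mask))
			continue;
		if (bit > last_bit || !(outer[bit / WORD_BITS] & mask))
			return 0;
	}
	return 1;
}

/* Mask of the bits in word w that are <= last_bit. */
static uint64_t allowed_mask(size_t w, size_t last_bit)
{
	size_t base = w * WORD_BITS;

	if (last_bit < base)
		return 0;
	if (last_bit - base >= WORD_BITS - 1)
		return UINT64_MAX;
	return (UINT64_C(1) << (last_bit - base + 1)) - 1;
}

/* Fast variant: one subset test and one range test per 64-bit word. */
static int contains_wordwise(const uint64_t *outer, const uint64_t *inner,
			     size_t nwords, size_t last_bit)
{
	assert(outer != NULL && inner != NULL);
	for (size_t w = 0; w < nwords; w++) {
		if (inner[w] & ~outer[w])
			return 0;	/* a bit in inner is missing from outer */
		if (inner[w] & ~allowed_mask(w, last_bit))
			return 0;	/* a set bit lies above last_bit */
	}
	return 1;
}
```

Both functions return the same answer, but the word-wise version does a constant amount of work per 64 bits regardless of how many bits are set, which is the kind of saving the perf numbers in the commit message point to.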
    • P
      selinux: remove the BUG_ON() from selinux_skb_xfrm_sid() · bed4d7ef
      Paul Moore committed
      Remove the BUG_ON() from selinux_skb_xfrm_sid() and propagate the
      error code up to the caller.  Also check the return values in the
      only caller function, selinux_skb_peerlbl_sid().
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      bed4d7ef
    • P
      selinux: cleanup the XFRM header · d1b17b09
      Paul Moore committed
      Remove the unused get_sock_isec() function and do some formatting
      fixes.
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      d1b17b09