1. 01 5月, 2013 12 次提交
  2. 30 4月, 2013 1 次提交
  3. 03 4月, 2013 1 次提交
  4. 27 3月, 2013 1 次提交
    • E
      ipc: Restrict mounting the mqueue filesystem · a636b702
      Eric W. Biederman 提交于
      Only allow mounting the mqueue filesystem if the caller has CAP_SYS_ADMIN
      rights over the ipc namespace.   The principle here is if you create
      or have capabilities over it you can mount it, otherwise you get to live
      with what other people have mounted.
      
      This information is not particularly sensitive and mqueue essentially
      only reports which posix messages queues exist.  Still when creating a
      restricted environment for an application to live any extra
      information may be of use to someone with sufficient creativity.  The
      historical if imperfect way this information has been restricted has
      been not to allow mounts and restricting this to ipc namespace
      creators maintains the spirit of the historical restriction.
      
      Cc: stable@vger.kernel.org
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      a636b702
  5. 23 3月, 2013 1 次提交
    • V
      mqueue: sys_mq_open: do not call mnt_drop_write() if read-only · 38d78e58
      Vladimir Davydov 提交于
      mnt_drop_write() must be called only if mnt_want_write() succeeded,
      otherwise the mnt_writers counter will diverge.
      
      mnt_writers counters are used to check if remounting FS as read-only is
      OK, so after an extra mnt_drop_write() call, it would be impossible to
      remount mqueue FS as read-only.  Besides, on umount a warning would be
      printed like this one:
      
        =====================================
        [ BUG: bad unlock balance detected! ]
        3.9.0-rc3 #5 Not tainted
        -------------------------------------
        a.out/12486 is trying to release lock (sb_writers) at:
        mnt_drop_write+0x1f/0x30
        but there are no more locks to release!
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      38d78e58
  6. 09 3月, 2013 2 次提交
  7. 06 3月, 2013 1 次提交
  8. 04 3月, 2013 3 次提交
  9. 28 2月, 2013 1 次提交
  10. 24 2月, 2013 2 次提交
  11. 23 2月, 2013 2 次提交
  12. 28 1月, 2013 1 次提交
  13. 05 1月, 2013 9 次提交
  14. 15 12月, 2012 1 次提交
    • E
      userns: Require CAP_SYS_ADMIN for most uses of setns. · 5e4a0847
      Eric W. Biederman 提交于
      Andy Lutomirski <luto@amacapital.net> found a nasty little bug in
      the permissions of setns.  With unprivileged user namespaces it
      became possible to create new namespaces without privilege.
      
      However the setns calls were relaxed to only require CAP_SYS_ADMIN in
      the user nameapce of the targed namespace.
      
      Which made the following nasty sequence possible.
      
      pid = clone(CLONE_NEWUSER | CLONE_NEWNS);
      if (pid == 0) { /* child */
      	system("mount --bind /home/me/passwd /etc/passwd");
      }
      else if (pid != 0) { /* parent */
      	char path[PATH_MAX];
      	snprintf(path, sizeof(path), "/proc/%u/ns/mnt");
      	fd = open(path, O_RDONLY);
      	setns(fd, 0);
      	system("su -");
      }
      
      Prevent this possibility by requiring CAP_SYS_ADMIN
      in the current user namespace when joing all but the user namespace.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      5e4a0847
  15. 12 12月, 2012 1 次提交
    • A
      mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB · 42d7395f
      Andi Kleen 提交于
      There was some desire in large applications using MAP_HUGETLB or
      SHM_HUGETLB to use 1GB huge pages on some mappings, and stay with 2MB on
      others.  This is useful together with NUMA policy: use 2MB interleaving
      on some mappings, but 1GB on local mappings.
      
      This patch extends the IPC/SHM syscall interfaces slightly to allow
      specifying the page size.
      
      It borrows some upper bits in the existing flag arguments and allows
      encoding the log of the desired page size in addition to the *_HUGETLB
      flag.  When 0 is specified the default size is used, this makes the
      change fully compatible.
      
      Extending the internal hugetlb code to handle this is straight forward.
      Instead of a single mount it just keeps an array of them and selects the
      right mount based on the specified page size.  When no page size is
      specified it uses the mount of the default page size.
      
      The change is not visible in /proc/mounts because internal mounts don't
      appear there.  It also has very little overhead: the additional mounts
      just consume a super block, but not more memory when not used.
      
      I also exported the new flags to the user headers (they were previously
      under __KERNEL__).  Right now only symbols for x86 and some other
      architecture for 1GB and 2MB are defined.  The interface should already
      work for all other architectures though.  Only architectures that define
      multiple hugetlb sizes actually need it (that is currently x86, tile,
      powerpc).  However tile and powerpc have user configurable hugetlb
      sizes, so it's not easy to add defines.  A program on those
      architectures would need to query sysfs and use the appropiate log2.
      
      [akpm@linux-foundation.org: cleanups]
      [rientjes@google.com: fix build]
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      42d7395f
  16. 20 11月, 2012 1 次提交
    • E
      proc: Usable inode numbers for the namespace file descriptors. · 98f842e6
      Eric W. Biederman 提交于
      Assign a unique proc inode to each namespace, and use that
      inode number to ensure we only allocate at most one proc
      inode for every namespace in proc.
      
      A single proc inode per namespace allows userspace to test
      to see if two processes are in the same namespace.
      
      This has been a long requested feature and only blocked because
      a naive implementation would put the id in a global space and
      would ultimately require having a namespace for the names of
      namespaces, making migration and certain virtualization tricks
      impossible.
      
      We still don't have per superblock inode numbers for proc, which
      appears necessary for application unaware checkpoint/restart and
      migrations (if the application is using namespace file descriptors)
      but that is now allowd by the design if it becomes important.
      
      I have preallocated the ipc and uts initial proc inode numbers so
      their structures can be statically initialized.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      98f842e6