1. 28 Feb, 2013 1 commit
  2. 24 Feb, 2013 2 commits
  3. 23 Feb, 2013 2 commits
  4. 28 Jan, 2013 1 commit
  5. 05 Jan, 2013 9 commits
  6. 15 Dec, 2012 1 commit
    • E
      userns: Require CAP_SYS_ADMIN for most uses of setns. · 5e4a0847
      Eric W. Biederman committed
      Andy Lutomirski <luto@amacapital.net> found a nasty little bug in
      the permissions of setns.  With unprivileged user namespaces it
      became possible to create new namespaces without privilege.
      
      However the setns calls were relaxed to only require CAP_SYS_ADMIN in
      the user namespace of the target namespace.
      
      Which made the following nasty sequence possible.
      
      pid = clone(CLONE_NEWUSER | CLONE_NEWNS);
      if (pid == 0) { /* child */
      	system("mount --bind /home/me/passwd /etc/passwd");
      }
      else if (pid != 0) { /* parent */
      	char path[PATH_MAX];
      	snprintf(path, sizeof(path), "/proc/%u/ns/mnt", pid);
      	fd = open(path, O_RDONLY);
      	setns(fd, 0);
      	system("su -");
      }
      
      Prevent this possibility by requiring CAP_SYS_ADMIN
      in the current user namespace when joining all but the user namespace.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      5e4a0847
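
The rule described in this commit message can be sketched as a small userspace model. This is not the kernel code: `ns_kind`, `caller_caps` and `may_join_ns()` are hypothetical names introduced only for illustration, and the model assumes that joining a user namespace still needs CAP_SYS_ADMIN in the target user namespace only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace kinds for this model; not kernel identifiers. */
enum ns_kind { NS_KIND_USER, NS_KIND_MNT, NS_KIND_NET, NS_KIND_IPC, NS_KIND_UTS };

/* Hypothetical summary of what the caller of setns() is allowed to do. */
struct caller_caps {
	bool admin_in_target_userns;  /* CAP_SYS_ADMIN in the user ns owning the target */
	bool admin_in_current_userns; /* CAP_SYS_ADMIN in the caller's own user ns */
};

/* Returns 0 if the join is allowed in this model, -EPERM otherwise. */
int may_join_ns(enum ns_kind kind, const struct caller_caps *caps)
{
	assert(caps != NULL);

	/* The pre-existing check: privilege over the target namespace. */
	if (!caps->admin_in_target_userns)
		return -EPERM;

	/* The added check: every kind except the user namespace also
	 * needs privilege in the caller's current user namespace. */
	if (kind != NS_KIND_USER && !caps->admin_in_current_userns)
		return -EPERM;

	return 0;
}
```

In this model the parent from the sequence above has privilege over the child's mount namespace but not in its own user namespace, so `may_join_ns(NS_KIND_MNT, ...)` yields `-EPERM`.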
  7. 12 Dec, 2012 1 commit
    • A
      mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB · 42d7395f
      Andi Kleen committed
      There was some desire in large applications using MAP_HUGETLB or
      SHM_HUGETLB to use 1GB huge pages on some mappings, and stay with 2MB on
      others.  This is useful together with NUMA policy: use 2MB interleaving
      on some mappings, but 1GB on local mappings.
      
      This patch extends the IPC/SHM syscall interfaces slightly to allow
      specifying the page size.
      
      It borrows some upper bits in the existing flag arguments and allows
      encoding the log of the desired page size in addition to the *_HUGETLB
      flag.  When 0 is specified the default size is used; this makes the
      change fully compatible.
      
      Extending the internal hugetlb code to handle this is straightforward.
      Instead of a single mount it just keeps an array of them and selects the
      right mount based on the specified page size.  When no page size is
      specified it uses the mount of the default page size.
      
      The change is not visible in /proc/mounts because internal mounts don't
      appear there.  It also has very little overhead: the additional mounts
      just consume a super block, but not more memory when not used.
      
      I also exported the new flags to the user headers (they were previously
      under __KERNEL__).  Right now only symbols for x86 and some other
      architecture for 1GB and 2MB are defined.  The interface should already
      work for all other architectures though.  Only architectures that define
      multiple hugetlb sizes actually need it (that is currently x86, tile,
      powerpc).  However tile and powerpc have user configurable hugetlb
      sizes, so it's not easy to add defines.  A program on those
      architectures would need to query sysfs and use the appropriate log2.
      
      [akpm@linux-foundation.org: cleanups]
      [rientjes@google.com: fix build]
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      42d7395f
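
The flag encoding described in this commit message can be illustrated with a short sketch. The `EX_`-prefixed constants are local copies of the x86 values from the Linux user headers (`MAP_HUGETLB`, `MAP_HUGE_SHIFT`, `MAP_HUGE_MASK`), renamed so they do not clash with system headers; the two helper functions are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Local copies of the x86 UAPI values, renamed for this sketch. */
#define EX_MAP_HUGETLB    0x40000u
#define EX_MAP_HUGE_SHIFT 26
#define EX_MAP_HUGE_MASK  0x3fu

/* Build the huge-page part of an mmap() flags word.
 * log2_page_size == 0 selects the default huge page size,
 * 21 selects 2MB pages and 30 selects 1GB pages. */
unsigned int hugetlb_flags(unsigned int log2_page_size)
{
	assert(log2_page_size <= EX_MAP_HUGE_MASK);
	return EX_MAP_HUGETLB | (log2_page_size << EX_MAP_HUGE_SHIFT);
}

/* Recover the encoded log2 page size from a flags word. */
unsigned int hugetlb_log2_from_flags(unsigned int flags)
{
	return (flags >> EX_MAP_HUGE_SHIFT) & EX_MAP_HUGE_MASK;
}
```

A caller would OR the result into the usual `mmap()` flags, for example `MAP_PRIVATE | MAP_ANONYMOUS | hugetlb_flags(30)` for a 1GB mapping. On architectures with configurable huge page sizes the log2 value would have to come from sysfs, as the message notes.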
  8. 20 Nov, 2012 3 commits
  9. 13 Oct, 2012 2 commits
    • J
      audit: make audit_inode take struct filename · adb5c247
      Jeff Layton committed
      Keep a pointer to the audit_names "slot" in struct filename.
      
      Have all of the audit_inode callers pass a struct filename pointer to
      audit_inode instead of a string pointer. If the aname field is already
      populated, then we can skip walking the list altogether and just use it
      directly.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      adb5c247
    • J
      vfs: define struct filename and have getname() return it · 91a27b2a
      Jeff Layton committed
      getname() is intended to copy pathname strings from userspace into a
      kernel buffer. The result is just a string in kernel space. It would
      however be quite helpful to be able to attach some ancillary info to
      the string.
      
      For instance, we could attach some audit-related info to reduce the
      amount of audit-related processing needed. When auditing is enabled,
      we could also call getname() on the string more than once and not
      need to recopy it from userspace.
      
      This patchset converts the getname()/putname() interfaces to return
      a struct instead of a string. For now, the struct just tracks the
      string in kernel space and the original userland pointer for it.
      
      Later, we'll add other information to the struct as it becomes
      convenient.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      91a27b2a
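
The idea of returning a struct rather than a bare string can be sketched in userspace. This is only an analogue under stated assumptions: `filename_sketch`, `getname_sketch()` and `putname_sketch()` are hypothetical names, `malloc` stands in for the kernel allocator, and an ordinary pointer stands in for the userland pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace analogue of the struct described above. */
struct filename_sketch {
	char *name;       /* private copy of the pathname string */
	const char *uptr; /* the original caller-supplied pointer */
};

/* Copy the path and remember where it came from.
 * Returns NULL if memory cannot be allocated. */
struct filename_sketch *getname_sketch(const char *user_path)
{
	struct filename_sketch *fn;
	size_t len;

	assert(user_path != NULL);

	fn = malloc(sizeof(*fn));
	if (!fn)
		return NULL;

	len = strlen(user_path) + 1;
	fn->name = malloc(len);
	if (!fn->name) {
		free(fn);
		return NULL;
	}
	memcpy(fn->name, user_path, len);
	fn->uptr = user_path;
	return fn;
}

/* Release a struct obtained from getname_sketch(). */
void putname_sketch(struct filename_sketch *fn)
{
	if (!fn)
		return;
	free(fn->name);
	free(fn);
}
```

Because the struct travels with the string, later fields (such as audit-related state) can be added without changing every caller's signature again.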
  10. 12 Oct, 2012 1 commit
    • J
      audit: set the name_len in audit_inode for parent lookups · bfcec708
      Jeff Layton committed
      Currently, this gets set mostly by happenstance when we call into
      audit_inode_child. While that might be a little more efficient, it seems
      wrong. If the syscall ends up failing before audit_inode_child ever gets
      called, then you'll have an audit_names record that shows the full path
      but has the parent inode info attached.
      
      Fix this by passing in a parent flag when we call audit_inode that gets
      set to the value of LOOKUP_PARENT. We can then fix up the pathname for
      the audit entry correctly from the get-go.
      
      While we're at it, clean up the no-op macro for audit_inode in the
      !CONFIG_AUDITSYSCALL case.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      bfcec708
  11. 09 Oct, 2012 1 commit
  12. 27 Sep, 2012 2 commits
  13. 07 Sep, 2012 1 commit
  14. 19 Aug, 2012 1 commit
  15. 31 Jul, 2012 4 commits
  16. 23 Jul, 2012 1 commit
  17. 14 Jul, 2012 2 commits
  18. 08 Jun, 2012 1 commit
  19. 01 Jun, 2012 4 commits
    • A
      switch aio and shm to do_mmap_pgoff(), make do_mmap() static · e3fc629d
      Al Viro committed
      after all, 0 bytes and 0 pages is the same thing...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      e3fc629d
    • A
      take security_mmap_file() outside of ->mmap_sem · 8b3ec681
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      8b3ec681
    • D
      ipc/mqueue: add rbtree node caching support · ce2d52cc
      Doug Ledford committed
      When I wrote the first patch that added the rbtree support for message
      queue insertion, it sped up the case where the queue was very full
      drastically from the original code.  It, however, slowed down the case
      where the queue was empty (not drastically though).
      
      This patch caches the last freed rbtree node struct so we can quickly
      reuse it when we get a new message.  This is the common path for any queue
      that very frequently goes from 0 to 1 then back to 0 messages in queue.
      
      Andrew Morton didn't like that we were doing a GFP_ATOMIC allocation in
      msg_insert, so this patch attempts to speculatively allocate a new node
      struct outside of the spin lock when we know we need it, but will still
      fall back to a GFP_ATOMIC allocation if it has to.
      
      Once I added the caching, the necessary various ret = ; spin_unlock
      gyrations in mq_timedsend were getting pretty ugly, so this also slightly
      refactors that function to streamline the flow of the code and the
      function exit.
      
      Finally, while working on getting performance back I made sure that all of
      the node structs were always fully initialized when they were first used,
      rendering the use of kzalloc unnecessary and a waste of CPU cycles.
      
      The net result of all of this is:
      
      1) We will avoid a GFP_ATOMIC allocation when possible, but fall back
         on it when necessary.
      
      2) We will speculatively allocate a node struct using GFP_KERNEL if our
         cache is empty (and save the struct to our cache if it's still empty
         after we have obtained the spin lock).
      
      3) The performance of the common queue empty case has significantly
         improved and is now much more in line with the older performance for
         this case.
      
      The performance changes are:
      
                  Old mqueue      new mqueue      new mqueue + caching
      queue empty
      send/recv   305/288ns       349/318ns       310/322ns
      
      I don't think we'll ever be able to get the recv performance back, but
      that's because the old recv performance was a direct result and
      consequence of the old method's abysmal send performance.  The recv path
      simply must do more so that the send path does not incur such a penalty
      under higher queue depths.
      
      As it turns out, the new caching code also sped up the various queue full
      cases relative to my last patch.  That could be because of the difference
      between the syscall path in 3.3.4-rc5 and 3.3.4-rc6, or because of the
      change in code flow in the mq_timedsend routine.  Regardless, I'll take
      it.  It wasn't huge, and I *would* say it was within the margin for error,
      but after many repeated runs what I'm seeing is that the old numbers trend
      slightly higher (about 10 to 20ns depending on which test is the one
      running).
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ce2d52cc
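
The caching scheme described in this commit message can be sketched as a single-slot cache. This is a single-threaded model, not the mqueue code: all names are hypothetical, `malloc` stands in for a GFP_KERNEL allocation, and comments mark where the real code would hold the spin lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an rbtree node struct. */
struct node_sketch {
	int priority;
};

/* Hypothetical queue state: at most one spare node is kept. */
struct queue_sketch {
	struct node_sketch *node_cache;
};

/* Before taking the lock: allocate speculatively only when the
 * cache looks empty. May return NULL. */
struct node_sketch *prealloc_node(const struct queue_sketch *q)
{
	assert(q != NULL);
	return q->node_cache ? NULL : malloc(sizeof(struct node_sketch));
}

/* With the lock held: keep the spare if the cache is still empty.
 * Returns a leftover node the caller should free after unlocking. */
struct node_sketch *stash_spare(struct queue_sketch *q, struct node_sketch *spare)
{
	if (spare && !q->node_cache) {
		q->node_cache = spare;
		return NULL;
	}
	return spare;
}

/* With the lock held: take the cached node. A NULL result means the
 * caller must fall back to an atomic allocation. */
struct node_sketch *take_node(struct queue_sketch *q)
{
	struct node_sketch *n = q->node_cache;

	q->node_cache = NULL;
	return n;
}

/* With the lock held: a node that became empty goes back into the
 * cache if there is room; otherwise it is returned for freeing. */
struct node_sketch *release_node(struct queue_sketch *q, struct node_sketch *n)
{
	if (!q->node_cache) {
		q->node_cache = n;
		return NULL;
	}
	return n;
}
```

A queue that oscillates between 0 and 1 messages would then alternate between `take_node()` and `release_node()` without calling the allocator at all, which is the common path the message refers to.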
    • D
      ipc/mqueue: strengthen checks on mqueue creation · 113289cc
      Doug Ledford committed
      We already check the mq attr struct if it's passed in, but now that the
      admin can set system wide defaults separate from maximums, it's actually
      possible to set the defaults to something that would overflow.  So, if
      there is no attr struct passed in to the open call, check the default
      values.
      
      While we are at it, simplify mq_attr_ok() by making it return 0 or an
      error condition, so that way if we add more tests to it later, we have the
      option of what error should be returned instead of the calling location
      having to pick a possibly inaccurate error code.
      
      [akpm@linux-foundation.org: s/ENOMEM/EOVERFLOW/]
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      113289cc
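
A check in the style this commit message describes, returning 0 or a specific error code, might look roughly like the following. It is a simplified sketch: `mq_attr_sketch` and `mq_attr_ok_sketch()` are hypothetical names, and the per-namespace limits and capability checks of the real function are left out.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical, reduced version of the queue attributes. */
struct mq_attr_sketch {
	long mq_maxmsg;  /* maximum number of messages in the queue */
	long mq_msgsize; /* maximum size of one message in bytes */
};

/* Returns 0 if the attributes are usable, or a negative errno that
 * names the specific problem. */
int mq_attr_ok_sketch(const struct mq_attr_sketch *attr)
{
	assert(attr != NULL);

	if (attr->mq_maxmsg <= 0 || attr->mq_msgsize <= 0)
		return -EINVAL;

	/* mq_maxmsg * mq_msgsize must fit in an unsigned long. */
	if ((unsigned long)attr->mq_msgsize >
	    ULONG_MAX / (unsigned long)attr->mq_maxmsg)
		return -EOVERFLOW;

	return 0;
}
```

Letting the checker choose the error code means the same function can validate both caller-supplied attributes and system-wide defaults, and the caller simply propagates whatever it returns.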