1. 27 3月, 2013 2 次提交
    • E
      userns: Restrict when proc and sysfs can be mounted · 87a8ebd6
      Eric W. Biederman 提交于
      Only allow unprivileged mounts of proc and sysfs if they are already
      mounted when the user namespace is created.
      
      proc and sysfs are interesting because they have content that is
      per namespace, and so fresh mounts are needed when new namespaces
      are created while at the same time proc and sysfs have content that
      is shared between every instance.
      
      Respect the policy of who may see the shared content of proc and sysfs
      by only allowing new mounts if there was an existing mount at the time
      the user namespace was created.
      
      In practice there are only two interesting cases: proc and sysfs are
      mounted at their usual places, proc and sysfs are not mounted at all
      (some form of mount namespace jail).
      
      Cc: stable@vger.kernel.org
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      87a8ebd6
    • E
      userns: Don't allow creation if the user is chrooted · 3151527e
      Eric W. Biederman 提交于
      Guarantee that the policy of which files may be access that is
      established by setting the root directory will not be violated
      by user namespaces by verifying that the root directory points
      to the root of the mount namespace at the time of user namespace
      creation.
      
      Changing the root is a privileged operation, and as a matter of policy
      it serves to limit unprivileged processes to files below the current
      root directory.
      
      For reasons of simplicity and comprehensibility the privilege to
      change the root directory is gated solely on the CAP_SYS_CHROOT
      capability in the user namespace.  Therefore when creating a user
      namespace we must ensure that the policy of which files may be access
      can not be violated by changing the root directory.
      
      Anyone who runs a processes in a chroot and would like to use user
      namespace can setup the same view of filesystems with a mount
      namespace instead.  With this result that this is not a practical
      limitation for using user namespaces.
      
      Cc: stable@vger.kernel.org
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Reported-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      3151527e
  2. 26 3月, 2013 1 次提交
  3. 14 3月, 2013 4 次提交
    • T
      workqueue: convert to idr_alloc() · e68035fb
      Tejun Heo 提交于
      idr_get_new*() and friends are about to be deprecated.  Convert to the
      new idr_alloc() interface.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e68035fb
    • A
      kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER · 522cff14
      Andrew Morton 提交于
      __ARCH_HAS_SA_RESTORER is the preferred conditional for use in 3.9 and
      later kernels, per Kees.
      
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Julien Tinnes <jln@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      522cff14
    • K
      signal: always clear sa_restorer on execve · 2ca39528
      Kees Cook 提交于
      When the new signal handlers are set up, the location of sa_restorer is
      not cleared, leaking a parent process's address space location to
      children.  This allows for a potential bypass of the parent's ASLR by
      examining the sa_restorer value returned when calling sigaction().
      
      Based on what should be considered "secret" about addresses, it only
      matters across the exec not the fork (since the VMAs haven't changed
      until the exec).  But since exec sets SIG_DFL and keeps sa_restorer,
      this is where it should be fixed.
      
      Given the few uses of sa_restorer, a "set" function was not written
      since this would be the only use.  Instead, we use
      __ARCH_HAS_SA_RESTORER, as already done in other places.
      
      Example of the leak before applying this patch:
      
        $ cat /proc/$$/maps
        ...
        7fb9f3083000-7fb9f3238000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
        ...
        $ ./leak
        ...
        7f278bc74000-7f278be29000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
        ...
        1 0 (nil) 0x7fb9f30b94a0
        2 4000000 (nil) 0x7f278bcaa4a0
        3 4000000 (nil) 0x7f278bcaa4a0
        4 0 (nil) 0x7fb9f30b94a0
        ...
      
      [akpm@linux-foundation.org: use SA_RESTORER for backportability]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reported-by: NEmese Revfy <re.emese@gmail.com>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Julien Tinnes <jln@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ca39528
    • E
      userns: Don't allow CLONE_NEWUSER | CLONE_FS · e66eded8
      Eric W. Biederman 提交于
      Don't allowing sharing the root directory with processes in a
      different user namespace.  There doesn't seem to be any point, and to
      allow it would require the overhead of putting a user namespace
      reference in fs_struct (for permission checks) and incrementing that
      reference count on practically every call to fork.
      
      So just perform the inexpensive test of forbidding sharing fs_struct
      acrosss processes in different user namespaces.  We already disallow
      other forms of threading when unsharing a user namespace so this
      should be no real burden in practice.
      
      This updates setns, clone, and unshare to disallow multiple user
      namespaces sharing an fs_struct.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e66eded8
  4. 13 3月, 2013 2 次提交
  5. 09 3月, 2013 1 次提交
  6. 07 3月, 2013 2 次提交
    • S
      tracing: Do not return EINVAL in snapshot when not allocated · c9960e48
      Steven Rostedt (Red Hat) 提交于
      To use the tracing snapshot feature, writing a '1' into the snapshot
      file causes the snapshot buffer to be allocated if it has not already
      been allocated and dose a 'swap' with the main buffer, so that the
      snapshot now contains what was in the main buffer, and the main buffer
      now writes to what was the snapshot buffer.
      
      To free the snapshot buffer, a '0' is written into the snapshot file.
      
      To clear the snapshot buffer, any number but a '0' or '1' is written
      into the snapshot file. But if the file is not allocated it returns
      -EINVAL error code. This is rather pointless. It is better just to
      do nothing and return success.
      Acked-by: NHiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c9960e48
    • S
      tracing: Add help of snapshot feature when snapshot is empty · d8741e2e
      Steven Rostedt (Red Hat) 提交于
      When cat'ing the snapshot file, instead of showing an empty trace
      header like the trace file does, show how to use the snapshot
      feature.
      
      Also, this is a good place to show if the snapshot has been allocated
      or not. Users may want to "pre allocate" the snapshot to have a fast
      "swap" of the current buffer. Otherwise, a swap would be slow and might
      fail as it would need to allocate the snapshot buffer, and that might
      fail under tight memory constraints.
      
      Here's what it looked like before:
      
       # tracer: nop
       #
       # entries-in-buffer/entries-written: 0/0   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
      
      Here's what it looks like now:
      
       # tracer: nop
       #
       #
       # * Snapshot is freed *
       #
       # Snapshot commands:
       # echo 0 > snapshot : Clears and frees snapshot buffer
       # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
       #                      Takes a snapshot of the main buffer.
       # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
       #                      (Doesn't have to be '2' works with any number that
       #                       is not a '0' or '1')
      Acked-by: NHiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d8741e2e
  7. 03 3月, 2013 2 次提交
    • A
      fix compat_sys_rt_sigprocmask() · db61ec29
      Al Viro 提交于
      Converting bitmask to 32bit granularity is fine, but we'd better
      _do_ something with the result.  Such as "copy it to userland"...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      db61ec29
    • J
      trace/ring_buffer: handle 64bit aligned structs · 649508f6
      James Hogan 提交于
      Some 32 bit architectures require 64 bit values to be aligned (for
      example Meta which has 64 bit read/write instructions). These require 8
      byte alignment of event data too, so use
      !CONFIG_HAVE_64BIT_ALIGNED_ACCESS instead of !CONFIG_64BIT ||
      CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to decide alignment, and align
      buffer_data_page::data accordingly.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org> (previous version subtly different)
      649508f6
  8. 02 3月, 2013 8 次提交
  9. 01 3月, 2013 1 次提交
    • F
      irq: Don't re-enable interrupts at the end of irq_exit · 4cd5d111
      Frederic Weisbecker 提交于
      Commit 74eed016
      "irq: Ensure irq_exit() code runs with interrupts disabled"
      restore interrupts flags in the end of irq_exit() for archs
      that don't define __ARCH_IRQ_EXIT_IRQS_DISABLED.
      
      However always returning from irq_exit() with interrupts
      disabled should not be a problem for these archs. Prior to
      this commit this was already happening anytime we processed
      pending softirqs anyway.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      4cd5d111
  10. 28 2月, 2013 17 次提交