1. 31 7月, 2012 1 次提交
  2. 21 5月, 2012 1 次提交
  3. 08 5月, 2012 2 次提交
  4. 07 5月, 2012 1 次提交
    • S
      tracing: Provide trace events interface for uprobes · f3f096cf
      Srikar Dronamraju 提交于
      Implements trace_event support for uprobes. In its current form
      it can be used to put probes at a specified offset in a file and
      dump the required registers when the code flow reaches the
      probed address.
      
      The following example shows how to dump the instruction pointer
      and %ax a register at the probed text address.  Here we are
      trying to probe zfree in /bin/zsh:
      
       # cd /sys/kernel/debug/tracing/
       # cat /proc/`pgrep  zsh`/maps | grep /bin/zsh | grep r-xp
       00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
       # objdump -T /bin/zsh | grep -w zfree
       0000000000446420 g    DF .text  0000000000000012  Base
       zfree # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
       # cat uprobe_events
       p:uprobes/p_zsh_0x46420 /bin/zsh:0x0000000000046420
       # echo 1 > events/uprobes/enable
       # sleep 20
       # echo 0 > events/uprobes/enable
       # cat trace
       # tracer: nop
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
                    zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120411103043.GB29437@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f3f096cf
  5. 05 5月, 2012 2 次提交
    • T
      init_task: Replace CONFIG_HAVE_GENERIC_INIT_TASK · a6359d1e
      Thomas Gleixner 提交于
      Now that all archs except ia64 are converted, replace the config and
      let the ia64 select CONFIG_ARCH_INIT_TASK
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20120503085035.867948914@linutronix.de
      a6359d1e
    • T
      init_task: Create generic init_task instance · a4a2eb49
      Thomas Gleixner 提交于
      All archs define init_task in the same way (except ia64, but there is
      no particular reason why ia64 cannot use the common version). Create a
      generic instance so all archs can be converted over.
      
      The config switch is temporary and will be removed when all archs are
      converted over.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20120503085034.092585287@linutronix.de
      a4a2eb49
  6. 26 4月, 2012 1 次提交
    • T
      smp: Provide generic idle thread allocation · 29d5e047
      Thomas Gleixner 提交于
      All SMP architectures have magic to fork the idle task and to store it
      for reusage when cpu hotplug is enabled. Provide a generic
      infrastructure for it.
      
      Create/reinit the idle thread for the cpu which is brought up in the
      generic code and hand the thread pointer to the architecture code via
      __cpu_up().
      
      Note, that fork_idle() is called via a workqueue, because this
      guarantees that the idle thread does not get a reference to a user
      space VM. This can happen when the boot process did not bring up all
      possible cpus and a later cpu_up() is initiated via the sysfs
      interface. In that case fork_idle() would be called in the context of
      the user space task and take a reference on the user space VM.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: x86@kernel.org
      Acked-by: NVenkatesh Pallipadi <venki@google.com>
      Link: http://lkml.kernel.org/r/20120420124557.102478630@linutronix.de
      29d5e047
  7. 14 4月, 2012 4 次提交
    • W
      ptrace,seccomp: Add PTRACE_SECCOMP support · fb0fadf9
      Will Drewry 提交于
      This change adds support for a new ptrace option, PTRACE_O_TRACESECCOMP,
      and a new return value for seccomp BPF programs, SECCOMP_RET_TRACE.
      
      When a tracer specifies the PTRACE_O_TRACESECCOMP ptrace option, the
      tracer will be notified, via PTRACE_EVENT_SECCOMP, for any syscall that
      results in a BPF program returning SECCOMP_RET_TRACE.  The 16-bit
      SECCOMP_RET_DATA mask of the BPF program return value will be passed as
      the ptrace_message and may be retrieved using PTRACE_GETEVENTMSG.
      
      If the subordinate process is not using seccomp filter, then no
      system call notifications will occur even if the option is specified.
      
      If there is no tracer with PTRACE_O_TRACESECCOMP when SECCOMP_RET_TRACE
      is returned, the system call will not be executed and an -ENOSYS errno
      will be returned to userspace.
      
      This change adds a dependency on the system call slow path.  Any future
      efforts to use the system call fast path for seccomp filter will need to
      address this restriction.
      Signed-off-by: NWill Drewry <wad@chromium.org>
      Acked-by: NEric Paris <eparis@redhat.com>
      
      v18: - rebase
           - comment fatal_signal check
           - acked-by
           - drop secure_computing_int comment
      v17: - ...
      v16: - update PT_TRACE_MASK to 0xbf4 so that STOP isn't clear on SETOPTIONS call (indan@nul.nu)
             [note PT_TRACE_MASK disappears in linux-next]
      v15: - add audit support for non-zero return codes
           - clean up style (indan@nul.nu)
      v14: - rebase/nochanges
      v13: - rebase on to 88ebdda6
             (Brings back a change to ptrace.c and the masks.)
      v12: - rebase to linux-next
           - use ptrace_event and update arch/Kconfig to mention slow-path dependency
           - drop all tracehook changes and inclusion (oleg@redhat.com)
      v11: - invert the logic to just make it a PTRACE_SYSCALL accelerator
             (indan@nul.nu)
      v10: - moved to PTRACE_O_SECCOMP / PT_TRACE_SECCOMP
      v9:  - n/a
      v8:  - guarded PTRACE_SECCOMP use with an ifdef
      v7:  - introduced
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      fb0fadf9
    • W
      seccomp: Add SECCOMP_RET_TRAP · bb6ea430
      Will Drewry 提交于
      Adds a new return value to seccomp filters that triggers a SIGSYS to be
      delivered with the new SYS_SECCOMP si_code.
      
      This allows in-process system call emulation, including just specifying
      an errno or cleanly dumping core, rather than just dying.
      Suggested-by: NMarkus Gutschke <markus@chromium.org>
      Suggested-by: NJulien Tinnes <jln@chromium.org>
      Signed-off-by: NWill Drewry <wad@chromium.org>
      Acked-by: NEric Paris <eparis@redhat.com>
      
      v18: - acked-by, rebase
           - don't mention secure_computing_int() anymore
      v15: - use audit_seccomp/skip
           - pad out error spacing; clean up switch (indan@nul.nu)
      v14: - n/a
      v13: - rebase on to 88ebdda6
      v12: - rebase on to linux-next
      v11: - clarify the comment (indan@nul.nu)
           - s/sigtrap/sigsys
      v10: - use SIGSYS, syscall_get_arch, updates arch/Kconfig
             note suggested-by (though original suggestion had other behaviors)
      v9:  - changes to SIGILL
      v8:  - clean up based on changes to dependent patches
      v7:  - introduction
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      bb6ea430
    • W
      seccomp: add SECCOMP_RET_ERRNO · acf3b2c7
      Will Drewry 提交于
      This change adds the SECCOMP_RET_ERRNO as a valid return value from a
      seccomp filter.  Additionally, it makes the first use of the lower
      16-bits for storing a filter-supplied errno.  16-bits is more than
      enough for the errno-base.h calls.
      
      Returning errors instead of immediately terminating processes that
      violate seccomp policy allow for broader use of this functionality
      for kernel attack surface reduction.  For example, a linux container
      could maintain a whitelist of pre-existing system calls but drop
      all new ones with errnos.  This would keep a logically static attack
      surface while providing errnos that may allow for graceful failure
      without the downside of do_exit() on a bad call.
      
      This change also changes the signature of __secure_computing.  It
      appears the only direct caller is the arm entry code and it clobbers
      any possible return value (register) immediately.
      Signed-off-by: NWill Drewry <wad@chromium.org>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Acked-by: NEric Paris <eparis@redhat.com>
      
      v18: - fix up comments and rebase
           - fix bad var name which was fixed in later revs
           - remove _int() and just change the __secure_computing signature
      v16-v17: ...
      v15: - use audit_seccomp and add a skip label. (eparis@redhat.com)
           - clean up and pad out return codes (indan@nul.nu)
      v14: - no change/rebase
      v13: - rebase on to 88ebdda6
      v12: - move to WARN_ON if filter is NULL
             (oleg@redhat.com, luto@mit.edu, keescook@chromium.org)
           - return immediately for filter==NULL (keescook@chromium.org)
           - change evaluation to only compare the ACTION so that layered
             errnos don't result in the lowest one being returned.
             (keeschook@chromium.org)
      v11: - check for NULL filter (keescook@chromium.org)
      v10: - change loaders to fn
       v9: - n/a
       v8: - update Kconfig to note new need for syscall_set_return_value.
           - reordered such that TRAP behavior follows on later.
           - made the for loop a little less indent-y
       v7: - introduced
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      acf3b2c7
    • W
      seccomp: add system call filtering using BPF · e2cfabdf
      Will Drewry 提交于
      [This patch depends on luto@mit.edu's no_new_privs patch:
         https://lkml.org/lkml/2012/1/30/264
       The whole series including Andrew's patches can be found here:
         https://github.com/redpig/linux/tree/seccomp
       Complete diff here:
         https://github.com/redpig/linux/compare/1dc65fed...seccomp
      ]
      
      This patch adds support for seccomp mode 2.  Mode 2 introduces the
      ability for unprivileged processes to install system call filtering
      policy expressed in terms of a Berkeley Packet Filter (BPF) program.
      This program will be evaluated in the kernel for each system call
      the task makes and computes a result based on data in the format
      of struct seccomp_data.
      
      A filter program may be installed by calling:
        struct sock_fprog fprog = { ... };
        ...
        prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog);
      
      The return value of the filter program determines if the system call is
      allowed to proceed or denied.  If the first filter program installed
      allows prctl(2) calls, then the above call may be made repeatedly
      by a task to further reduce its access to the kernel.  All attached
      programs must be evaluated before a system call will be allowed to
      proceed.
      
      Filter programs will be inherited across fork/clone and execve.
      However, if the task attaching the filter is unprivileged
      (!CAP_SYS_ADMIN) the no_new_privs bit will be set on the task.  This
      ensures that unprivileged tasks cannot attach filters that affect
      privileged tasks (e.g., setuid binary).
      
      There are a number of benefits to this approach. A few of which are
      as follows:
      - BPF has been exposed to userland for a long time
      - BPF optimization (and JIT'ing) are well understood
      - Userland already knows its ABI: system call numbers and desired
        arguments
      - No time-of-check-time-of-use vulnerable data accesses are possible.
      - system call arguments are loaded on access only to minimize copying
        required for system call policy decisions.
      
      Mode 2 support is restricted to architectures that enable
      HAVE_ARCH_SECCOMP_FILTER.  In this patch, the primary dependency is on
      syscall_get_arguments().  The full desired scope of this feature will
      add a few minor additional requirements expressed later in this series.
      Based on discussion, SECCOMP_RET_ERRNO and SECCOMP_RET_TRACE seem to be
      the desired additional functionality.
      
      No architectures are enabled in this patch.
      Signed-off-by: NWill Drewry <wad@chromium.org>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Reviewed-by: NIndan Zupancic <indan@nul.nu>
      Acked-by: NEric Paris <eparis@redhat.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      
      v18: - rebase to v3.4-rc2
           - s/chk/check/ (akpm@linux-foundation.org,jmorris@namei.org)
           - allocate with GFP_KERNEL|__GFP_NOWARN (indan@nul.nu)
           - add a comment for get_u32 regarding endianness (akpm@)
           - fix other typos, style mistakes (akpm@)
           - added acked-by
      v17: - properly guard seccomp filter needed headers (leann@ubuntu.com)
           - tighten return mask to 0x7fff0000
      v16: - no change
      v15: - add a 4 instr penalty when counting a path to account for seccomp_filter
             size (indan@nul.nu)
           - drop the max insns to 256KB (indan@nul.nu)
           - return ENOMEM if the max insns limit has been hit (indan@nul.nu)
           - move IP checks after args (indan@nul.nu)
           - drop !user_filter check (indan@nul.nu)
           - only allow explicit bpf codes (indan@nul.nu)
           - exit_code -> exit_sig
      v14: - put/get_seccomp_filter takes struct task_struct
             (indan@nul.nu,keescook@chromium.org)
           - adds seccomp_chk_filter and drops general bpf_run/chk_filter user
           - add seccomp_bpf_load for use by net/core/filter.c
           - lower max per-process/per-hierarchy: 1MB
           - moved nnp/capability check prior to allocation
             (all of the above: indan@nul.nu)
      v13: - rebase on to 88ebdda6
      v12: - added a maximum instruction count per path (indan@nul.nu,oleg@redhat.com)
           - removed copy_seccomp (keescook@chromium.org,indan@nul.nu)
           - reworded the prctl_set_seccomp comment (indan@nul.nu)
      v11: - reorder struct seccomp_data to allow future args expansion (hpa@zytor.com)
           - style clean up, @compat dropped, compat_sock_fprog32 (indan@nul.nu)
           - do_exit(SIGSYS) (keescook@chromium.org, luto@mit.edu)
           - pare down Kconfig doc reference.
           - extra comment clean up
      v10: - seccomp_data has changed again to be more aesthetically pleasing
             (hpa@zytor.com)
           - calling convention is noted in a new u32 field using syscall_get_arch.
             This allows for cross-calling convention tasks to use seccomp filters.
             (hpa@zytor.com)
           - lots of clean up (thanks, Indan!)
       v9: - n/a
       v8: - use bpf_chk_filter, bpf_run_filter. update load_fns
           - Lots of fixes courtesy of indan@nul.nu:
           -- fix up load behavior, compat fixups, and merge alloc code,
           -- renamed pc and dropped __packed, use bool compat.
           -- Added a hidden CONFIG_SECCOMP_FILTER to synthesize non-arch
              dependencies
       v7:  (massive overhaul thanks to Indan, others)
           - added CONFIG_HAVE_ARCH_SECCOMP_FILTER
           - merged into seccomp.c
           - minimal seccomp_filter.h
           - no config option (part of seccomp)
           - no new prctl
           - doesn't break seccomp on systems without asm/syscall.h
             (works but arg access always fails)
           - dropped seccomp_init_task, extra free functions, ...
           - dropped the no-asm/syscall.h code paths
           - merges with network sk_run_filter and sk_chk_filter
       v6: - fix memory leak on attach compat check failure
           - require no_new_privs || CAP_SYS_ADMIN prior to filter
             installation. (luto@mit.edu)
           - s/seccomp_struct_/seccomp_/ for macros/functions (amwang@redhat.com)
           - cleaned up Kconfig (amwang@redhat.com)
           - on block, note if the call was compat (so the # means something)
       v5: - uses syscall_get_arguments
             (indan@nul.nu,oleg@redhat.com, mcgrathr@chromium.org)
            - uses union-based arg storage with hi/lo struct to
              handle endianness.  Compromises between the two alternate
              proposals to minimize extra arg shuffling and account for
              endianness assuming userspace uses offsetof().
              (mcgrathr@chromium.org, indan@nul.nu)
            - update Kconfig description
            - add include/seccomp_filter.h and add its installation
            - (naive) on-demand syscall argument loading
            - drop seccomp_t (eparis@redhat.com)
       v4:  - adjusted prctl to make room for PR_[SG]ET_NO_NEW_PRIVS
            - now uses current->no_new_privs
              (luto@mit.edu,torvalds@linux-foundation.com)
            - assign names to seccomp modes (rdunlap@xenotime.net)
            - fix style issues (rdunlap@xenotime.net)
            - reworded Kconfig entry (rdunlap@xenotime.net)
       v3:  - macros to inline (oleg@redhat.com)
            - init_task behavior fixed (oleg@redhat.com)
            - drop creator entry and extra NULL check (oleg@redhat.com)
            - alloc returns -EINVAL on bad sizing (serge.hallyn@canonical.com)
            - adds tentative use of "always_unprivileged" as per
              torvalds@linux-foundation.org and luto@mit.edu
       v2:  - (patch 2 only)
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      e2cfabdf
  8. 24 3月, 2012 1 次提交
  9. 16 3月, 2012 1 次提交
    • C
      [PATCH v3] ipc: provide generic compat versions of IPC syscalls · 48b25c43
      Chris Metcalf 提交于
      When using the "compat" APIs, architectures will generally want to
      be able to make direct syscalls to msgsnd(), shmctl(), etc., and
      in the kernel we would want them to be handled directly by
      compat_sys_xxx() functions, as is true for other compat syscalls.
      
      However, for historical reasons, several of the existing compat IPC
      syscalls do not do this.  semctl() expects a pointer to the fourth
      argument, instead of the fourth argument itself.  msgsnd(), msgrcv()
      and shmat() expect arguments in different order.
      
      This change adds an ARCH_WANT_OLD_COMPAT_IPC config option that can be
      set to preserve this behavior for ports that use it (x86, sparc, powerpc,
      s390, and mips).  No actual semantics are changed for those architectures,
      and there is only a minimal amount of code refactoring in ipc/compat.c.
      
      Newer architectures like tile (and perhaps future architectures such
      as arm64 and unicore64) should not select this option, and thus can
      avoid having any IPC-specific code at all in their architecture-specific
      compat layer.  In the same vein, if this option is not selected, IPC_64
      mode is assumed, since that's what the <asm-generic> headers expect.
      
      The workaround code in "tile" for msgsnd() and msgrcv() is removed
      with this change; it also fixes the bug that shmat() and semctl() were
      not being properly handled.
      Reviewed-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      48b25c43
  10. 24 2月, 2012 1 次提交
    • I
      static keys: Introduce 'struct static_key', static_key_true()/false() and... · c5905afb
      Ingo Molnar 提交于
      static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]()
      
      So here's a boot tested patch on top of Jason's series that does
      all the cleanups I talked about and turns jump labels into a
      more intuitive to use facility. It should also address the
      various misconceptions and confusions that surround jump labels.
      
      Typical usage scenarios:
      
              #include <linux/static_key.h>
      
              struct static_key key = STATIC_KEY_INIT_TRUE;
      
              if (static_key_false(&key))
                      do unlikely code
              else
                      do likely code
      
      Or:
      
              if (static_key_true(&key))
                      do likely code
              else
                      do unlikely code
      
      The static key is modified via:
      
              static_key_slow_inc(&key);
              ...
              static_key_slow_dec(&key);
      
      The 'slow' prefix makes it abundantly clear that this is an
      expensive operation.
      
      I've updated all in-kernel code to use this everywhere. Note
      that I (intentionally) have not pushed through the rename
      blindly through to the lowest levels: the actual jump-label
      patching arch facility should be named like that, so we want to
      decouple jump labels from the static-key facility a bit.
      
      On non-jump-label enabled architectures static keys default to
      likely()/unlikely() branches.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NJason Baron <jbaron@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: mathieu.desnoyers@efficios.com
      Cc: davem@davemloft.net
      Cc: ddaney.cavm@gmail.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.huSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c5905afb
  11. 22 2月, 2012 1 次提交
  12. 17 2月, 2012 2 次提交
    • I
      uprobes/core: Clean up, refactor and improve the code · 7b2d81d4
      Ingo Molnar 提交于
      Make the uprobes code readable to me:
      
       - improve the Kconfig text so that a mere mortal gets some idea
         what CONFIG_UPROBES=y is really about
      
       - do trivial renames to standardize around the uprobes_*() namespace
      
       - clean up and simplify various code flow details
      
       - separate basic blocks of functionality
      
       - line break artifact and white space related removal
      
       - use standard local varible definition blocks
      
       - use vertical spacing to make things more readable
      
       - remove unnecessary volatile
      
       - restructure comment blocks to make them more uniform and
         more readable in general
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Link: http://lkml.kernel.org/n/tip-ewbwhb8o6navvllsauu7k07p@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      7b2d81d4
    • S
      uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints · 2b144498
      Srikar Dronamraju 提交于
      Add uprobes support to the core kernel, with x86 support.
      
      This commit adds the kernel facilities, the actual uprobes
      user-space ABI and perf probe support comes in later commits.
      
      General design:
      
      Uprobes are maintained in an rb-tree indexed by inode and offset
      (the offset here is from the start of the mapping). For a unique
      (inode, offset) tuple, there can be at most one uprobe in the
      rb-tree.
      
      Since the (inode, offset) tuple identifies a unique uprobe, more
      than one user may be interested in the same uprobe. This provides
      the ability to connect multiple 'consumers' to the same uprobe.
      
      Each consumer defines a handler and a filter (optional). The
      'handler' is run every time the uprobe is hit, if it matches the
      'filter' criteria.
      
      The first consumer of a uprobe causes the breakpoint to be
      inserted at the specified address and subsequent consumers are
      appended to this list.  On subsequent probes, the consumer gets
      appended to the existing list of consumers. The breakpoint is
      removed when the last consumer unregisters. For all other
      unregisterations, the consumer is removed from the list of
      consumers.
      
      Given a inode, we get a list of the mms that have mapped the
      inode. Do the actual registration if mm maps the page where a
      probe needs to be inserted/removed.
      
      We use a temporary list to walk through the vmas that map the
      inode.
      
      - The number of maps that map the inode, is not known before we
        walk the rmap and keeps changing.
      - extending vm_area_struct wasn't recommended, it's a
        size-critical data structure.
      - There can be more than one maps of the inode in the same mm.
      
      We add callbacks to the mmap methods to keep an eye on text vmas
      that are of interest to uprobes.  When a vma of interest is mapped,
      we insert the breakpoint at the right address.
      
      Uprobe works by replacing the instruction at the address defined
      by (inode, offset) with the arch specific breakpoint
      instruction. We save a copy of the original instruction at the
      uprobed address.
      
      This is needed for:
      
       a. executing the instruction out-of-line (xol).
       b. instruction analysis for any subsequent fixups.
       c. restoring the instruction back when the uprobe is unregistered.
      
      We insert or delete a breakpoint instruction, and this
      breakpoint instruction is assumed to be the smallest instruction
      available on the platform. For fixed size instruction platforms
      this is trivially true, for variable size instruction platforms
      the breakpoint instruction is typically the smallest (often a
      single byte).
      
      Writing the instruction is done by COWing the page and changing
      the instruction during the copy, this even though most platforms
      allow atomic writes of the breakpoint instruction. This also
      mirrors the behaviour of a ptrace() memory write to a PRIVATE
      file map.
      
      The core worker is derived from KSM's replace_page() logic.
      
      In essence, similar to KSM:
      
       a. allocate a new page and copy over contents of the page that
          has the uprobed vaddr
       b. modify the copy and insert the breakpoint at the required
          address
       c. switch the original page with the copy containing the
          breakpoint
       d. flush page tables.
      
      replace_page() is being replicated here because of some minor
      changes in the type of pages and also because Hugh Dickins had
      plans to improve replace_page() for KSM specific work.
      
      Instruction analysis on x86 is based on instruction decoder and
      determines if an instruction can be probed and determines the
      necessary fixups after singlestep.  Instruction analysis is done
      at probe insertion time so that we avoid having to repeat the
      same analysis every time a probe is hit.
      
      A lot of code here is due to the improvement/suggestions/inputs
      from Peter Zijlstra.
      
      Changelog:
      
      (v10):
       - Add code to clear REX.B prefix as suggested by Denys Vlasenko
         and Masami Hiramatsu.
      
      (v9):
       - Use insn_offset_modrm as suggested by Masami Hiramatsu.
      
      (v7):
      
       Handle comments from Peter Zijlstra:
      
       - Dont take reference to inode. (expect inode to uprobe_register to be sane).
       - Use PTR_ERR to set the return value.
       - No need to take reference to inode.
       - use PTR_ERR to return error value.
       - register and uprobe_unregister share code.
      
      (v5):
      
       - Modified del_consumer as per comments from Peter.
       - Drop reference to inode before dropping reference to uprobe.
       - Use i_size_read(inode) instead of inode->i_size.
       - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called.
       - Includes errno.h as recommended by Stephen Rothwell to fix a build issue
         on sparc defconfig
       - Remove restrictions while unregistering.
       - Earlier code leaked inode references under some conditions while
         registering/unregistering.
       - Continue the vma-rmap walk even if the intermediate vma doesnt
         meet the requirements.
       - Validate the vma found by find_vma before inserting/removing the
         breakpoint
       - Call del_consumer under mutex_lock.
       - Use hash locks.
       - Handle mremap.
       - Introduce find_least_offset_node() instead of close match logic in
         find_uprobe
       - Uprobes no more depends on MM_OWNER; No reference to task_structs
         while inserting/removing a probe.
       - Uses read_mapping_page instead of grab_cache_page so that the pages
         have valid content.
       - pass NULL to get_user_pages for the task parameter.
       - call SetPageUptodate on the new page allocated in write_opcode.
       - fix leaking a reference to the new page under certain conditions.
       - Include Instruction Decoder if Uprobes gets defined.
       - Remove const attributes for instruction prefix arrays.
       - Uses mm_context to know if the application is 32 bit.
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Also-written-by: NJim Keniston <jkenisto@us.ibm.com>
      Reviewed-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linux-mm <linux-mm@kvack.org>
      Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com
      [ Made various small edits to the commit log ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2b144498
  13. 13 1月, 2012 3 次提交
  14. 04 11月, 2011 1 次提交
    • R
      oprofile, x86: Reimplement nmi timer mode using perf event · dcfce4a0
      Robert Richter 提交于
      The legacy x86 nmi watchdog code was removed with the implementation
      of the perf based nmi watchdog. This broke Oprofile's nmi timer
      mode. To run nmi timer mode we relied on a continuous ticking nmi
      source which the nmi watchdog provided. The nmi tick was no longer
      available and current watchdog can not be used anymore since it runs
      with very long periods in the range of seconds. This patch
      reimplements the nmi timer mode using a perf counter nmi source.
      
      V2:
      * removing pr_info()
      * fix undefined reference to `__udivdi3' for 32 bit build
      * fix section mismatch of .cpuinit.data:nmi_timer_cpu_nb
      * removed nmi timer setup in arch/x86
      * implemented function stubs for op_nmi_init/exit()
      * made code more readable in oprofile_init()
      
      V3:
      * fix architectural initialization in oprofile_init()
      * fix CONFIG_OPROFILE_NMI_TIMER dependencies
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      dcfce4a0
  15. 03 8月, 2011 1 次提交
  16. 25 5月, 2011 1 次提交
  17. 10 4月, 2011 1 次提交
  18. 16 3月, 2011 1 次提交
  19. 15 2月, 2011 1 次提交
  20. 05 1月, 2011 1 次提交
    • G
      [S390] mutex: Introduce arch_mutex_cpu_relax() · 34b133f8
      Gerald Schaefer 提交于
      The spinning mutex implementation uses cpu_relax() in busy loops as a
      compiler barrier. Depending on the architecture, cpu_relax() may do more
      than needed in this specific mutex spin loops. On System z we also give
      up the time slice of the virtual cpu in cpu_relax(), which prevents
      effective spinning on the mutex.
      
      This patch replaces cpu_relax() in the spinning mutex code with
      arch_mutex_cpu_relax(), which can be defined by each architecture that
      selects HAVE_ARCH_MUTEX_CPU_RELAX. The default is still cpu_relax(), so
      this patch should not affect other architectures than System z for now.
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1290437256.7455.4.camel@thinkpad>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      34b133f8
  21. 26 11月, 2010 1 次提交
    • G
      mutexes, sched: Introduce arch_mutex_cpu_relax() · 335d7afb
      Gerald Schaefer 提交于
      The spinning mutex implementation uses cpu_relax() in busy loops as a
      compiler barrier. Depending on the architecture, cpu_relax() may do more
      than needed in this specific mutex spin loops. On System z we also give
      up the time slice of the virtual cpu in cpu_relax(), which prevents
      effective spinning on the mutex.
      
      This patch replaces cpu_relax() in the spinning mutex code with
      arch_mutex_cpu_relax(), which can be defined by each architecture that
      selects HAVE_ARCH_MUTEX_CPU_RELAX. The default is still cpu_relax(), so
      this patch should not affect other architectures than System z for now.
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1290437256.7455.4.camel@thinkpad>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      335d7afb
  22. 30 10月, 2010 1 次提交
    • S
      jump label: Add work around to i386 gcc asm goto bug · 45f81b1c
      Steven Rostedt 提交于
      On i386 (not x86_64) early implementations of gcc would have a bug
      with asm goto causing it to produce code like the following:
      
      (This was noticed by Peter Zijlstra)
      
         56 pushl 0
         67 nopl         jmp 0x6f
            popl
            jmp 0x8c
      
         6f              mov
                         test
                         je 0x8c
      
         8c mov
            call *(%esp)
      
      The jump added in the asm goto skipped over the popl that matched
      the pushl 0, which lead up to a quick crash of the system when
      the jump was enabled. The nopl is defined in the asm goto () statement
      and when tracepoints are enabled, the nop changes to a jump to the label
      that was specified by the asm goto. asm goto is suppose to tell gcc that
      the code in the asm might jump to an external label. Here gcc obviously
      fails to make that work.
      
      The bug report for gcc is here:
      
        http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46226
      
      The bug only appears on x86 when not compiled with
      -maccumulate-outgoing-args. This option is always set on x86_64 and it
      is also the work around for a function graph tracer i386 bug.
      (See commit: 746357d6)
      This explains why the bug only showed up on i386 when function graph
      tracer was not enabled.
      
      This patch now adds a CONFIG_JUMP_LABEL option that is default
      off instead of using jump labels by default. When jump labels are
      enabled, the -maccumulate-outgoing-args will be used (causing a
      slightly larger kernel image on i386). This option will exist
      until we have a way to detect if the gcc compiler in use is safe
      to use on all configurations without the work around.
      
      Note, there exists such a test, but for now we will keep the enabling
      of jump label as a manual option.
      
      Archs that know the compiler is safe with asm goto, may choose to
      select JUMP_LABEL and enable it by default.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Cause-discovered-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: David Daney <ddaney@caviumnetworks.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Richard Henderson <rth@redhat.com>
      LKML-Reference: <1288028746.3673.11.camel@laptop>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      45f81b1c
  23. 23 9月, 2010 1 次提交
    • J
      jump label: Base patch for jump label · bf5438fc
      Jason Baron 提交于
      base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
      assembly gcc mechanism, we can now branch to labels from an 'asm goto'
      statment. This allows us to create a 'no-op' fastpath, which can subsequently
      be patched with a jump to the slowpath code. This is useful for code which
      might be rarely used, but which we'd like to be able to call, if needed.
      Tracepoints are the current usecase that these are being implemented for.
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com>
      
      [ cleaned up some formating ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      bf5438fc
  24. 14 9月, 2010 1 次提交
    • M
      kprobes: Fix Kconfig dependency · 05ed160e
      Masami Hiramatsu 提交于
      Fix Kconfig dependency among Kprobes, optprobe and kallsyms.
      
      Kprobes uses kallsyms_lookup for finding target function and
      checking instruction boundary, thus CONFIG_KPROBES should select
      CONFIG_KALLSYMS.
      
      Optprobe is an optional feature which is supported on x86 arch,
      and it also uses kallsyms_lookup for checking instructions in
      the target function. Since KALLSYMS_ALL just adds symbols of
      kernel variables, it doesn't need to select KALLSYMS_ALL.
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Acked-by: Randy Dunlap <randy.dunlap@oracle.com>,
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Felipe Contreras <felipe.contreras@gmail.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: akpm <akpm@linux-foundation.org>
      LKML-Reference: <20100913102541.20260.85700.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      05ed160e
  25. 16 5月, 2010 2 次提交
    • F
      lockup_detector: Introduce CONFIG_HARDLOCKUP_DETECTOR · 23637d47
      Frederic Weisbecker 提交于
      This new config is deemed to simplify even more the lockup detector
      dependencies and can make it easier to bring a smooth sorting
      between archs that support the new generic lockup detector and those
      that still have their own, especially for those that are in the
      middle of this migration.
      
      Instead of checking whether we have CONFIG_LOCKUP_DETECTOR +
      CONFIG_PERF_EVENTS_NMI each time an arch wants to know if it needs
      to build its own lockup detector, take a shortcut with this new
      config. It is enabled only if the hardlockup detection part of
      the whole lockup detector is on.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      23637d47
    • F
      lockup_detector: Adapt CONFIG_PERF_EVENT_NMI to other archs · c01d4323
      Frederic Weisbecker 提交于
      CONFIG_PERF_EVENT_NMI is something that need to be enabled from the
      arch. This is fine on x86 as PERF_EVENTS is builtin but if other
      archs select it, they will need to handle the PERF_EVENTS dependency.
      
      Instead, handle the dependency in the generic layer:
      
      - archs need to tell what they support through HAVE_PERF_EVENTS_NMI
      - Enable magically PERF_EVENTS_NMI if we have PERF_EVENTS and
        HAVE_PERF_EVENTS_NMI.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      c01d4323
  26. 01 5月, 2010 1 次提交
    • F
      hw-breakpoints: Separate constraint space for data and instruction breakpoints · 0102752e
      Frederic Weisbecker 提交于
      There are two outstanding fashions for archs to implement hardware
      breakpoints.
      
      The first is to separate breakpoint address pattern definition
      space between data and instruction breakpoints. We then have
      typically distinct instruction address breakpoint registers
      and data address breakpoint registers, delivered with
      separate control registers for data and instruction breakpoints
      as well. This is the case of PowerPc and ARM for example.
      
      The second consists in having merged breakpoint address space
      definition between data and instruction breakpoint. Address
      registers can host either instruction or data address and
      the access mode for the breakpoint is defined in a control
      register. This is the case of x86 and Super H.
      
      This patch adds a new CONFIG_HAVE_MIXED_BREAKPOINTS_REGS config
      that archs can select if they belong to the second case. Those
      will have their slot allocation merged for instructions and
      data breakpoints.
      
      The others will have a separate slot tracking between data and
      instruction breakpoints.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      0102752e
  27. 16 3月, 2010 1 次提交
    • M
      kprobes: Hide CONFIG_OPTPROBES and set if arch supports optimized kprobes · 5cc718b9
      Masami Hiramatsu 提交于
      Hide CONFIG_OPTPROBES and set if the arch supports optimized
      kprobes (IOW, HAVE_OPTPROBES=y), since this option doesn't
      change the major behavior of kprobes, and workarounds for minor
      changes are documented.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Dieter Ries <mail@dieterries.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100315170054.31593.3153.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5cc718b9
  28. 26 2月, 2010 4 次提交
    • R
      oprofile/x86: remove OPROFILE_IBS config option · 013cfc50
      Robert Richter 提交于
      OProfile support for IBS is now for several versions in the
      kernel. The feature is stable now and the code can be activated
      permanently.
      
      As a side effect IBS now works also on nosmp configs.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      013cfc50
    • R
      oprofile: remove EXPERIMENTAL from the config option description · b309a294
      Robert Richter 提交于
      OProfile is already used for a long time and no longer experimental.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      b309a294
    • R
      oprofile: remove tracing build dependency · 18b4a4d5
      Robert Richter 提交于
      The commit
      
       1155de47 ring-buffer: Make it generally available
      
      already made ring-buffer available without the TRACING option
      enabled. This patch removes the TRACING dependency from oprofile.
      
      Fixes also oprofile configuration on ia64.
      
      The patch also applies to the 2.6.32-stable kernel.
      Reported-by: NTony Jones <tonyj@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      18b4a4d5
    • M
      kprobes: Introduce kprobes jump optimization · afd66255
      Masami Hiramatsu 提交于
      Introduce kprobes jump optimization arch-independent parts.
      Kprobes uses breakpoint instruction for interrupting execution
      flow, on some architectures, it can be replaced by a jump
      instruction and interruption emulation code. This gains kprobs'
      performance drastically.
      
      To enable this feature, set CONFIG_OPTPROBES=y (default y if the
      arch supports OPTPROBE).
      
      Changes in v9:
       - Fix a bug to optimize probe when enabling.
       - Check nearby probes can be optimize/unoptimize when disarming/arming
         kprobes, instead of registering/unregistering. This will help
         kprobe-tracer because most of probes on it are usually disabled.
      
      Changes in v6:
       - Cleanup coding style for readability.
       - Add comments around get/put_online_cpus().
      
      Changes in v5:
       - Use get_online_cpus()/put_online_cpus() for avoiding text_mutex
         deadlock.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Anders Kaseorg <andersk@ksplice.com>
      Cc: Tim Abbott <tabbott@ksplice.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133407.6725.81992.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      afd66255