1. 15 5月, 2010 1 次提交
  2. 13 5月, 2010 1 次提交
    • C
      x86, perf: P4 PMU -- use hash for p4_get_escr_idx() · 72001990
      Cyrill Gorcunov 提交于
      Linear search over all p4 MSRs should be fine if only
      we would not use it in events scheduling routine which
      is pretty time critical. Lets use hashes. It should speed
      scheduling up significantly.
      
      v2: Steven proposed to use more gentle approach than issue
          BUG on error, so we use WARN_ONCE now
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100512174242.GA5190@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      72001990
  3. 08 5月, 2010 4 次提交
    • C
      x86, perf: P4 PMU -- check for proper event index in RAW events · c7993165
      Cyrill Gorcunov 提交于
      RAW events are special and we should be ready for user passing
      in insane event index values.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.315897547@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c7993165
    • C
      x86, perf: P4 PMU -- Get rid of redundant check for array index · 3f51b711
      Cyrill Gorcunov 提交于
      The caller already has done such a check.
      And it was wrong anyway, it had to be '>=' rather than '>'
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.130386882@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3f51b711
    • C
      x86, perf: P4 PMU -- protect sensible procedures from preemption · 137351e0
      Cyrill Gorcunov 提交于
      Steven reported:
      
      |
      | I'm getting:
      |
      | Pid: 3477, comm: perf Not tainted 2.6.34-rc6 #2727
      | Call Trace:
      |  [<ffffffff811c7565>] debug_smp_processor_id+0xd5/0xf0
      |  [<ffffffff81019874>] p4_hw_config+0x2b/0x15c
      |  [<ffffffff8107acbc>] ? trace_hardirqs_on_caller+0x12b/0x14f
      |  [<ffffffff81019143>] hw_perf_event_init+0x468/0x7be
      |  [<ffffffff810782fd>] ? debug_mutex_init+0x31/0x3c
      |  [<ffffffff810c68b2>] T.850+0x273/0x42e
      |  [<ffffffff810c6cab>] sys_perf_event_open+0x23e/0x3f1
      |  [<ffffffff81009e6a>] ? sysret_check+0x2e/0x69
      |  [<ffffffff81009e32>] system_call_fastpath+0x16/0x1b
      |
      | When running perf record in latest tip/perf/core
      |
      
      Due to the fact that p4 counters are shared between HT threads
      we synthetically divide the whole set of counters into two
      non-intersected subsets. And while we're "borrowing" counters
      from these subsets we should not be preempted (well, strictly
      speaking in p4_hw_config we just pre-set reference to the
      subset which allow to save some cycles in schedule routine
      if it happens on the same cpu). So use get_cpu/put_cpu pair.
      
      Also p4_pmu_schedule_events should use smp_processor_id rather
      than raw_ version. This allow us to catch up preemption issue
      (if there will ever be).
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Tested-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112716.963478928@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      137351e0
    • C
      x86, perf: P4 PMU -- configure predefined events · de902d96
      Cyrill Gorcunov 提交于
      If an event is not RAW we should not exit p4_hw_config
      early but call x86_setup_perfctr as well.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      de902d96
  4. 07 5月, 2010 9 次提交
  5. 03 5月, 2010 1 次提交
  6. 01 5月, 2010 3 次提交
    • F
      hw-breakpoints: Change/Enforce some breakpoints policies · b2812d03
      Frederic Weisbecker 提交于
      The current policies of breakpoints in x86 and SH are the following:
      
      - task bound breakpoints can only break on userspace addresses
      - cpu wide breakpoints can only break on kernel addresses
      
      The former rule prevents ptrace breakpoints to be set to trigger on
      kernel addresses, which is good. But as a side effect, we can't
      breakpoint on kernel addresses for task bound breakpoints.
      
      The latter rule simply makes no sense, there is no reason why we
      can't set breakpoints on userspace while performing cpu bound
      profiles.
      
      We want the following new policies:
      
      - task bound breakpoint can set userspace address breakpoints, with
      no particular privilege required.
      - task bound breakpoints can set kernelspace address breakpoints but
      must be privileged to do that.
      - cpu bound breakpoints can do what they want as they are privileged
      already.
      
      To implement these new policies, this patch checks if we are dealing
      with a kernel address breakpoint, if so and if the exclude_kernel
      parameter is set, we tell the user that the breakpoint is invalid,
      which makes a good generic ptrace protection.
      If we don't have exclude_kernel, ensure the user has the right
      privileges as kernel breakpoints are quite sensitive (risk of
      trap recursion attacks and global performance impacts).
      
      [ Paul Mundt: keep addr space check for sh signal delivery and fix
        double function declaration]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      b2812d03
    • F
      hw-breakpoints: Tag ptrace breakpoint as exclude_kernel · 73266fc1
      Frederic Weisbecker 提交于
      Tag ptrace breakpoints with the exclude_kernel attribute set. This
      will make it easier to set generic policies on breakpoints, when it
      comes to ensure nobody unpriviliged try to breakpoint on the kernel.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      73266fc1
    • P
      x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests · bbd391a1
      Prarit Bhargava 提交于
      Upstream PV guests fail to boot because of a NULL pointer in
      irq_force_complete_move().  It is possible that xen guests have
      irq_desc->chip_data = NULL.
      
      Test for NULL chip_data pointer before attempting to complete an irq move.
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: <stable@kernel.org> [2.6.33]
      bbd391a1
  7. 25 4月, 2010 1 次提交
    • D
      VMware Balloon driver · 453dc659
      Dmitry Torokhov 提交于
      This is a standalone version of VMware Balloon driver.  Ballooning is a
      technique that allows hypervisor dynamically limit the amount of memory
      available to the guest (with guest cooperation).  In the overcommit
      scenario, when hypervisor set detects that it needs to shuffle some
      memory, it instructs the driver to allocate certain number of pages, and
      the underlying memory gets returned to the hypervisor.  Later hypervisor
      may return memory to the guest by reattaching memory to the pageframes and
      instructing the driver to "deflate" balloon.
      
      We are submitting a standalone driver because KVM maintainer (Avi Kivity)
      expressed opinion (rightly) that our transport does not fit well into
      virtqueue paradigm and thus it does not make much sense to integrate with
      virtio.
      
      There were also some concerns whether current ballooning technique is the
      right thing.  If there appears a better framework to achieve this we are
      prepared to evaluate and switch to using it, but in the meantime we'd like
      to get this driver upstream.
      
      We want to get the driver accepted in distributions so that users do not
      have to deal with an out-of-tree module and many distributions have
      "upstream first" requirement.
      
      The driver has been shipping for a number of years and users running on
      VMware platform will have it installed as part of VMware Tools even if it
      will not come from a distribution, thus there should not be additional
      risk in pulling the driver into mainline.  The driver will only activate
      if host is VMware so everyone else should not be affected at all.
      Signed-off-by: NDmitry Torokhov <dtor@vmware.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      453dc659
  8. 24 4月, 2010 2 次提交
    • H
      x86: Disable large pages on CPUs with Atom erratum AAE44 · 7a0fc404
      H. Peter Anvin 提交于
      Atom erratum AAE44/AAF40/AAG38/AAH41:
      
      "If software clears the PS (page size) bit in a present PDE (page
      directory entry), that will cause linear addresses mapped through this
      PDE to use 4-KByte pages instead of using a large page after old TLB
      entries are invalidated. Due to this erratum, if a code fetch uses
      this PDE before the TLB entry for the large page is invalidated then
      it may fetch from a different physical address than specified by
      either the old large page translation or the new 4-KByte page
      translation. This erratum may also cause speculative code fetches from
      incorrect addresses."
      
      [http://download.intel.com/design/processor/specupdt/319536.pdf]
      
      Where as commit 211b3d03 seems to
      workaround errata AAH41 (mixed 4K TLBs) it reduces the window of
      opportunity for the bug to occur and does not totally remove it.  This
      patch disables mixed 4K/4MB page tables totally avoiding the page
      splitting and not tripping this processor issue.
      
      This is based on an original patch by Colin King.
      Originally-by: NColin Ian King <colin.king@canonical.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com>
      Cc: <stable@kernel.org>
      7a0fc404
    • H
      x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero · 7ce5a2b9
      H. Peter Anvin 提交于
      When we do a thread switch, we clear the outgoing FS/GS base if the
      corresponding selector is nonzero.  This is taken by __switch_to() as
      an entry invariant; it does not verify that it is true on entry.
      However, copy_thread() doesn't enforce this constraint, which can
      result in inconsistent results after fork().
      
      Make copy_thread() match the behavior of __switch_to().
      Reported-and-tested-by: NSamuel Thibault <samuel.thibault@inria.fr>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4BD1E061.8030605@zytor.com>
      Cc: <stable@kernel.org>
      7ce5a2b9
  9. 21 4月, 2010 1 次提交
  10. 20 4月, 2010 1 次提交
    • Z
      perf & kvm: Clean up some of the guest profiling callback API details · dcf46b94
      Zhang, Yanmin 提交于
      Fix some build bug and programming style issues:
      
       - use valid C
       - fix up various style details
      Signed-off-by: NZhang Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sheng Yang <sheng@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: oerg Roedel <joro@8bytes.org>
      Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Zachary Amsden <zamsden@redhat.com>
      Cc: zhiteng.huang@intel.com
      Cc: tim.c.chen@intel.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <1271729638.2078.624.camel@ymzhang.sh.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dcf46b94
  11. 19 4月, 2010 1 次提交
  12. 09 4月, 2010 1 次提交
    • F
      perf: Fix unsafe frame rewinding with hot regs fetching · ab285f2b
      Frederic Weisbecker 提交于
      When we fetch the hot regs and rewind to the nth caller, it
      might happen that we dereference a frame pointer outside the
      kernel stack boundaries, like in this example:
      
      	perf_trace_sched_switch+0xd5/0x120
              schedule+0x6b5/0x860
              retint_careful+0xd/0x21
      
      Since we directly dereference a userspace frame pointer here while
      rewinding behind retint_careful, this may end up in a crash.
      
      Fix this by simply using probe_kernel_address() when we rewind the
      frame pointer.
      
      This issue will have a much more proper fix in the next version of the
      perf_arch_fetch_caller_regs() API that will only need to rewind to the
      first caller.
      Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Archs <linux-arch@vger.kernel.org>
      ab285f2b
  13. 07 4月, 2010 5 次提交
  14. 06 4月, 2010 1 次提交
    • V
      perf, x86: Enable Nehalem-EX support · 134fbadf
      Vince Weaver 提交于
      According to Intel Software Devel Manual Volume 3B, the
      Nehalem-EX PMU is just like regular Nehalem (except for the
      uncore support, which is completely different).
      Signed-off-by: NVince Weaver <vweaver1@eecs.utk.edu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      134fbadf
  15. 04 4月, 2010 1 次提交
  16. 03 4月, 2010 7 次提交