1. 13 May 2010, 1 commit
    • x86, perf: P4 PMU -- use hash for p4_get_escr_idx() · 72001990
      Cyrill Gorcunov authored
      A linear search over all P4 MSRs would be fine, if only we did
      not use it in the event scheduling routine, which is quite
      time-critical. Let's use a hash instead; it should speed up
      scheduling significantly.
      
      v2: Steven proposed a gentler approach than issuing BUG() on
          error, so we now use WARN_ONCE().
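      A minimal sketch of the idea (the table size, hash multiplier
      and MSR values below are illustrative, not the actual kernel
      choices):

      static unsigned int escr_msr[] = { 0x3a0, 0x3a1, 0x3a2, 0x3b8, 0x3ba };

      #define ESCR_HASH_SIZE  64      /* power of two, > number of MSRs */
      #define ESCR_HASH(msr)  (((msr) * 2654435761u) >> 26)   /* 0..63 */

      static int escr_hash_table[ESCR_HASH_SIZE];

      static void build_escr_hash(void)       /* run once at init */
      {
              unsigned int i, h;

              for (i = 0; i < ESCR_HASH_SIZE; i++)
                      escr_hash_table[i] = -1;
              for (i = 0; i < ARRAY_SIZE(escr_msr); i++) {
                      h = ESCR_HASH(escr_msr[i]);
                      while (escr_hash_table[h] != -1)  /* linear probing */
                              h = (h + 1) & (ESCR_HASH_SIZE - 1);
                      escr_hash_table[h] = i;
              }
      }

      static int p4_get_escr_idx(unsigned int msr)  /* O(1), not a scan */
      {
              unsigned int h = ESCR_HASH(msr);

              while (escr_hash_table[h] != -1) {
                      if (escr_msr[escr_hash_table[h]] == msr)
                              return escr_hash_table[h];
                      h = (h + 1) & (ESCR_HASH_SIZE - 1);
              }
              WARN_ONCE(1, "P4 PMU: unknown ESCR MSR 0x%x\n", msr);
              return -1;
      }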
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100512174242.GA5190@lenovo>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 08 May 2010, 4 commits
    • x86, perf: P4 PMU -- check for proper event index in RAW events · c7993165
      Cyrill Gorcunov authored
      RAW events are special, and we should be ready for users
      passing in insane event index values.
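      The required check, sketched (the bitfield layout and the table
      name are illustrative):

      static int p4_validate_raw_event(struct perf_event *event)
      {
              /* event index encoded in the user-supplied RAW config */
              unsigned int v = (event->attr.config >> 25) & 0x3f;

              /* users can pass any value; reject out-of-table indices */
              if (v >= ARRAY_SIZE(p4_event_bind_map))
                      return -EINVAL;
              return 0;
      }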
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.315897547@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, perf: P4 PMU -- Get rid of redundant check for array index · 3f51b711
      Cyrill Gorcunov authored
      The caller has already done such a check. And the check was
      wrong anyway: it had to be '>=' rather than '>'.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112717.130386882@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, perf: P4 PMU -- protect sensible procedures from preemption · 137351e0
      Cyrill Gorcunov authored
      Steven reported:
      
      |
      | I'm getting:
      |
      | Pid: 3477, comm: perf Not tainted 2.6.34-rc6 #2727
      | Call Trace:
      |  [<ffffffff811c7565>] debug_smp_processor_id+0xd5/0xf0
      |  [<ffffffff81019874>] p4_hw_config+0x2b/0x15c
      |  [<ffffffff8107acbc>] ? trace_hardirqs_on_caller+0x12b/0x14f
      |  [<ffffffff81019143>] hw_perf_event_init+0x468/0x7be
      |  [<ffffffff810782fd>] ? debug_mutex_init+0x31/0x3c
      |  [<ffffffff810c68b2>] T.850+0x273/0x42e
      |  [<ffffffff810c6cab>] sys_perf_event_open+0x23e/0x3f1
      |  [<ffffffff81009e6a>] ? sysret_check+0x2e/0x69
      |  [<ffffffff81009e32>] system_call_fastpath+0x16/0x1b
      |
      | When running perf record in latest tip/perf/core
      |
      
      Because P4 counters are shared between HT threads, we
      synthetically divide the whole set of counters into two
      non-intersecting subsets. And while we're "borrowing" counters
      from these subsets we should not be preempted (strictly
      speaking, in p4_hw_config we just pre-set a reference to the
      subset, which saves some cycles in the scheduling routine if it
      runs on the same cpu). So use a get_cpu/put_cpu pair.

      Also, p4_pmu_schedule_events should use smp_processor_id rather
      than the raw_ version. This allows us to catch preemption
      issues (if there ever are any).
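      Sketched below; p4_setup_event_for_thread() is a made-up
      stand-in for the actual subset pre-set logic:

      static int p4_hw_config(struct perf_event *event)
      {
              int cpu = get_cpu();            /* disables preemption */
              int thread = p4_ht_thread(cpu); /* which HT sibling: 0 or 1 */
              int rc;

              /* pre-set the reference to this thread's counter subset */
              rc = p4_setup_event_for_thread(event, thread);

              put_cpu();                      /* re-enables preemption */
              return rc;
      }

      In p4_pmu_schedule_events, plain smp_processor_id() doubles as
      a debug check: with CONFIG_DEBUG_PREEMPT it warns if the caller
      is ever preemptible.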
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Tested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <20100508112716.963478928@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, perf: P4 PMU -- configure predefined events · de902d96
      Cyrill Gorcunov authored
      If an event is not RAW, we should not exit p4_hw_config early;
      we need to call x86_setup_perfctr as well.
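      I.e., the tail of p4_hw_config becomes, roughly (sketch):

      if (event->attr.type == PERF_TYPE_RAW) {
              rc = p4_validate_raw_event(event);  /* user config: validate */
              if (rc)
                      return rc;
      }

      /* do NOT return early here: predefined (non-RAW) events need
       * the common perfctr setup too */
      return x86_setup_perfctr(event);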
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 07 May 2010, 9 commits
  4. 05 May 2010, 1 commit
  5. 03 May 2010, 2 commits
  6. 01 May 2010, 5 commits
    • hw-breakpoints: Get the number of available registers on boot dynamically · feef47d0
      Frederic Weisbecker authored
      The generic breakpoint layer assumes that archs always know in
      advance the static number of address registers available to
      host breakpoints, through the HBP_NUM macro.

      However, this is not true for every arch. ARM, for example,
      needs to get this information dynamically to handle
      compatibility between different versions.

      To solve this, this patch drops the static HBP_NUM macro and
      lets the arch provide the number of available slots through a
      new hw_breakpoint_slots() function. For archs that select
      CONFIG_HAVE_MIXED_BREAKPOINTS_REGS it is called once, as a
      single register count covers instruction and data breakpoints
      together. For the others it is called twice: once to get the
      number of instruction breakpoint registers and once to get the
      number of data breakpoint registers, with the targeted type
      passed as a parameter of hw_breakpoint_slots().
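      For example (the probing helpers on the non-mixed side are
      hypothetical):

      #ifdef CONFIG_HAVE_MIXED_BREAKPOINTS_REGS
      int hw_breakpoint_slots(int type)
      {
              /* one register file serves both types, e.g. x86's 4 DRs */
              return 4;
      }
      #else
      int hw_breakpoint_slots(int type)
      {
              /* probed at boot, e.g. from the debug architecture id */
              if (type == TYPE_INST)
                      return arch_nr_inst_bp_regs();
              return arch_nr_data_bp_regs();
      }
      #endif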
      Reported-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
    • hw-breakpoints: Separate constraint space for data and instruction breakpoints · 0102752e
      Frederic Weisbecker authored
      There are two prevailing fashions in which archs implement
      hardware breakpoints.

      The first separates the breakpoint address space between data
      and instruction breakpoints: there are typically distinct
      instruction address breakpoint registers and data address
      breakpoint registers, with separate control registers for data
      and instruction breakpoints as well. This is the case on
      PowerPC and ARM, for example.

      The second merges the breakpoint address space for data and
      instruction breakpoints: an address register can host either an
      instruction or a data address, and the access mode of the
      breakpoint is defined in a control register. This is the case
      on x86 and SuperH.

      This patch adds a new CONFIG_HAVE_MIXED_BREAKPOINTS_REGS config
      symbol that archs can select if they belong to the second case.
      Those will have their slot allocation merged for instruction
      and data breakpoints.

      The others will have separate slot tracking for data and
      instruction breakpoints.
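      In terms of slot accounting this means, roughly (field names
      illustrative):

      struct bp_slots {
      #ifdef CONFIG_HAVE_MIXED_BREAKPOINTS_REGS
              unsigned int pinned;    /* one pool charged for either type */
      #else
              unsigned int pinned[2]; /* [TYPE_INST] and [TYPE_DATA] pools */
      #endif
      };

      An arch of the second kind simply adds
      'select HAVE_MIXED_BREAKPOINTS_REGS' to its Kconfig entry.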
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
    • hw-breakpoints: Change/Enforce some breakpoints policies · b2812d03
      Frederic Weisbecker authored
      The current breakpoint policies on x86 and SH are the
      following:

      - task bound breakpoints can only break on userspace addresses
      - cpu wide breakpoints can only break on kernel addresses

      The former rule prevents ptrace breakpoints from being set on
      kernel addresses, which is good. But as a side effect, task
      bound breakpoints can't break on kernel addresses at all.

      The latter rule simply makes no sense; there is no reason why
      we can't set breakpoints on userspace while performing cpu
      bound profiles.

      We want the following new policies:

      - task bound breakpoints can set userspace address breakpoints,
        with no particular privilege required.
      - task bound breakpoints can set kernelspace address
        breakpoints, but must be privileged to do so.
      - cpu bound breakpoints can do what they want, as they are
        privileged already.

      To implement these new policies, this patch checks whether we
      are dealing with a kernel address breakpoint. If so, and if the
      exclude_kernel parameter is set, we tell the user that the
      breakpoint is invalid, which makes a good generic ptrace
      protection. If exclude_kernel is not set, we ensure the user
      has the right privileges, as kernel breakpoints are quite
      sensitive (risk of trap recursion attacks and global
      performance impact).
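      The check then reads roughly as follows (sketch, assuming the
      usual capable() test for privilege):

      static int validate_hw_breakpoint_policy(struct perf_event *bp)
      {
              if (arch_check_bp_in_kernelspace(bp)) {
                      if (bp->attr.exclude_kernel)
                              return -EINVAL; /* user-only bp, kernel addr */
                      if (!capable(CAP_SYS_ADMIN))
                              return -EPERM;  /* kernel bps need privilege */
              }
              return 0;
      }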
      
      [ Paul Mundt: keep the addr space check for sh signal delivery
        and fix a double function declaration ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
    • hw-breakpoints: Tag ptrace breakpoint as exclude_kernel · 73266fc1
      Frederic Weisbecker authored
      Tag ptrace breakpoints with the exclude_kernel attribute set.
      This will make it easier to set generic policies on
      breakpoints, when it comes to ensuring that nobody unprivileged
      tries to breakpoint on the kernel.
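      E.g., when ptrace builds the attr for a tracee breakpoint
      (sketch):

      struct perf_event_attr attr;

      hw_breakpoint_init(&attr);      /* sane defaults for a bp attr */
      attr.bp_addr = addr;
      attr.bp_len  = HW_BREAKPOINT_LEN_1;
      attr.bp_type = HW_BREAKPOINT_X;
      attr.exclude_kernel = 1;        /* the tag this patch adds */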
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
    • x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests · bbd391a1
      Prarit Bhargava authored
      Upstream PV guests fail to boot because of a NULL pointer
      dereference in irq_force_complete_move(). It is possible for
      Xen guests to have irq_desc->chip_data = NULL.

      Test for a NULL chip_data pointer before attempting to complete
      an irq move.
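      The guard amounts to roughly:

      void irq_force_complete_move(int irq)
      {
              struct irq_desc *desc = irq_to_desc(irq);
              struct irq_cfg *cfg = desc->chip_data;

              if (!cfg)       /* Xen PV guests may leave this NULL */
                      return;

              __irq_complete_move(desc, cfg->vector);
      }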
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: <stable@kernel.org> [2.6.33]
  7. 30 Apr 2010, 1 commit
    • x86: Fix 'reservetop=' functionality · e67a807f
      Liang Li authored
      When the 'reservetop=0xbadc0de' kernel parameter is specified,
      the kernel stops booting due to an early_ioremap bug related to
      commit 8827247f.

      The root cause of the boot failure is that 'slot_virt[i]' is
      initialized in setup_arch->early_ioremap_init(), but later in
      setup_arch, parse_early_param() modifies 'FIXADDR_TOP' when
      'reservetop=0xbadc0de' is specified.

      The simplest fix might be to use __fix_to_virt(idx0) in
      '__early_ioremap' to get the updated value of 'FIXADDR_TOP',
      instead of referencing the old value from slot_virt[slot]
      directly.
      
      Changelog since v0:

      -v1: When reservetop is handled, FIXADDR_TOP gets adjusted;
           hence check prev_map, then re-initialize slot_virt and the
           PMD based on the new FIXADDR_TOP.

      -v2: Place fixup_early_ioremap, and hence call
           early_ioremap_init, in reserve_top_address to
           re-initialize slot_virt and the corresponding PMD when
           parsing 'reservetop'.

      -v3: Move fixup_early_ioremap out of reserve_top_address to
           make sure other clients of reserve_top_address, like Xen
           and lguest, won't break.
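      The v3 shape is roughly a fixup hook, invoked once FIXADDR_TOP
      has moved, that re-derives the cached layout:

      void __init fixup_early_ioremap(void)
      {
              int i;

              for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
                      if (prev_map[i]) {      /* live mapping: rebuilding */
                              WARN_ON(1);     /* now would break it */
                              break;
                      }
              }

              early_ioremap_init();   /* rebuild slot_virt[] and the PMD */
      }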
      Signed-off-by: Liang Li <liang.li@windriver.com>
      Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Wang Chen <wangchen@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <1272621711-8683-1-git-send-email-liang.li@windriver.com>
      [ fixed three small cleanliness details in fixup_early_ioremap() ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 29 Apr 2010, 1 commit
  9. 27 Apr 2010, 1 commit
  10. 25 Apr 2010, 1 commit
    • VMware Balloon driver · 453dc659
      Dmitry Torokhov authored
      This is a standalone version of the VMware Balloon driver.
      Ballooning is a technique that allows the hypervisor to
      dynamically limit the amount of memory available to the guest
      (with guest cooperation). In an overcommit scenario, when the
      hypervisor detects that it needs to shuffle some memory, it
      instructs the driver to allocate a certain number of pages, and
      the underlying memory is returned to the hypervisor. Later, the
      hypervisor may return memory to the guest by reattaching memory
      to the page frames and instructing the driver to "deflate" the
      balloon.
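      The inflate side, sketched (vmballoon_send_lock_page() stands
      for the hypervisor transport call):

      static int balloon_inflate_one(struct list_head *balloon_pages)
      {
              struct page *page;

              page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY | __GFP_NOWARN);
              if (!page)
                      return -ENOMEM; /* guest under pressure: back off */

              if (!vmballoon_send_lock_page(page_to_pfn(page))) {
                      __free_page(page);      /* hypervisor refused the pfn */
                      return -EIO;
              }

              /* the page belongs to the balloon until a deflate */
              list_add(&page->lru, balloon_pages);
              return 0;
      }

      Deflate is the reverse: unlock the pfn with the hypervisor,
      then __free_page() it back to the guest.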
      
      We are submitting a standalone driver because the KVM
      maintainer (Avi Kivity) expressed the (justified) opinion that
      our transport does not fit well into the virtqueue paradigm,
      and thus it does not make much sense to integrate with virtio.

      There were also some concerns about whether the current
      ballooning technique is the right thing. If a better framework
      to achieve this appears, we are prepared to evaluate and switch
      to using it, but in the meantime we'd like to get this driver
      upstream.

      We want the driver accepted in distributions so that users do
      not have to deal with an out-of-tree module, and many
      distributions have an "upstream first" requirement.

      The driver has been shipping for a number of years, and users
      running on the VMware platform will have it installed as part
      of VMware Tools even if it does not come from a distribution,
      so there should be no additional risk in pulling the driver
      into mainline. The driver only activates if the host is VMware,
      so everyone else should not be affected at all.
      Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 24 Apr 2010, 2 commits
    • x86: Disable large pages on CPUs with Atom erratum AAE44 · 7a0fc404
      H. Peter Anvin authored
      Atom erratum AAE44/AAF40/AAG38/AAH41:
      
      "If software clears the PS (page size) bit in a present PDE (page
      directory entry), that will cause linear addresses mapped through this
      PDE to use 4-KByte pages instead of using a large page after old TLB
      entries are invalidated. Due to this erratum, if a code fetch uses
      this PDE before the TLB entry for the large page is invalidated then
      it may fetch from a different physical address than specified by
      either the old large page translation or the new 4-KByte page
      translation. This erratum may also cause speculative code fetches from
      incorrect addresses."
      
      [http://download.intel.com/design/processor/specupdt/319536.pdf]
      
      Whereas commit 211b3d03 seems to work around erratum AAH41
      (mixed 4K TLBs), it only reduces the window of opportunity for
      the bug to occur and does not remove it entirely. This patch
      disables mixed 4K/4MB page tables completely, avoiding the page
      splitting and not tripping this processor issue.
      
      This is based on an original patch by Colin King.
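      The workaround boils down to clearing the PSE capability early
      on the affected model (sketch; the exact model/stepping gating
      follows the erratum list):

      static void intel_atom_pse_erratum(struct cpuinfo_x86 *c)
      {
              /* Atom: family 6, model 0x1c (erratum AAE44 and friends) */
              if (c->x86 == 6 && c->x86_model == 0x1c) {
                      clear_cpu_cap(c, X86_FEATURE_PSE);
                      pr_info("Disabling PSE due to Atom erratum AAE44\n");
              }
      }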
      Originally-by: Colin Ian King <colin.king@canonical.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com>
      Cc: <stable@kernel.org>
    • x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero · 7ce5a2b9
      H. Peter Anvin authored
      When we do a thread switch, we clear the outgoing FS/GS base if the
      corresponding selector is nonzero.  This is taken by __switch_to() as
      an entry invariant; it does not verify that it is true on entry.
      However, copy_thread() doesn't enforce this constraint, which can
      result in inconsistent results after fork().
      
      Make copy_thread() match the behavior of __switch_to().
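      In copy_thread(), sketched (p is the child, me the current
      task), this mirrors the invariant:

      /* a nonzero selector implies a zero saved base, as in __switch_to() */
      savesegment(fs, p->thread.fsindex);
      p->thread.fs = p->thread.fsindex ? 0 : me->thread.fs;
      savesegment(gs, p->thread.gsindex);
      p->thread.gs = p->thread.gsindex ? 0 : me->thread.gs;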
      Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4BD1E061.8030605@zytor.com>
      Cc: <stable@kernel.org>
  12. 23 Apr 2010, 1 commit
  13. 21 Apr 2010, 3 commits
  14. 20 Apr 2010, 8 commits