1. 17 Jun 2015, 1 commit
    • x86: don't use module_init in non-modular devicetree.c code · d54b675a
      Committed by Paul Gortmaker
      The devicetree.o is built for "OF" -- which is bool, and hence
      this code is either present or absent.  It will never be modular,
      so using module_init as an alias for __initcall can be somewhat
      misleading.
      
      Fix this up now, so that we can relocate module_init from
      init.h into module.h in the future.  If we don't do this, we'd
      have to add module.h to obviously non-modular code, and that
      would be a worse thing.
      
      Note that direct use of __initcall is discouraged, vs. one
      of the priority categorized subgroups.  As __initcall gets
      mapped onto device_initcall, our use of device_initcall
      directly in this change means that the runtime impact is
      zero -- it will remain at level 6 in initcall ordering.
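
      As a rough illustration of the pattern described above (the function
      name below is hypothetical, not the actual devicetree.c symbol), the
      conversion looks like this:

      	/* Before: module_init() used in code that can never be modular */
      	static int __init x86_dtb_setup(void)	/* hypothetical name */
      	{
      		return 0;
      	}
      	module_init(x86_dtb_setup);

      	/* After: name the initcall level explicitly.  For built-in code
      	 * module_init() already aliases to device_initcall(), so the
      	 * runtime ordering (level 6) is unchanged.
      	 */
      	device_initcall(x86_dtb_setup);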
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  2. 09 Jun 2015, 1 commit
    • Revert "perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization" · 15c12479
      Committed by Ingo Molnar
      This reverts commit c05199e5.
      
      Vince Weaver reported the following crash while perf fuzzing:
      
      [   79.473121] kernel BUG at mm/vmalloc.c:1335!
      [   79.694391] Call Trace:
      [   79.696997]  <IRQ>
      [   79.699090]  [<ffffffff811b2130>] get_vm_area_caller+0x40/0x50
      [   79.705505]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
      [   79.712414]  [<ffffffff810635e5>] __ioremap_caller+0x195/0x350
      [   79.718610]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
      [   79.725462]  [<ffffffff81427f6b>] ? debug_object_activate+0x14b/0x1e0
      [   79.732346]  [<ffffffff810637b7>] ioremap_nocache+0x17/0x20
      [   79.738283]  [<ffffffff81039f4d>] snb_uncore_imc_init_box+0x6d/0x90
      [   79.744945]  [<ffffffff81039cf7>] snb_uncore_imc_event_start+0xb7/0x110
      [   79.752020]  [<ffffffff81039d97>] snb_uncore_imc_event_add+0x47/0x60
      [   79.758832]  [<ffffffff81162cbb>] event_sched_in.isra.85+0xfb/0x330
      [   79.765519]  [<ffffffff81162f5f>] group_sched_in+0x6f/0x1e0
      [   79.771481]  [<ffffffff8101df1a>] ? native_sched_clock+0x2a/0x90
      [   79.777858]  [<ffffffff811637bc>] __perf_event_enable+0x25c/0x2a0
      [   79.784418]  [<ffffffff810f3e69>] ? tick_nohz_irq_exit+0x29/0x30
      [   79.790820]  [<ffffffff8115ef30>] ? cpu_clock_event_start+0x40/0x40
      [   79.797546]  [<ffffffff8115ef80>] remote_function+0x50/0x60
      [   79.803535]  [<ffffffff810f8cd1>] flush_smp_call_function_queue+0x81/0x180
      [   79.810840]  [<ffffffff810f9763>] generic_smp_call_function_single_interrupt+0x13/0x60
      [   79.819328]  [<ffffffff8104b5e8>] smp_trace_call_function_single_interrupt+0x38/0xc0
      [   79.827614]  [<ffffffff816de9be>] trace_call_function_single_interrupt+0x6e/0x80
      [   79.835465]  <EOI>
      [   79.837543]  [<ffffffff8156e8b5>] ? cpuidle_enter_state+0x65/0x160
      [   79.844377]  [<ffffffff8156e8a1>] ? cpuidle_enter_state+0x51/0x160
      [   79.851015]  [<ffffffff8156e9e7>] cpuidle_enter+0x17/0x20
      [   79.856791]  [<ffffffff810b6e39>] cpu_startup_entry+0x399/0x440
      [   79.863165]  [<ffffffff816c9ddb>] rest_init+0xbb/0xd0
      
      The offending commit is clearly confused as it moves heavy initialization
      work into IPI context.
      
      Revert it.
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 07 Jun 2015, 1 commit
  4. 04 Jun 2015, 1 commit
  5. 02 Jun 2015, 1 commit
    • x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers · 425be567
      Committed by Andy Lutomirski
      The early_idt_handlers asm code generates an array of entry
      points spaced nine bytes apart.  It's not really clear from that
      code or from the places that reference it what's going on, and
      the code only works in the first place because GAS never
      generates two-byte JMP instructions when jumping to global
      labels.
      
      Clean up the code to generate the correct array stride (member size)
      explicitly. This should be considerably more robust against
      screw-ups, as GAS will warn if a .fill directive has a negative
      count.  Using '. =' to advance would have been even more robust
      (it would generate an actual error if it tried to move
      backwards), but it would pad with nulls, confusing anyone who
      tries to disassemble the code.  The new scheme should be much
      clearer to future readers.
      
      While we're at it, improve the comments and rename the array and
      common code.
      
      Binutils may start relaxing jumps to non-weak labels.  If so,
      this change will fix our build, and we may need to backport this
      change.
      
      Before, on x86_64:
      
        0000000000000000 <early_idt_handlers>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 00 00 00 00          jmpq   9 <early_idt_handlers+0x9>
                                5: R_X86_64_PC32        early_idt_handler-0x4
        ...
          48:   66 90                   xchg   %ax,%ax
          4a:   6a 08                   pushq  $0x8
          4c:   e9 00 00 00 00          jmpq   51 <early_idt_handlers+0x51>
                                4d: R_X86_64_PC32       early_idt_handler-0x4
        ...
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   e9 00 00 00 00          jmpq   120 <early_idt_handler>
                                11c: R_X86_64_PC32      early_idt_handler-0x4
      
      After:
      
        0000000000000000 <early_idt_handler_array>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 14 01 00 00          jmpq   11d <early_idt_handler_common>
        ...
          48:   6a 08                   pushq  $0x8
          4a:   e9 d1 00 00 00          jmpq   120 <early_idt_handler_common>
          4f:   cc                      int3
          50:   cc                      int3
        ...
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   eb 03                   jmp    120 <early_idt_handler_common>
         11d:   cc                      int3
         11e:   cc                      int3
         11f:   cc                      int3
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Binutils <binutils@sourceware.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H.J. Lu <hjl.tools@gmail.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 27 May 2015, 4 commits
    • perf/x86: Tweak broken BIOS rules during check_hw_exists() · 68ab7476
      Committed by Don Zickus
      I stumbled upon an AMD box that had the BIOS using a hardware performance
      counter. Instead of printing out a warning and continuing, it failed and
      blocked further perf counter usage.
      
      Looking through the history, I found this commit:
      
        a5ebe0ba ("perf/x86: Check all MSRs before passing hw check")
      
      which tweaked the rules for a Xen guest on an almost identical box and now
      changed the behaviour.
      
      Unfortunately the rules were tweaked incorrectly and will always lead to
      MSR failures even though the MSRs are completely fine.
      
      What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists():
      
      <snip>
              for (i = 0; i < x86_pmu.num_counters; i++) {
                      reg = x86_pmu_config_addr(i);
                      ret = rdmsrl_safe(reg, &val);
                      if (ret)
                              goto msr_fail;
                      if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
                              bios_fail = 1;
                              val_fail = val;
                              reg_fail = reg;
                      }
              }
      
      <snip>
              /*
               * Read the current value, change it and read it back to see if it
               * matches, this is needed to detect certain hardware emulators
               * (qemu/kvm) that don't trap on the MSR access and always return 0s.
               */
              reg = x86_pmu_event_addr(0);
      				^^^^
      
      if the first perf counter is enabled, then this routine will always fail
      because the counter is running. :-(
      
              if (rdmsrl_safe(reg, &val))
                      goto msr_fail;
              val ^= 0xffffUL;
              ret = wrmsrl_safe(reg, val);
              ret |= rdmsrl_safe(reg, &val_new);
              if (ret || val != val_new)
                      goto msr_fail;
      
      The above bios_fail used to be a 'goto' which is why it worked in the past.
      
      Further, most vendors have migrated to using fixed counters to hide their
      evilness, hence this problem rarely shows up nowadays except on a few old boxes.
      
      I fixed my problem and kept the spirit of the original Xen fix by recording a
      safe, non-enabled register to be used for the read/write check.  Because it is
      not enabled, this passes on bare metal boxes (like metal), but should continue
      to throw an msr_fail on Xen guests because the register isn't emulated yet.
      
      Now I get a proper bios_fail error message and Xen should still see their
      msr_fail message (untested).
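
      A sketch of that approach, pieced together from the snippet quoted above
      (the exact patch may differ in detail):

      	int reg_safe = -1;

      	for (i = 0; i < x86_pmu.num_counters; i++) {
      		reg = x86_pmu_config_addr(i);
      		ret = rdmsrl_safe(reg, &val);
      		if (ret)
      			goto msr_fail;
      		if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
      			bios_fail = 1;
      			val_fail = val;
      			reg_fail = reg;
      		} else {
      			/* remember a counter the BIOS is NOT using */
      			reg_safe = i;
      		}
      	}

      	/* no unused counter at all: treat it like the emulator case */
      	if (reg_safe == -1)
      		goto msr_fail;

      	/* read/modify/read check on the safe (disabled) counter */
      	reg = x86_pmu_event_addr(reg_safe);
      	if (rdmsrl_safe(reg, &val))
      		goto msr_fail;
      	val ^= 0xffffUL;
      	ret = wrmsrl_safe(reg, val);
      	ret |= rdmsrl_safe(reg, &val_new);
      	if (ret || val != val_new)
      		goto msr_fail;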
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: george.dunlap@eu.citrix.com
      Cc: konrad.wilk@oracle.com
      Link: http://lkml.kernel.org/r/1431976608-56970-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel/pt: Untangle pt_buffer_reset_markers() · f73ec48c
      Committed by Alexander Shishkin
      Currently, pt_buffer_reset_markers() is a difficult-to-read knot of
      arithmetic with a redundant check for multiple-entry TOPA capability,
      a commented-out wakeup marker placement and a logical error with respect
      to stop marker placement.  The latter happens when the write head is not
      page aligned and results in the stop marker being placed one page earlier
      than it actually should be.
      
      All these problems only affect PT implementations that support
      multiple-entry TOPA tables (read: proper scatter-gather).
      
      For single-entry TOPA implementations, there is no functional impact.
      
      This patch deals with all of the above. Tested on both single-entry
      and multiple-entry TOPA PT implementations.
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/1432308626-18845-4-git-send-email-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Improve HT workaround GP counter constraint · cc1790cf
      Committed by Peter Zijlstra
      The (SNB/IVB/HSW) HT bug only affects events that can be programmed
      onto GP counters, therefore we should only limit the number of GP
      counters that can be used per cpu -- iow we should not constrain the
      FP counters.
      
      Furthermore, we should only enforce such a limit when there are in fact
      exclusive events being scheduled on either sibling.
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      [ Fixed build fail for the !CONFIG_CPU_SUP_INTEL case. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Fix event/group validation · b371b594
      Committed by Peter Zijlstra
      Commit 43b45780 ("perf/x86: Reduce stack usage of
      x86_schedule_events()") violated the rule that 'fake' scheduling, as
      used for event/group validation, should not change the event state.

      This went mostly unnoticed because repeated calls of
      x86_pmu::get_event_constraints() would give the same result. And
      x86_pmu::put_event_constraints() would mostly not do anything.
      
      Commit e979121b ("perf/x86/intel: Implement cross-HT corruption
      bug workaround") made the situation much worse by actually setting the
      event->hw.constraint value to NULL, so when validation and actual
      scheduling interact we get NULL ptr derefs.
      
      Fix it by removing the constraint pointer from the event and moving it
      back to an array, this time in cpuc instead of on the stack.
      
      validate_group()
        x86_schedule_events()
          event->hw.constraint = c; # store
      
            <context switch>
              perf_task_event_sched_in()
                ...
                  x86_schedule_events();
                    event->hw.constraint = c2; # store
      
                    ...
      
                    put_event_constraints(event); # assume failure to schedule
                      intel_put_event_constraints()
                        event->hw.constraint = NULL;
      
            <context switch end>
      
          c = event->hw.constraint; # read -> NULL
      
          if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
      
      This in particular is possible when the event in question is a
      cpu-wide event and group-leader, where the validate_group() tries to
      add an event to the group.
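
      A sketch of the data-structure move described above (names are
      illustrative and may not match the actual patch exactly):

      	/*
      	 * Before: the constraint lived on the event itself, so 'fake'
      	 * scheduling in validate_group() and real scheduling raced on
      	 * the same pointer:
      	 *
      	 *	event->hw.constraint = c;
      	 *
      	 * After: the constraint is kept in per-CPU scheduling state,
      	 * indexed by the event's slot in the collection being scheduled:
      	 */
      	struct cpu_hw_events {
      		/* ... other per-CPU PMU state ... */
      		struct event_constraint	*event_constraint[X86_PMC_IDX_MAX];
      	};

      	/* during scheduling: */
      	cpuc->event_constraint[i] = c;	/* instead of event->hw.constraint = c */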
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 43b45780 ("perf/x86: Reduce stack usage of x86_schedule_events()")
      Fixes: e979121b ("perf/x86/intel: Implement cross-HT corruption bug workaround")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 20 May 2015, 1 commit
    • x86/fpu: Disable XSAVES* support for now · e88221c5
      Committed by Ingo Molnar
      The kernel's handling of 'compacted' xsave state layout is buggy:
      
          http://marc.info/?l=linux-kernel&m=142967852317199
      
      I don't have such a system, and the description there is vague, but
      from extrapolation I guess that there were two kinds of bugs
      observed:
      
        - boot crashes, due to size calculations being wrong and the dynamic
          allocation allocating a too small xstate area. (This is now fixed
          in the new FPU code - but still present in stable kernels.)
      
        - FPU state corruption and ABI breakage: signal handlers may change
          the FPU state in the standard format, which the kernel then tries
          to restore in the compacted format.
      
      These breakages are scary, but they only occur on a small number of
      systems that have XSAVES* CPU support. Yet we have had XSAVES support
      in the upstream kernel for a large number of stable kernel releases,
      and the fixes are involved and unproven.
      
      So do the safe resolution first: disable XSAVES* support and only
      use the standard xstate format. This makes the code work and is
      easy to backport.
      
      On top of this we can work on enabling (and testing!) proper
      compacted format support, without backporting pressure, on top of the
      new, cleaned up FPU code.
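
      A minimal sketch of the kind of change this describes, assuming the
      standard setup_clear_cpu_cap() mechanism is used during xstate setup
      (the actual patch may differ):

      	/*
      	 * Stick to the standard (non-compacted) xstate format for now:
      	 * with the feature bit cleared, the rest of the kernel falls
      	 * back to the plain XSAVE/XRSTOR paths.
      	 */
      	setup_clear_cpu_cap(X86_FEATURE_XSAVES);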
      
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 18 May 2015, 1 commit
    • x86/mce: Fix MCE severity messages · 17fea54b
      Committed by Borislav Petkov
      Derek noticed that a critical MCE gets reported with the wrong
      error type description:
      
        [Hardware Error]: CPU 34: Machine Check Exception: 5 Bank 9: f200003f000100b0
        [Hardware Error]: RIP !INEXACT! 10:<ffffffff812e14c1> {intel_idle+0xb1/0x170}
        [Hardware Error]: TSC 49587b8e321cb
        [Hardware Error]: PROCESSOR 0:306e4 TIME 1431561296 SOCKET 1 APIC 29
        [Hardware Error]: Some CPUs didn't answer in synchronization
        [Hardware Error]: Machine check: Invalid
      				   ^^^^^^^
      
      The last line with 'Invalid' should have printed the high level
      MCE error type description we get from mce_severity, i.e.
      something like:
      
        [Hardware Error]: Machine check: Action required: data load error in a user process
      
      This happens because mce_no_way_out() iterates over all MCA banks and
      possibly overwrites the @msg argument which is used in the panic
      printing later.
      
      Change the behaviour to take the message of only the (last) critical MCE
      it detects.
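
      A simplified sketch of that behaviour (the loop shape follows
      mce_no_way_out(); quirk handling and the validp bookkeeping are
      omitted, and details may differ from the actual patch):

      	char *tmp;

      	for (i = 0; i < mca_cfg.banks; i++) {
      		m->status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
      		if (mce_severity(m, mca_cfg.tolerant, &tmp, true) >=
      		    MCE_PANIC_SEVERITY) {
      			/* only a critical MCE may set the panic message */
      			*msg = tmp;
      			ret = 1;
      		}
      	}
      	return ret;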
      Reported-by: Derek <denc716@gmail.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: http://lkml.kernel.org/r/1431936437-25286-3-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  9. 11 May 2015, 1 commit
    • perf/x86/rapl: Enable Broadwell-U RAPL support · 44b11fee
      Committed by Stephane Eranian
      This patch enables RAPL counters (energy consumption counters)
      support for Intel Broadwell-U processors (Model 61).
      
      To use:
      
        $ perf stat -a -I 1000 -e power/energy-cores/,power/energy-pkg/,power/energy-ram/ sleep 10
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jacob.jun.pan@linux.intel.com
      Cc: kan.liang@intel.com
      Cc: peterz@infradead.org
      Cc: sonnyrao@chromium.org
      Link: http://lkml.kernel.org/r/20150423070709.GA4970@thinkpad
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 08 May 2015, 1 commit
  11. 06 May 2015, 3 commits
  12. 27 Apr 2015, 2 commits
  13. 22 Apr 2015, 3 commits
    • perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver · 0140e614
      Committed by Sonny Rao
      This keeps all the related PCI IDs together in the driver where
      they are used.
      Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1429644791-25724-1-git-send-email-sonnyrao@chromium.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs · 80bcffb3
      Committed by Sonny Rao
      
      This uncore is the same as the Haswell desktop part but uses a
      different PCI ID.
      Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1429569247-16697-1-git-send-email-sonnyrao@chromium.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu · 3b6e0421
      Committed by Jiri Olsa
      The core_pmu does not define cpu_* callbacks, which handle
      allocation of 'struct cpu_hw_events::shared_regs' data,
      initialization of debug store and PMU_FL_EXCL_CNTRS counters.
      
      While this probably won't happen on bare metal, a virtual CPU can
      define x86_pmu.extra_regs together with PMU version 1 and thus
      be using core_pmu -> using shared_regs data without it being
      allocated. That could lead to the following panic:
      
      	BUG: unable to handle kernel NULL pointer dereference at (null)
      	IP: [<ffffffff8152cd4f>] _spin_lock_irqsave+0x1f/0x40
      
      	SNIP
      
      	 [<ffffffff81024bd9>] __intel_shared_reg_get_constraints+0x69/0x1e0
      	 [<ffffffff81024deb>] intel_get_event_constraints+0x9b/0x180
      	 [<ffffffff8101e815>] x86_schedule_events+0x75/0x1d0
      	 [<ffffffff810586dc>] ? check_preempt_curr+0x7c/0x90
      	 [<ffffffff810649fe>] ? try_to_wake_up+0x24e/0x3e0
      	 [<ffffffff81064ba2>] ? default_wake_function+0x12/0x20
      	 [<ffffffff8109eb16>] ? autoremove_wake_function+0x16/0x40
      	 [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90
      	 [<ffffffff811a9517>] ? __d_lookup+0xa7/0x150
      	 [<ffffffff8119db5f>] ? do_lookup+0x9f/0x230
      	 [<ffffffff811a993a>] ? dput+0x9a/0x150
      	 [<ffffffff8119c8f5>] ? path_to_nameidata+0x25/0x60
      	 [<ffffffff8119e90a>] ? __link_path_walk+0x7da/0x1000
      	 [<ffffffff8101d8f9>] ? x86_pmu_add+0xb9/0x170
      	 [<ffffffff8101d7a7>] x86_pmu_commit_txn+0x67/0xc0
      	 [<ffffffff811b07b0>] ? mntput_no_expire+0x30/0x110
      	 [<ffffffff8119c731>] ? path_put+0x31/0x40
      	 [<ffffffff8107c297>] ? current_fs_time+0x27/0x30
      	 [<ffffffff8117d170>] ? mem_cgroup_get_reclaim_stat_from_page+0x20/0x70
      	 [<ffffffff8111b7aa>] group_sched_in+0x13a/0x170
      	 [<ffffffff81014a29>] ? sched_clock+0x9/0x10
      	 [<ffffffff8111bac8>] ctx_sched_in+0x2e8/0x330
      	 [<ffffffff8111bb7b>] perf_event_sched_in+0x6b/0xb0
      	 [<ffffffff8111bc36>] perf_event_context_sched_in+0x76/0xc0
      	 [<ffffffff8111eb3b>] perf_event_comm+0x1bb/0x2e0
      	 [<ffffffff81195ee9>] set_task_comm+0x69/0x80
      	 [<ffffffff81195fe1>] setup_new_exec+0xe1/0x2e0
      	 [<ffffffff811ea68e>] load_elf_binary+0x3ce/0x1ab0
      
      Add cpu_(prepare|starting|dying) callbacks for core_pmu so that
      shared_regs data is allocated for it. AFAICS there's no harm in
      initializing the debug store and PMU_FL_EXCL_CNTRS for core_pmu
      either.
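
      A sketch of the wiring this describes, reusing the intel_pmu_* cpu
      callbacks (treat the exact callback names as an assumption here):

      	static __initconst const struct x86_pmu core_pmu = {
      		.name			= "core",
      		/* ... */
      		.cpu_prepare		= intel_pmu_cpu_prepare,
      		.cpu_starting		= intel_pmu_cpu_starting,
      		.cpu_dying		= intel_pmu_cpu_dying,
      	};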
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20150421152623.GC13169@krava.redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  14. 18 Apr 2015, 1 commit
  15. 17 Apr 2015, 5 commits
  16. 16 Apr 2015, 2 commits
    • x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() · fd0f86b6
      Committed by Oleg Nesterov
      When the TIF_SINGLESTEP tracee dequeues a signal,
      handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but
      leaves TIF_SINGLESTEP set.
      
      If the tracer does PTRACE_SINGLESTEP again, enable_single_step()
      sets X86_EFLAGS_TF but not TIF_FORCED_TF.  This means that the
      subsequent PTRACE_CONT doesn't clear X86_EFLAGS_TF, and the
      tracee gets the wrong SIGTRAP.
      
      Test-case (needs -O2 to avoid prologue insns in signal handler):
      
      	#include <unistd.h>
      	#include <stdio.h>
      	#include <signal.h>
      	#include <sys/ptrace.h>
      	#include <sys/wait.h>
      	#include <sys/user.h>
      	#include <assert.h>
      	#include <stddef.h>
      
      	void handler(int n)
      	{
      		asm("nop");
      	}
      
      	int child(void)
      	{
      		assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
      		signal(SIGALRM, handler);
      		kill(getpid(), SIGALRM);
      		return 0x23;
      	}
      
      	void *getip(int pid)
      	{
      		return (void*)ptrace(PTRACE_PEEKUSER, pid,
      					offsetof(struct user, regs.rip), 0);
      	}
      
      	int main(void)
      	{
      		int pid, status;
      
      		pid = fork();
      		if (!pid)
      			return child();
      
      		assert(wait(&status) == pid);
      		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM);
      
      		assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0);
      		assert(wait(&status) == pid);
      		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP);
      		assert((getip(pid) - (void*)handler) == 0);
      
      		assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0);
      		assert(wait(&status) == pid);
      		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP);
      		assert((getip(pid) - (void*)handler) == 1);
      
      		assert(ptrace(PTRACE_CONT, pid, 0,0) == 0);
      		assert(wait(&status) == pid);
      		assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23);
      
      		return 0;
      	}
      
      The last assert() fails because PTRACE_CONT wrongly triggers
      another single-step and X86_EFLAGS_TF can't be cleared by the
      debugger until the tracee does sys_rt_sigreturn().
      
      Change handle_signal() to do user_disable_single_step() if
      stepping; we do not need to preserve TIF_SINGLESTEP because we
      are going to do ptrace_notify(), and it is simply wrong to leak
      this bit.
      
      While at it, change the comment to explain why we also need to
      clear TF unconditionally after setup_rt_frame().
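
      A rough sketch of the resulting handle_signal() flow (simplified; only
      the stepping-related parts are shown, and the details are an
      approximation rather than the exact patch):

      	static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
      	{
      		bool stepping, failed;

      		/* ... syscall restart fixup elided ... */

      		/*
      		 * Drop TF before running the handler if we were stepping;
      		 * ptrace_notify() reports the signal and the debugger
      		 * re-arms stepping if it wants to, so there is no need to
      		 * preserve TIF_SINGLESTEP.
      		 */
      		stepping = test_thread_flag(TIF_SINGLESTEP);
      		if (stepping)
      			user_disable_single_step(current);

      		failed = (setup_rt_frame(ksig, regs) < 0);
      		if (!failed) {
      			/* clear TF unconditionally after setting up the frame */
      			regs->flags &= ~(X86_EFLAGS_DF | X86_EFLAGS_RF | X86_EFLAGS_TF);
      		}
      		signal_setup_done(failed, ksig, stepping);
      	}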
      
      Note: in the longer term we should probably change
      setup_sigcontext() to use get_flags() and then just remove this
      user_disable_single_step().  Also, the state of TIF_FORCED_TF can
      be wrong after restore_sigcontext(), which can set/clear TF; this
      needs another fix.
      
      This fix fixes the 'single_step_syscall_32' testcase in
      the x86 testsuite:
      
      Before:
      
      	~/linux/tools/testing/selftests/x86> ./single_step_syscall_32
      	[RUN]   Set TF and check nop
      	[OK]    Survived with TF set and 9 traps
      	[RUN]   Set TF and check int80
      	[OK]    Survived with TF set and 9 traps
      	[RUN]   Set TF and check a fast syscall
      	[WARN]  Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0
      	Trace/breakpoint trap (core dumped)
      
      After:
      
      	~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32
      	[RUN]   Set TF and check nop
      	[OK]    Survived with TF set and 9 traps
      	[RUN]   Set TF and check int80
      	[OK]    Survived with TF set and 9 traps
      	[RUN]   Set TF and check a fast syscall
      	[OK]    Survived with TF set and 39 traps
      	[RUN]   Fast syscall with TF cleared
      	[OK]    Nothing unexpected happened
      Reported-by: Evan Teran <eteran@alum.rit.edu>
      Reported-by: Pedro Alves <palves@redhat.com>
      Tested-by: Andres Freund <andres@anarazel.de>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      [ Added x86 self-test info. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86: mtrr: if: remove use of seq_printf return value · 3ac62bc0
      Committed by Joe Perches
      The seq_printf return value, because it's frequently misused,
      will eventually be converted to void.
      
      See: commit 1f33c41c ("seq_file: Rename seq_overflow() to
           seq_has_overflowed() and make public")
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 15 Apr 2015, 2 commits
  18. 13 Apr 2015, 1 commit
  19. 12 Apr 2015, 1 commit
  20. 11 Apr 2015, 5 commits
  21. 10 Apr 2015, 2 commits