1. 22 12月, 2011 2 次提交
    • S
      x86: Add counter when debug stack is used with interrupts enabled · 42181186
      Steven Rostedt 提交于
      Mathieu Desnoyers pointed out a case that can cause issues with
      NMIs running on the debug stack:
      
        int3 -> interrupt -> NMI -> int3
      
      Because the interrupt changes the stack, the NMI will not see that
      it preempted the debug stack. Looking deeper at this case,
      interrupts only happen when the int3 is from userspace or in
      an a location in the exception table (fixup).
      
        userspace -> int3 -> interurpt -> NMI -> int3
      
      All other int3s that happen in the kernel should be processed
      without ever enabling interrupts, as the do_trap() call will
      panic the kernel if it is called to process any other location
      within the kernel.
      
      Adding a counter around the sections that enable interrupts while
      using the debug stack allows the NMI to also check that case.
      If the NMI sees that it either interrupted a task using the debug
      stack or the debug counter is non-zero, then it will have to
      change the IDT table to make the int3 not change stacks (which will
      corrupt the stack if it does).
      
      Note, I had to move the debug_usage functions out of processor.h
      and into debugreg.h because of the static inlined functions to
      inc and dec the debug_usage counter. __get_cpu_var() requires
      smp.h which includes processor.h, and would fail to build.
      
      Link: http://lkml.kernel.org/r/1323976535.23971.112.camel@gandalf.stny.rr.comReported-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Paul Turner <pjt@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      42181186
    • S
      x86: Keep current stack in NMI breakpoints · 228bdaa9
      Steven Rostedt 提交于
      We want to allow NMI handlers to have breakpoints to be able to
      remove stop_machine from ftrace, kprobes and jump_labels. But if
      an NMI interrupts a current breakpoint, and then it triggers a
      breakpoint itself, it will switch to the breakpoint stack and
      corrupt the data on it for the breakpoint processing that it
      interrupted.
      
      Instead, have the NMI check if it interrupted breakpoint processing
      by checking if the stack that is currently used is a breakpoint
      stack. If it is, then load a special IDT that changes the IST
      for the debug exception to keep the same stack in kernel context.
      When the NMI is done, it puts it back.
      
      This way, if the NMI does trigger a breakpoint, it will keep
      using the same stack and not stomp on the breakpoint data for
      the breakpoint it interrupted.
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      228bdaa9
  2. 21 12月, 2011 1 次提交
  3. 14 10月, 2011 1 次提交
  4. 04 8月, 2011 1 次提交
  5. 29 5月, 2011 1 次提交
    • L
      x86 idle: clarify AMD erratum 400 workaround · 02c68a02
      Len Brown 提交于
      The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting:
      1. Intel C1E is somehow involved
      2. All AMD processors with C1E are involved
      
      Use the string "amd_c1e" instead of simply "c1e" to clarify that
      this workaround is specific to AMD's version of C1E.
      Use the string "e400" to clarify that the workaround is specific
      to AMD processors with Erratum 400.
      
      This patch is text-substitution only, with no functional change.
      
      cc: x86@kernel.org
      Acked-by: NBorislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      02c68a02
  6. 26 1月, 2011 1 次提交
    • Y
      x86: Move llc_shared_map out of cpu_info · b3d7336d
      Yinghai Lu 提交于
      cpu_info is already with per_cpu, We can take llc_shared_map out
      of cpu_info, and declare it as per_cpu variable directly.
      
      So later referencing could be simple and directly instead of
      diving to find cpu_info at first.
      
      Also could make smp_store_cpu_info() much simple to avoid to do
      save and restore trick.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: Alok N Kataria <akataria@vmware.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Hans J. Koch <hjk@linutronix.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <4D3A16E8.5020608@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b3d7336d
  7. 13 1月, 2011 1 次提交
    • T
      ACPI, intel_idle: Cleanup idle= internal variables · d1896049
      Thomas Renninger 提交于
      Having four variables for the same thing:
        idle_halt, idle_nomwait, force_mwait and boot_option_idle_overrides
      is rather confusing and unnecessary complex.
      
      if idle= boot param is passed, only set up one variable:
      boot_option_idle_overrides
      
      Introduces following functional changes/fixes:
        - intel_idle driver does not register if any idle=xy
          boot param is passed.
        - processor_idle.c will also not register a cpuidle driver
          and get active if idle=halt is passed.
          Before a cpuidle driver with one (C1, halt) state got registered
          Now the default_idle function will be used which finally uses
          the same idle call to enter sleep state (safe_halt()), but
          without registering a whole cpuidle driver.
      
      That means idle= param will always avoid cpuidle drivers to register
      with one exception (same behavior as before):
      idle=nomwait
      may still register acpi_idle cpuidle driver, but C1 will not use
      mwait, but hlt. This can be a workaround for IO based deeper sleep
      states where C1 mwait causes problems.
      Signed-off-by: NThomas Renninger <trenn@suse.de>
      cc: x86@kernel.org
      Signed-off-by: NLen Brown <len.brown@intel.com>
      d1896049
  8. 30 12月, 2010 1 次提交
  9. 02 11月, 2010 1 次提交
  10. 02 10月, 2010 1 次提交
  11. 18 9月, 2010 1 次提交
    • H
      x86, hotplug: Use mwait to offline a processor, fix the legacy case · ea530692
      H. Peter Anvin 提交于
      The code in native_play_dead() has a number of problems:
      
      1. We should use MWAIT when available, to put ourselves into a deeper
         sleep state.
      2. We use the existence of CLFLUSH to determine if WBINVD is safe, but
         that is totally bogus -- WBINVD is 486+, whereas CLFLUSH is a much
         later addition.
      3. We should do WBINVD inside the loop, just in case of something like
         setting an A bit on page tables.  Pointed out by Arjan van de Ven.
      
      This code is based in part of a previous patch by Venki Pallipadi, but
      unlike that patch this one keeps all the detection code local instead
      of pre-caching a bunch of information.  We're shutting down the CPU;
      there is absolutely no hurry.
      
      This patch moves all the code to C and deletes the global
      wbinvd_halt() which is broken anyway.
      Originally-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.hl>
      LKML-Reference: <20090522232230.162239000@intel.com>
      ea530692
  12. 10 9月, 2010 1 次提交
  13. 02 8月, 2010 1 次提交
  14. 29 7月, 2010 3 次提交
  15. 21 5月, 2010 1 次提交
  16. 11 5月, 2010 1 次提交
  17. 08 5月, 2010 1 次提交
    • H
      x86: Clean up the hypervisor layer · e08cae41
      H. Peter Anvin 提交于
      Clean up the hypervisor layer and the hypervisor drivers, using an ops
      structure instead of an enumeration with if statements.
      
      The identity of the hypervisor, if needed, can be tested by testing
      the pointer value in x86_hyper.
      
      The MS-HyperV private state is moved into a normal global variable
      (it's per-system state, not per-CPU state).  Being a normal bss
      variable, it will be left at all zero on non-HyperV platforms, and so
      can generally be tested for HyperV-specific features without
      additional qualification.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Acked-by: NGreg KH <greg@kroah.com>
      Cc: Hank Janssen <hjanssen@microsoft.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Ky Srinivasan <ksrinivasan@novell.com>
      LKML-Reference: <4BE49778.6060800@zytor.com>
      e08cae41
  18. 07 5月, 2010 1 次提交
    • K
      x86: Detect running on a Microsoft HyperV system · a2a47c6c
      Ky Srinivasan 提交于
      This patch integrates HyperV detection within the framework currently
      used by VmWare. With this patch, we can avoid having to replicate the
      HyperV detection code in each of the Microsoft HyperV drivers.
      
      Reworked and tweaked by Greg K-H to build properly.
      Signed-off-by: NK. Y. Srinivasan <ksrinivasan@novell.com>
      LKML-Reference: <20100506190841.GA1605@kroah.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Vadim Rozenfeld <vrozenfe@redhat.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: "K.Prasad" <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Hank Janssen <hjanssen@microsoft.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      a2a47c6c
  19. 26 3月, 2010 2 次提交
    • P
      x86, ptrace: Fix block-step · ea8e61b7
      Peter Zijlstra 提交于
      Implement ptrace-block-step using TIF_BLOCKSTEP which will set
      DEBUGCTLMSR_BTF when set for a task while preserving any other
      DEBUGCTLMSR bits.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100325135414.017536066@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ea8e61b7
    • P
      x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Peter Zijlstra 提交于
      Support for the PMU's BTS features has been upstreamed in
      v2.6.32, but we still have the old and disabled ptrace-BTS,
      as Linus noticed it not so long ago.
      
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      
      Its users are ptrace-block-step and ptrace-bts, since ptrace-bts
      was never used and ptrace-block-step can be implemented using a
      much simpler approach.
      
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      faa4602e
  20. 20 2月, 2010 1 次提交
    • F
      hw-breakpoint: Keep track of dr7 local enable bits · 326264a0
      Frederic Weisbecker 提交于
      When the user enables breakpoints through dr7, he can choose
      between "local" or "global" enable bits but given how linux is
      implemented, both have the same effect.
      
      That said we don't keep track how the user enabled the breakpoints
      so when the user requests the dr7 value, we only translate the
      "enabled" status using the global enabled bits. It means that if
      the user enabled a breakpoint using the local enabled bit, reading
      back dr7 will set the global bit and clear the local one.
      
      Apps like Wine expect a full dr7 POKEUSER/PEEKUSER match for emulated
      softwares that implement old reverse engineering protection schemes.
      
      We fix that by keeping track of the whole dr7 value given by the user
      in the thread structure to drop this bug. We'll think about
      something more proper later.
      
      This fixes a 2.6.32 - 2.6.33-x ptrace regression.
      Reported-and-tested-by: NMichael Stefaniuc <mstefani@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NK.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Maneesh Soni <maneesh@linux.vnet.ibm.com>
      Cc: Alexandre Julliard <julliard@winehq.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
      326264a0
  21. 17 12月, 2009 1 次提交
    • S
      x86, cpuid: Add "volatile" to asm in native_cpuid() · 45a94d7c
      Suresh Siddha 提交于
      xsave_cntxt_init() does something like:
      
      	cpuid(0xd, ..);	// find out what features FP/SSE/.. etc are supported
      
      	xsetbv();	// enable the features known to OS
      
      	cpuid(0xd, ..);	// find out the size of the context for features enabled
      
      Depending on what features get enabled in xsetbv(), value of the
      cpuid.eax=0xd.ecx=0.ebx changes correspondingly (representing the
      size of the context that is enabled).
      
      As we don't have volatile keyword for native_cpuid(), gcc 4.1.2
      optimizes away the second cpuid and the kernel continues to use
      the cpuid information obtained before xsetbv(), ultimately leading to kernel
      crash on processors supporting more state than the legacy FP/SSE.
      
      Add "volatile" for native_cpuid().
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1261009542.2745.55.camel@sbs-t61.sc.intel.com>
      Cc: stable@kernel.org
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      45a94d7c
  22. 08 11月, 2009 1 次提交
    • F
      hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events · 24f1e32c
      Frederic Weisbecker 提交于
      This patch rebase the implementation of the breakpoints API on top of
      perf events instances.
      
      Each breakpoints are now perf events that handle the
      register scheduling, thread/cpu attachment, etc..
      
      The new layering is now made as follows:
      
             ptrace       kgdb      ftrace   perf syscall
                \          |          /         /
                 \         |         /         /
                                              /
                  Core breakpoint API        /
                                            /
                           |               /
                           |              /
      
                    Breakpoints perf events
      
                           |
                           |
      
                     Breakpoints PMU ---- Debug Register constraints handling
                                          (Part of core breakpoint API)
                           |
                           |
      
                   Hardware debug registers
      
      Reasons of this rewrite:
      
      - Use the centralized/optimized pmu registers scheduling,
        implying an easier arch integration
      - More powerful register handling: perf attributes (pinned/flexible
        events, exclusive/non-exclusive, tunable period, etc...)
      
      Impact:
      
      - New perf ABI: the hardware breakpoints counters
      - Ptrace breakpoints setting remains tricky and still needs some per
        thread breakpoints references.
      
      Todo (in the order):
      
      - Support breakpoints perf counter events for perf tools (ie: implement
        perf_bpcounter_event())
      - Support from perf tools
      
      Changes in v2:
      
      - Follow the perf "event " rename
      - The ptrace regression have been fixed (ptrace breakpoint perf events
        weren't released when a task ended)
      - Drop the struct hw_breakpoint and store generic fields in
        perf_event_attr.
      - Separate core and arch specific headers, drop
        asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
      - Use new generic len/type for breakpoint
      - Handle off case: when breakpoints api is not supported by an arch
      
      Changes in v3:
      
      - Fix broken CONFIG_KVM, we need to propagate the breakpoint api
        changes to kvm when we exit the guest and restore the bp registers
        to the host.
      
      Changes in v4:
      
      - Drop the hw_breakpoint_restore() stub as it is only used by KVM
      - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
        module
      - Restore the breakpoints unconditionally on kvm guest exit:
        TIF_DEBUG_THREAD doesn't anymore cover every cases of running
        breakpoints and vcpu->arch.switch_db_regs might not always be
        set when the guest used debug registers.
        (Waiting for a reliable optimization)
      
      Changes in v5:
      
      - Split-up the asm-generic/hw-breakpoint.h moving to
        linux/hw_breakpoint.h into a separate patch
      - Optimize the breakpoints restoring while switching from kvm guest
        to host. We only want to restore the state if we have active
        breakpoints to the host, otherwise we don't care about messed-up
        address registers.
      - Add asm/hw_breakpoint.h to Kbuild
      - Fix bad breakpoint type in trace_selftest.c
      
      Changes in v6:
      
      - Fix wrong header inclusion in trace.h (triggered a build
        error with CONFIG_FTRACE_SELFTEST
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jan Kiszka <jan.kiszka@web.de>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      24f1e32c
  23. 04 11月, 2009 1 次提交
    • S
      x86, fs: Fix x86 procfs stack information for threads on 64-bit · 89240ba0
      Stefani Seibold 提交于
      This patch fixes two issues in the procfs stack information on
      x86-64 linux.
      
      The 32 bit loader compat_do_execve did not store stack
      start. (this was figured out by Alexey Dobriyan).
      
      The stack information on a x64_64 kernel always shows 0 kbyte
      stack usage, because of a missing implementation of the KSTK_ESP
      macro which always returned -1.
      
      The new implementation now returns the right value.
      Signed-off-by: NStefani Seibold <stefani@seibold.net>
      Cc: Americo Wang <xiyou.wangcong@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <1257240160.4889.24.camel@wall-e>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      89240ba0
  24. 16 9月, 2009 1 次提交
  25. 15 9月, 2009 1 次提交
    • P
      x86: Add generic aperf/mperf code · 5cbc19a9
      Peter Zijlstra 提交于
      Move some of the aperf/mperf code out from the cpufreq driver
      thingy so that other people can enjoy it too.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: cpufreq@vger.kernel.org
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5cbc19a9
  26. 11 9月, 2009 1 次提交
  27. 04 9月, 2009 2 次提交
  28. 03 6月, 2009 1 次提交
  29. 19 5月, 2009 1 次提交
  30. 12 5月, 2009 1 次提交
    • H
      x86: add extension fields for bootloader type and version · 5031296c
      H. Peter Anvin 提交于
      A long ago, in days of yore, it all began with a god named Thor.
      There were vikings and boats and some plans for a Linux kernel
      header.  Unfortunately, a single 8-bit field was used for bootloader
      type and version.  This has generally worked without *too* much pain,
      but we're getting close to flat running out of ID fields.
      
      Add extension fields for both type and version.  The type will be
      extended if it the old field is 0xE; the version is a simple MSB
      extension.
      
      Keep /proc/sys/kernel/bootloader_type containing
      (type << 4) + (ver & 0xf) for backwards compatiblity, but also add
      /proc/sys/kernel/bootloader_version which contains the full version
      number.
      
      [ Impact: new feature to support more bootloaders ]
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      5031296c
  31. 11 5月, 2009 3 次提交
  32. 22 4月, 2009 1 次提交
    • D
      FRV: Fix the section attribute on UP DECLARE_PER_CPU() · 9b8de747
      David Howells 提交于
      In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
      does not agree with that specified by DEFINE_PER_CPU().  This means that
      architectures that have a small data section references relative to a base
      register may throw up linkage errors due to too great a displacement between
      where the base register points and the per-CPU variable.
      
      On FRV, the .h declaration says that the variable is in the .sdata section, but
      the .c definition says it's actually in the .data section.  The linker throws
      up the following errors:
      
      kernel/built-in.o: In function `release_task':
      kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
      kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
      
      To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
      as does DEFINE_PER_CPU().  However, this is made slightly more complex by
      virtue of the fact that there are several variants on DEFINE, so these need to
      be matched by variants on DECLARE.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9b8de747
  33. 12 4月, 2009 1 次提交