1. 10 12月, 2009 2 次提交
  2. 08 11月, 2009 1 次提交
    • F
      hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events · 24f1e32c
      Frederic Weisbecker 提交于
      This patch rebase the implementation of the breakpoints API on top of
      perf events instances.
      
      Each breakpoints are now perf events that handle the
      register scheduling, thread/cpu attachment, etc..
      
      The new layering is now made as follows:
      
             ptrace       kgdb      ftrace   perf syscall
                \          |          /         /
                 \         |         /         /
                                              /
                  Core breakpoint API        /
                                            /
                           |               /
                           |              /
      
                    Breakpoints perf events
      
                           |
                           |
      
                     Breakpoints PMU ---- Debug Register constraints handling
                                          (Part of core breakpoint API)
                           |
                           |
      
                   Hardware debug registers
      
      Reasons of this rewrite:
      
      - Use the centralized/optimized pmu registers scheduling,
        implying an easier arch integration
      - More powerful register handling: perf attributes (pinned/flexible
        events, exclusive/non-exclusive, tunable period, etc...)
      
      Impact:
      
      - New perf ABI: the hardware breakpoints counters
      - Ptrace breakpoints setting remains tricky and still needs some per
        thread breakpoints references.
      
      Todo (in the order):
      
      - Support breakpoints perf counter events for perf tools (ie: implement
        perf_bpcounter_event())
      - Support from perf tools
      
      Changes in v2:
      
      - Follow the perf "event " rename
      - The ptrace regression have been fixed (ptrace breakpoint perf events
        weren't released when a task ended)
      - Drop the struct hw_breakpoint and store generic fields in
        perf_event_attr.
      - Separate core and arch specific headers, drop
        asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
      - Use new generic len/type for breakpoint
      - Handle off case: when breakpoints api is not supported by an arch
      
      Changes in v3:
      
      - Fix broken CONFIG_KVM, we need to propagate the breakpoint api
        changes to kvm when we exit the guest and restore the bp registers
        to the host.
      
      Changes in v4:
      
      - Drop the hw_breakpoint_restore() stub as it is only used by KVM
      - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
        module
      - Restore the breakpoints unconditionally on kvm guest exit:
        TIF_DEBUG_THREAD doesn't anymore cover every cases of running
        breakpoints and vcpu->arch.switch_db_regs might not always be
        set when the guest used debug registers.
        (Waiting for a reliable optimization)
      
      Changes in v5:
      
      - Split-up the asm-generic/hw-breakpoint.h moving to
        linux/hw_breakpoint.h into a separate patch
      - Optimize the breakpoints restoring while switching from kvm guest
        to host. We only want to restore the state if we have active
        breakpoints to the host, otherwise we don't care about messed-up
        address registers.
      - Add asm/hw_breakpoint.h to Kbuild
      - Fix bad breakpoint type in trace_selftest.c
      
      Changes in v6:
      
      - Fix wrong header inclusion in trace.h (triggered a build
        error with CONFIG_FTRACE_SELFTEST
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jan Kiszka <jan.kiszka@web.de>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      24f1e32c
  3. 04 11月, 2009 1 次提交
    • P
      x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread(). · 41a48d14
      Paul Mundt 提交于
      flush_thread() tries to do a TIF_DEBUG check before calling in to
      flush_thread_hw_breakpoint() (which subsequently clears the thread flag),
      but for some reason, the x86 code is manually clearing TIF_DEBUG
      immediately before the test, so this path will never be taken.
      
      This kills off the erroneous clear_tsk_thread_flag() and lets
      flush_thread_hw_breakpoint() actually get invoked.
      
      Presumably folks were getting lucky with testing and the
      free_thread_info() -> free_thread_xstate() path was taking care of the
      flush there.
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      Acked-by: N"K.Prasad" <prasad@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      LKML-Reference: <20091005102306.GA7889@linux-sh.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      41a48d14
  4. 02 10月, 2009 1 次提交
    • A
      core, x86: Add user return notifiers · 7c68af6e
      Avi Kivity 提交于
      Add a general per-cpu notifier that is called whenever the kernel is
      about to return to userspace.  The notifier uses a thread_info flag
      and existing checks, so there is no impact on user return or context
      switch fast paths.
      
      This will be used initially to speed up KVM task switching by lazily
      updating MSRs.
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      LKML-Reference: <1253342422-13811-1-git-send-email-avi@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      7c68af6e
  5. 24 9月, 2009 1 次提交
  6. 20 9月, 2009 1 次提交
    • A
      tracing, x86, cpuidle: Move the end point of a C state in the power tracer · 288f023e
      Arjan van de Ven 提交于
      The "end of a C state" trace point currently happens before
      the code runs that corrects the TSC for having stopped during idle.
      
      The result of this is that the timestamp of the end-of-C-state event
      is garbage on cpus where the TSC stops during idle.
      
      This patch moves the end point of the C state to after the timekeeping
      engine of the kernel has been corrected.
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: fweisbec@gmail.com
      Cc: peterz@infradead.org
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090919133533.139c2a46@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      288f023e
  7. 19 9月, 2009 1 次提交
  8. 20 8月, 2009 1 次提交
    • S
      clockevent: Prevent dead lock on clockevents_lock · f833bab8
      Suresh Siddha 提交于
      Currently clockevents_notify() is called with interrupts enabled at
      some places and interrupts disabled at some other places.
      
      This results in a deadlock in this scenario.
      
      cpu A holds clockevents_lock in clockevents_notify() with irqs enabled
      cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled
      cpu C doing set_mtrr() which will try to rendezvous of all the cpus.
      
      This will result in C and A come to the rendezvous point and waiting
      for B. B is stuck forever waiting for the spinlock and thus not
      reaching the rendezvous point.
      
      Fix the clockevents code so that clockevents_lock is taken with
      interrupts disabled and thus avoid the above deadlock.
      
      Also call lapic_timer_propagate_broadcast() on the destination cpu so
      that we avoid calling smp_call_function() in the clockevents notifier
      chain.
      
      This issue left us wondering if we need to change the MTRR rendezvous
      logic to use stop machine logic (instead of smp_call_function) or add
      a check in spinlock debug code to see if there are other spinlocks
      which gets taken under both interrupts enabled/disabled conditions.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com>
      Cc: "Brown Len" <len.brown@intel.com>
      LKML-Reference: <1250544899.2709.210.camel@sbs-t61.sc.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      f833bab8
  9. 15 6月, 2009 1 次提交
    • V
      kmemcheck: add mm functions · 2dff4405
      Vegard Nossum 提交于
      With kmemcheck enabled, the slab allocator needs to do this:
      
      1. Tell kmemcheck to allocate the shadow memory which stores the status of
         each byte in the allocation proper, e.g. whether it is initialized or
         uninitialized.
      2. Tell kmemcheck which parts of memory that should be marked uninitialized.
         There are actually a few more states, such as "not yet allocated" and
         "recently freed".
      
      If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
      memory that can take page faults because of kmemcheck.
      
      If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
      request memory with the __GFP_NOTRACK flag. This does not prevent the page
      faults from occuring, however, but marks the object in question as being
      initialized so that no warnings will ever be produced for this object.
      
      In addition to (and in contrast to) __GFP_NOTRACK, the
      __GFP_NOTRACK_FALSE_POSITIVE flag indicates that the allocation should
      not be tracked _because_ it would produce a false positive. Their values
      are identical, but need not be so in the future (for example, we could now
      enable/disable false positives with a config option).
      
      Parts of this patch were contributed by Pekka Enberg but merged for
      atomicity.
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      
      [rebased for mainline inclusion]
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      2dff4405
  10. 03 6月, 2009 2 次提交
  11. 12 5月, 2009 1 次提交
  12. 12 4月, 2009 1 次提交
    • J
      x86: clean up declarations and variables · 2c1b284e
      Jaswinder Singh Rajput 提交于
      Impact: cleanup, no code changed
      
       - syscalls.h       update declarations due to unifications
       - irq.c            declare smp_generic_interrupt() before it gets used
       - process.c        declare sys_fork() and sys_vfork() before they get used
       - tsc.c            rename tsc_khz shadowed variable
       - apic/probe_32.c  declare apic_default before it gets used
       - apic/nmi.c       prev_nmi_count should be unsigned
       - apic/io_apic.c   declare smp_irq_move_cleanup_interrupt() before it gets used
       - mm/init.c        declare direct_gbpages and free_initrd_mem before they get used
      Signed-off-by: NJaswinder Singh Rajput <jaswinder@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2c1b284e
  13. 07 4月, 2009 1 次提交
    • M
      x86, ds: add leakage warning · 2311f0de
      Markus Metzger 提交于
      Add a warning in case a debug store context is not removed before
      the task it is attached to is freed.
      
      Remove the old warning at thread exit. It is too early.
      
      Declare the debug store context field in thread_struct unconditionally.
      
      Remove ds_copy_thread() and ds_exit_thread() and do the work directly
      in process*.c.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Cc: roland@redhat.com
      Cc: eranian@googlemail.com
      Cc: oleg@redhat.com
      Cc: juan.villacis@intel.com
      Cc: ak@linux.jf.intel.com
      LKML-Reference: <20090403144601.254472000@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2311f0de
  14. 18 3月, 2009 1 次提交
    • R
      cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash · 30e1e6d1
      Rusty Russell 提交于
      Impact: Fix cpu offline when CONFIG_MAXSMP=y
      
      Changeset bc9b83dd "cpumask: convert
      c1e_mask in arch/x86/kernel/process.c to cpumask_var_t" contained a
      bug: c1e_mask is manipulated even if C1E isn't detected (and hence
      not allocated).
      
      This is simply fixed by checking for NULL (which gcc optimizes out
      anyway of CONFIG_CPUMASK_OFFSTACK=n, since it knows ce1_mask can never
      be NULL).
      
      In addition, fix a leak where select_idle_routine re-allocates
      (and re-clears) c1e_mask on every cpu init.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Mike Travis <travis@sgi.com>
      LKML-Reference: <200903171450.34549.rusty@rustcorp.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      30e1e6d1
  15. 16 3月, 2009 1 次提交
  16. 13 3月, 2009 2 次提交
  17. 02 3月, 2009 1 次提交
  18. 13 2月, 2009 1 次提交
  19. 09 2月, 2009 2 次提交
  20. 29 1月, 2009 1 次提交
    • I
      x86: replace CONFIG_X86_SMP with CONFIG_SMP · 3e5095d1
      Ingo Molnar 提交于
      The x86/Voyager subarch used to have this distinction between
       'x86 SMP support' and 'Voyager SMP support':
      
       config X86_SMP
      	bool
      	depends on SMP && ((X86_32 && !X86_VOYAGER) || X86_64)
      
      This is a pointless distinction - Voyager can (and already does) use
      smp_ops to implement various SMP quirks it has - and it can be extended
      more to cover all the specialities of Voyager.
      
      So remove this complication in the Kconfig space.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3e5095d1
  21. 19 12月, 2008 1 次提交
  22. 17 12月, 2008 1 次提交
    • V
      x86: support always running TSC on Intel CPUs · 40fb1715
      Venki Pallipadi 提交于
      Impact: reward non-stop TSCs with good TSC-based clocksources, etc.
      
      Add support for CPUID_0x80000007_Bit8 on Intel CPUs as well. This bit means
      that the TSC is invariant with C/P/T states and always runs at constant
      frequency.
      
      With Intel CPUs, we have 3 classes
      * CPUs where TSC runs at constant rate and does not stop n C-states
      * CPUs where TSC runs at constant rate, but will stop in deep C-states
      * CPUs where TSC rate will vary based on P/T-states and TSC will stop in deep
        C-states.
      
      To cover these 3, one feature bit (CONSTANT_TSC) is not enough. So, add a
      second bit (NONSTOP_TSC). CONSTANT_TSC indicates that the TSC runs at
      constant frequency irrespective of P/T-states, and NONSTOP_TSC indicates
      that TSC does not stop in deep C-states.
      
      CPUID_0x8000000_Bit8 indicates both these feature bit can be set.
      We still have CONSTANT_TSC _set_ and NONSTOP_TSC _not_set_ on some older Intel
      CPUs, based on model checks. We can use TSC on such CPUs for time, as long as
      those CPUs do not support/enter deep C-states.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      40fb1715
  23. 26 11月, 2008 1 次提交
    • A
      tracing: add "power-tracer": C/P state tracer to help power optimization · f3f47a67
      Arjan van de Ven 提交于
      Impact: new "power-tracer" ftrace plugin
      
      This patch adds a C/P-state ftrace plugin that will generate
      detailed statistics about the C/P-states that are being used,
      so that we can look at detailed decisions that the C/P-state
      code is making, rather than the too high level "average"
      that we have today.
      
      An example way of using this is:
      
       mount -t debugfs none /sys/kernel/debug
       echo cstate > /sys/kernel/debug/tracing/current_tracer
       echo 1 > /sys/kernel/debug/tracing/tracing_enabled
       sleep 1
       echo 0 > /sys/kernel/debug/tracing/tracing_enabled
       cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f3f47a67
  24. 11 11月, 2008 1 次提交
    • I
      x86: call machine_shutdown and stop all CPUs in native_machine_halt · d3ec5cae
      Ivan Vecera 提交于
      Impact: really halt all CPUs on halt
      
      Function machine_halt (resp. native_machine_halt) is empty for x86
      architectures. When command 'halt -f' is invoked, the message "System
      halted." is displayed but this is not really true because all CPUs are
      still running.
      
      There are also similar inconsistencies for other arches (some uses
      power-off for halt or forever-loop with IRQs enabled/disabled).
      
      IMO there should be used the same approach for all architectures OR
      what does the message "System halted" really mean?
      
      This patch fixes it for x86.
      Signed-off-by: NIvan Vecera <ivecera@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d3ec5cae
  25. 23 9月, 2008 3 次提交
  26. 08 9月, 2008 1 次提交
  27. 28 8月, 2008 1 次提交
    • J
      x86: make poll_idle behave more like the other idle methods · 2c7e9fd4
      Joe Korty 提交于
      Make poll_idle() behave more like the other idle methods.
      
      Currently, poll_idle() returns immediately.  The other
      idle methods all wait indefinately for some condition
      to come true before returning.  poll_idle should emulate
      these other methods and also wait for a return condition,
      in this case, for need_resched() to become 'true'.
      
      Without this delay the idle loop spends all of its time
      in the outer loop that calls poll_idle.  This outer loop,
      these days, does real work, some of it under rcu locks.
      That work should only be done when idle is entered and
      when idle exits, not continuously while idle is spinning.
      Signed-off-by: NJoe Korty <joe.korty@ccur.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2c7e9fd4
  28. 19 7月, 2008 2 次提交
  29. 17 7月, 2008 2 次提交
  30. 09 7月, 2008 1 次提交
  31. 08 7月, 2008 2 次提交
    • T
      x86: add C1E aware idle function, fix · 0beefa20
      Thomas Gleixner 提交于
      On Tue, 17 Jun 2008, Rafael J. Wysocki wrote:
      >
      > BTW, with the C1E patches reverted I don't get the
      > WARNING: at /home/rafael/src/linux-next/kernel/smp.c:215 smp_call_function_single+0x3d/0xa2
      > in the log.  Thomas?
      
      The BROADCAST_FORCE notification uses smp_function_call and therefor
      must be run with interrupts enabled.
      
      While at it, add a comment for the BROADCAST_EXIT notifier as well.
      Reported-and-bisected-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0beefa20
    • T
      x86, clockevents: add C1E aware idle function · aa276e1c
      Thomas Gleixner 提交于
      C1E on AMD machines is like C3 but without control from the OS. Up to
      now we disabled the local apic timer for those machines as it stops
      when the CPU goes into C1E. This excludes those machines from high
      resolution timers / dynamic ticks, which hurts especially X2 based
      laptops.
      
      The current boot time C1E detection has another, more serious flaw
      as well: some BIOSes do not enable C1E until the ACPI processor module
      is loaded. This causes systems to stop working after that point.
      
      To work nicely with C1E enabled machines we use a separate idle
      function, which checks on idle entry whether C1E was enabled in the
      Interrupt Pending Message MSR. This allows us to do timer broadcasting
      for C1E and covers the late enablement of C1E as well.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      aa276e1c