1. 11 Feb, 2011 1 commit
  2. 14 Jan, 2011 1 commit
  3. 12 Jan, 2011 7 commits
  4. 11 Jan, 2011 3 commits
  5. 10 Jan, 2011 1 commit
    • x86, lapic-timer: Increase the max_delta to 31 bits · 4aed89d6
      Pierre Tardy committed
      The latest Atom SoCs (Penwell) do not have an HPET timer.

      As their local APIC timer is clocked at 400KHZ, and the current
      code limits their Initial Counter register to 23 bits, they
      cannot sleep for more than 1.34 seconds, which leads to ~2 spurious
      wakeups per second (1 per thread).

      These SoCs support a 32-bit timer, so we change the max_delta to
      31 bits. This lets us sleep for at least 300 seconds.

      We could not find any previous chip errata where the lapic would
      only have 23-bit precision. As powertop suggests activating
      HPET to "sleep longer", this could mean the problem is already
      known.

      The problem has been present since the very first implementation of
      the lapic timer as a clock event: e9e2cdb4 [PATCH] clockevents: i386 drivers.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Pierre Tardy <pierre.tardy@intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Adrian Bunk <bunk@stusta.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Andi Kleen <ak@suse.de>
      LKML-Reference: <1294327409-19426-1-git-send-email-pierre.tardy@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
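The sleep limits quoted in the lapic-timer message above can be checked with a little arithmetic. The sketch below is not kernel code: the function names are hypothetical, and the tick rate is an assumption inferred purely from the "23 bits lasts 1.34 seconds" figure, not a restatement of the clock frequency quoted in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Longest programmable sleep, in seconds, for a down-counter that is
 * `bits` wide and decrements `hz` times per second. */
double max_sleep_seconds(unsigned bits, double hz)
{
    assert(bits >= 1 && bits <= 32 && hz > 0.0);
    uint64_t max_count = (UINT64_C(1) << bits) - 1; /* 0x7FFFFF for 23 bits */
    return (double)max_count / hz;
}

/* Assumption: the tick rate implied by a full 23-bit count taking 1.34 s. */
double implied_tick_hz(void)
{
    return (double)((UINT64_C(1) << 23) - 1) / 1.34;
}
```

Under that assumption, `max_sleep_seconds(31, implied_tick_hz())` comes out above 300 seconds, which is consistent with the "at least 300 seconds" claim in the message.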
  6. 09 Jan, 2011 2 commits
    • perf, x86: P4 PMU - Fix unflagged overflows handling · 047a3772
      Cyrill Gorcunov committed
      Don found that the P4 PMU reads the CCCR register instead of the
      counter itself (in an attempt to catch unflagged events). This makes
      the P4 NMI handler consume all NMIs it observes, so other
      NMI users such as kgdb simply have no chance to receive an NMI.

      Side note: at the moment there is no way to run nmi-watchdog
      together with the perf tool. This is because both 'perf top' and
      nmi-watchdog use the same event. So while nmi-watchdog reserves
      one event/counter for its own needs, there is no room left for the
      perf tool (there is a way to disable nmi-watchdog on boot, of course).

      Ming has tested this patch with the following results:
      
       | 1. watchdog disabled
       |
       | kgdb tests on boot OK
       | perf works OK
       |
       | 2. watchdog enabled, without patch perf-x86-p4-nmi-4
       |
       | kgdb tests on boot hang
       |
       | 3. watchdog enabled, without patch perf-x86-p4-nmi-4 and do not run kgdb
       | tests on boot
       |
       | "perf top" partially works
       |   cpu-cycles            no
       |   instructions          yes
       |   cache-references      no
       |   cache-misses          no
       |   branch-instructions   no
       |   branch-misses         yes
       |   bus-cycles            no
       |
       | 4. watchdog enabled, with patch perf-x86-p4-nmi-4 applied
       |
       | kgdb tests on boot OK
       | perf does not work, NMI "Dazed and confused" messages show up
       |
      
      This means we still have problems on P4 boxes because 'unknown'
      NMIs occur, but at least it should fix the kgdb test cases.
      Reported-by: Jason Wessel <jason.wessel@windriver.com>
      Reported-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4D275E7E.3040903@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix sparse non-ANSI function warnings in smpboot.c · 91d88ce2
      Randy Dunlap committed
      Fix sparse warning for non-ANSI function declaration:
      
        arch/x86/kernel/smpboot.c:100:30: warning: non-ANSI function declaration of function 'cpu_hotplug_driver_lock'
        arch/x86/kernel/smpboot.c:105:32: warning: non-ANSI function declaration of function 'cpu_hotplug_driver_unlock'
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <20110108195914.95d366ea.randy.dunlap@oracle.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
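The sparse warning fixed above concerns functions defined with an empty parameter list. In C standards before C23, `void f()` leaves the parameters unspecified, while `void f(void)` is a proper prototype. A minimal illustration follows; the `demo_` names and the depth counter are hypothetical and are not the smpboot.c code.

```c
#include <assert.h>

static int demo_lock_depth; /* hypothetical state, for illustration only */

/* The style sparse complains about would be `void demo_hotplug_lock()`.
 * Writing (void) states explicitly that the function takes no arguments. */
void demo_hotplug_lock(void)
{
    demo_lock_depth++;
}

void demo_hotplug_unlock(void)
{
    assert(demo_lock_depth > 0); /* unlocking without a lock is a bug */
    demo_lock_depth--;
}

int demo_hotplug_depth(void)
{
    return demo_lock_depth;
}
```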
  7. 08 Jan, 2011 1 commit
    • x86: Save rbp in pt_regs on irq entry · 625dbc3b
      Frederic Weisbecker committed
      From the x86_64 low level interrupt handlers, the frame pointer is
      saved right after the partial pt_regs frame.
      
      rbp is not supposed to be part of the partial set of registers saved
      on irq entry, but saving it only requires extending the pt_regs frame
      by 8 bytes, plus a tiny stack offset fixup on irq exit.

      This slightly changes the semantics of get_irq_entry(), which is supposed
      to provide only the value of caller saved registers and the cpu
      saved frame. However it's a win for unwinders that can walk through
      stack frames on top of get_irq_regs() snapshots.

      A noticeable impact is that it makes callchains work for the perf
      cpu-clock and task-clock events on x86_64.
      
      Let's then save rbp into the irq pt_regs.
      
      As a result with:
      
      	perf record -e cpu-clock perf bench sched messaging
      	perf report --stdio
      
      Before:
          20.94%             perf  [kernel.kallsyms]        [k] lock_acquire
                             |
                             --- lock_acquire
                                |
                                |--44.01%-- __write_nocancel
                                |
                                |--43.18%-- __read
                                |
                                |--6.08%-- fork
                                |          create_worker
                                |
                                |--0.88%-- _dl_fixup
                                |
                                |--0.65%-- do_lookup_x
                                |
                                |--0.53%-- __GI___libc_read
                                 --4.67%-- [...]
      
      After:
          19.23%         perf  [kernel.kallsyms]    [k] __lock_acquire
                         |
                         --- __lock_acquire
                            |
                            |--97.74%-- lock_acquire
                            |          |
                            |          |--21.82%-- _raw_spin_lock
                            |          |          |
                            |          |          |--37.26%-- unix_stream_recvmsg
                            |          |          |          sock_aio_read
                            |          |          |          do_sync_read
                            |          |          |          vfs_read
                            |          |          |          sys_read
                            |          |          |          system_call
                            |          |          |          __read
                            |          |          |
                            |          |          |--24.09%-- unix_stream_sendmsg
                            |          |          |          sock_aio_write
                            |          |          |          do_sync_write
                            |          |          |          vfs_write
                            |          |          |          sys_write
                            |          |          |          system_call
                            |          |          |          __write_nocancel
      
      v2: Fix cfi annotations.
      Reported-by: Soeren Sandmann Pedersen <sandmann@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Jan Beulich <JBeulich@novell.com>
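To see why a saved rbp matters for the callchains shown above, it helps to picture a frame-pointer unwinder: it needs the frame pointer from the register snapshot as a starting point, then follows the chain of saved frame pointers. The sketch below runs on a simulated chain; `struct demo_frame` and `demo_walk_frames` are hypothetical names and a simplification, not the kernel's unwinder.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified frame layout with frame pointers enabled: each frame
 * stores the caller's frame pointer next to the return address. */
struct demo_frame {
    const struct demo_frame *caller; /* saved rbp of the caller */
    unsigned long return_address;
};

/* Walk the chain starting from the frame pointer taken from a register
 * snapshot, recording at most `max` return addresses. Returns the count. */
size_t demo_walk_frames(const struct demo_frame *fp,
                        unsigned long *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL || max == 0);
    while (fp != NULL && n < max) {
        out[n++] = fp->return_address;
        fp = fp->caller;
    }
    return n;
}
```

If the snapshot carries no usable frame pointer (modelled here as `NULL`), the walk yields nothing beyond the interrupted function, which matches the flat "Before" output.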
  8. 07 Jan, 2011 8 commits
  9. 05 Jan, 2011 4 commits
    • x86, NMI: Add touch_nmi_watchdog to io_check_error delay · 74d91e3c
      Huang Ying committed
      Prevent the long delay in io_check_error from triggering an NMI
      watchdog timeout.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      LKML-Reference: <1294198689-15447-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Avoid calling arch_trigger_all_cpu_backtrace() at the same time · 554ec063
      Dongdong Deng committed
      The spin_lock_debug/rcu_cpu_stall detector uses
      trigger_all_cpu_backtrace() to dump cpu backtraces.
      Therefore it is possible that trigger_all_cpu_backtrace()
      could be called at the same time on different CPUs, which
      triggers an 'unknown reason NMI' warning. The following case
      illustrates the problem:
      
            CPU1                    CPU2                     ...   CPU N
                             trigger_all_cpu_backtrace()
                             set "backtrace_mask" to cpu mask
                                     |
      generate NMI interrupts  generate NMI interrupts       ...
          \                          |                               /
           \                         |                              /
      
      The "backtrace_mask" will be cleared by the first NMI interrupt
      at nmi_watchdog_tick(), then the following NMI interrupts
      generated by the other CPUs' arch_trigger_all_cpu_backtrace() will
      be taken as unknown reason NMI interrupts.

      This patch uses a test_and_set to avoid the problem: it makes
      arch_trigger_all_cpu_backtrace() return early, so that a second
      cpu backtrace is not dumped while a
      trigger_all_cpu_backtrace() is already in progress.
      Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
      Reviewed-by: Bruce Ashfield <bruce.ashfield@windriver.com>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1294198689-15447-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
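The test-and-set guard described in the message above can be sketched in portable C11. This is a userspace analogue under stated assumptions: it uses `<stdatomic.h>` rather than the kernel's bit operations, and the `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool demo_backtrace_busy; /* zero-initialized: not busy */

/* Atomically set the flag and look at its previous value. Only the
 * caller that saw "not busy" may go on to trigger the dump. */
bool demo_backtrace_begin(void)
{
    return !atomic_exchange(&demo_backtrace_busy, true);
}

/* Release the guard once the dump is finished. */
void demo_backtrace_end(void)
{
    assert(atomic_load(&demo_backtrace_busy)); /* must pair with begin */
    atomic_store(&demo_backtrace_busy, false);
}
```

A caller would return early when `demo_backtrace_begin()` reports false, so concurrent requests collapse into the one dump already in progress.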
    • x86: Only call smp_processor_id in non-preempt cases · 9ab181fa
      Don Zickus committed
      There are some paths that walk the die_chain with preemption on.
      Make sure we are in an NMI call before we start doing anything.
      
      This was triggered by do_general_protection calling notify_die
      with DIE_GPF.
      Reported-by: Jan Kiszka <jan.kiszka@web.de>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      LKML-Reference: <1294198689-15447-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion · cb2ded37
      Yinghai Lu committed
      On one system with x2apic pre-enabled, x2apic_mode suddenly got
      corrupted after some cpus were registered, when the kernel was
      compiled with CONFIG_NR_CPUS=255 instead of 512.

      It turns out that generic_processor_info() ==> phyid_set(apicid,
      phys_cpu_present_map) causes the problem.

      phys_cpu_present_map is sized by MAX_APICS bits, and on the pre-enabled
      system some cpus have an apic id > 255.
      
      The variable after phys_cpu_present_map may get corrupted
      silently:
      
       ffffffff828e8420 B phys_cpu_present_map
       ffffffff828e8440 B apic_verbosity
       ffffffff828e8444 B local_apic_timer_c2_ok
       ffffffff828e8448 B disable_apic
       ffffffff828e844c B x2apic_mode
       ffffffff828e8450 B x2apic_disabled
       ffffffff828e8454 B num_processors
       ...
      
      Actually phys_cpu_present_map is referenced via apic id, instead
      of index. We should use MAX_LOCAL_APIC instead of MAX_APICS.

      For 64-bit it will be 32768 in all cases. BSS will increase by 4k bytes
      on 64-bit:
      
      	text		data		bss		dec		filename
      	21696943	4193748		12787712	38678403	vmlinux.before
      	21696943	4193748		12791808	38682499	vmlinux.after
      
      No change on 32-bit.

      Finally we can remove MAX_APICS, which was rather confusing.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      LKML-Reference: <4D23BD9C.3070102@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
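The sizing problem described above is easy to reproduce in miniature. In the symbol listing, phys_cpu_present_map occupies 0x20 bytes (256 bits) before apic_verbosity, so setting the bit for an apic id above 255 writes into the variables that follow. The sketch below shows a bitmap sized by the 64-bit MAX_LOCAL_APIC value quoted in the message, with a bounds check added for illustration. The `demo_` names are hypothetical and this is not the kernel's physid mask code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_MAX_LOCAL_APIC 32768u /* 64-bit value quoted in the message */

static unsigned char demo_present_map[DEMO_MAX_LOCAL_APIC / CHAR_BIT];

/* Bytes needed to hold `nbits` bits. */
size_t demo_bitmap_bytes(size_t nbits)
{
    return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}

/* Mark an APIC ID as present. IDs the map cannot hold are rejected
 * instead of being written past the end of the array.
 * Returns 0 on success, -1 if the ID is out of range. */
int demo_mark_present(unsigned int apicid)
{
    if (apicid >= DEMO_MAX_LOCAL_APIC)
        return -1;
    demo_present_map[apicid / CHAR_BIT] |=
        (unsigned char)(1u << (apicid % CHAR_BIT));
    return 0;
}

int demo_is_present(unsigned int apicid)
{
    assert(apicid < DEMO_MAX_LOCAL_APIC);
    return (demo_present_map[apicid / CHAR_BIT] >> (apicid % CHAR_BIT)) & 1;
}
```

With 8-bit bytes, `demo_bitmap_bytes(32768)` is 4096, in line with the roughly 4k BSS growth reported in the size table.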
  10. 04 Jan, 2011 3 commits
    • x86, mm: Initialize initial_page_table before paravirt jumps · d50d8fe1
      Rusty Russell committed
      v2.6.36-rc8-54-gb40827fa (x86-32, mm: Add an initial page table
      for core bootstrapping) made x86 boot using initial_page_table
      and broke lguest.
      
      For 2.6.37 we simply cut & pasted the initialization code into
      lguest (da32dac1 "lguest: populate initial_page_table"). Now
      we fix it properly by doing that initialization before the
      paravirt jump.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: lguest <lguest@ozlabs.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <201101041720.54535.rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Clean up power events by introducing new, more generic ones · 25e41933
      Thomas Renninger committed
      Add these new power trace events:
      
       power:cpu_idle
       power:cpu_frequency
       power:machine_suspend
      
      The old C-state/idle accounting events:

        power:power_start
        power:power_end

      now have a replacement (but we are still keeping the old
      tracepoints for compatibility):

        power:cpu_idle

      and

        power:power_frequency

      is replaced with:

        power:cpu_frequency

      power:machine_suspend is newly introduced.
      
      Jean Pihet has a patch integrated into the generic layer
      (kernel/power/suspend.c) which will make use of it.
      
      The type= field was removed from both; it was never
      used, and the type is distinguished by the event type itself.

      The perf timechart userspace tool gets adjusted in a separate patch.
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: rjw@sisk.pl
      LKML-Reference: <1294073445-14812-3-git-send-email-trenn@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>
    • x86, hwmon: Add core threshold notification to therm_throt.c · 9e76a97e
      R, Durgadoss committed
      This patch adds code to therm_throt.c to notify core thermal threshold
      events. These thresholds are supported by the IA32_THERM_INTERRUPT register.
      The status/log for the same is monitored using the IA32_THERM_STATUS register.
      The necessary #defines are in msr-index.h. A callback is added to mce.h to
      further notify the thermal stack about the threshold events.
      Signed-off-by: Durgadoss R <durgadoss.r@intel.com>
      LKML-Reference: <D6D887BA8C9DFF48B5233887EF04654105C1251710@bgsmsx502.gar.corp.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  11. 30 Dec, 2010 3 commits
  12. 27 Dec, 2010 1 commit
    • x86/microcode: Fix double vfree() and remove redundant pointer checks before vfree() · 5cdd2de0
      Jesper Juhl committed
      In arch/x86/kernel/microcode_intel.c::generic_load_microcode()
      we have this:
      
      	while (leftover) {
      		...
      		if (get_ucode_data(mc, ucode_ptr, mc_size) ||
      		    microcode_sanity_check(mc) < 0) {
      			vfree(mc);
      			break;
      		}
      		...
      	}
      
      	if (mc)
      		vfree(mc);
      
      This will cause a double free of 'mc'. This patch fixes that by
      just removing the vfree() call in the loop since 'mc' will be
      freed nicely just after we break out of the loop.

      There's also a second change in the patch. I noticed a lot of
      checks for pointers being NULL before passing them to vfree().
      That's completely redundant since vfree() deals gracefully with
      being passed a NULL pointer. Removing the redundant checks
      yields a nice size decrease for the object file.
      
      Size before the patch:
         text    data     bss     dec     hex filename
         4578     240    1032    5850    16da arch/x86/kernel/microcode_intel.o
      Size after the patch:
         text    data     bss     dec     hex filename
         4489     240     984    5713    1651 arch/x86/kernel/microcode_intel.o
      Signed-off-by: Jesper Juhl <jj@chaosbits.net>
      Acked-by: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
      Cc: Shaohua Li <shaohua.li@intel.com>
      LKML-Reference: <alpine.LNX.2.00.1012251946100.10759@swampdragon.chaosbits.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
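The shape of the double-free fix above can be shown with a userspace analogue. This sketch uses `malloc`/`free` in place of the kernel allocator and relies on `free(NULL)` being a no-op, just as the message says of `vfree()`. All `demo_` names, the 16-byte buffers, and the allocation counter are hypothetical and exist only so the behaviour can be checked.

```c
#include <assert.h>
#include <stdlib.h>

static int demo_live; /* outstanding allocations, tracked for the demo */

void *demo_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        demo_live++;
    return p;
}

void demo_free(void *p)
{
    if (p) { /* this check only serves the counter; free(NULL) is safe */
        assert(demo_live > 0);
        demo_live--;
    }
    free(p);
}

/* Process `n` chunks; valid[i] == 0 simulates a failed sanity check.
 * Returns the last good buffer (owned by the caller) or NULL. */
void *demo_load(const int *valid, size_t n)
{
    void *best = NULL, *cur = NULL;

    for (size_t i = 0; i < n; i++) {
        cur = demo_alloc(16);
        if (!cur || !valid[i])
            break;       /* no free here: cur is released once below */
        demo_free(best); /* drop the previous candidate, NULL at first */
        best = cur;
        cur = NULL;      /* ownership moved to best */
    }
    demo_free(cur);      /* single cleanup point, fine when cur is NULL */
    return best;
}

int demo_live_count(void)
{
    return demo_live;
}
```

Keeping one cleanup point after the loop is the design choice the commit makes: the error path only breaks out, and the unconditional free afterwards handles both the failure case and the `NULL` case.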
  13. 24 Dec, 2010 2 commits
    • x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation · d3bd0588
      Yinghai Lu committed
      Recent Intel systems use a different order in the MADT: they list all
      thread0 entries first, then all thread1 entries.
      But the SRAT table still uses the old order: it lists the cpus of one socket all together.

      If the user has compiled with a limited NR_CPUS or boots with nr_cpus=, some
      cpus' apic id to node mappings may not have been put into apicid_to_node[].

      For example, a 4-socket system with 64 cpus booted with nr_cpus=32 will crash:
      
      [    9.106288] Total of 32 processors activated (136190.88 BogoMIPS).
      [    9.235021] divide error: 0000 [#1] SMP
      [    9.235315] last sysfs file:
      [    9.235481] CPU 1
      [    9.235592] Modules linked in:
      [    9.245398]
      [    9.245478] Pid: 2, comm: kthreadd Not tainted 2.6.37-rc1-tip-yh-01782-ge92ef79-dirty #274      /Sun Fire x4800
      [    9.265415] RIP: 0010:[<ffffffff81075a8f>]  [<ffffffff81075a8f>] select_task_rq_fair+0x4f0/0x623
      ...
      [    9.645938] RIP  [<ffffffff81075a8f>] select_task_rq_fair+0x4f0/0x623
      [    9.665356]  RSP <ffff88103f8d1c40>
      [    9.665568] ---[ end trace 2296156d35fdfc87 ]---
      
      So let's just parse all cpu entries in SRAT.

      Also add apicid checking against MAX_LOCAL_APIC, in case we would go out of the bounds of
      apicid_to_node[].

      It fixes the following bug too:
      https://bugzilla.kernel.org/show_bug.cgi?id=22662
      
      -v2: expand to 32bit according to hpa
         need to add MAX_LOCAL_APIC for 32bit
      Reported-and-Tested-by: Wu Fengguang <fengguang.wu@intel.com>
      Reported-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Tested-by: Myron Stowe <myron.stowe@hp.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4D0AD486.9020704@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86, acpi: Add MAX_LOCAL_APIC for 32bit · 56d91f13
      Yinghai Lu committed
      We should use MAX_LOCAL_APIC for max apic ids and MAX_APICS as the number
      of local apics.

      Also the apic_version[] array should use MAX_LOCAL_APIC.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4D0AD464.2020408@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  14. 23 Dec, 2010 1 commit
    • x86, nmi_watchdog: Remove ARCH_HAS_NMI_WATCHDOG and rely on CONFIG_HARDLOCKUP_DETECTOR · 4a7863cc
      Don Zickus committed
      The x86 arch has shifted its use of the nmi_watchdog from a
      local implementation to the global one provided by
      kernel/watchdog.c.  This shift has caused a whole bunch of
      compile problems under different config options.  I attempt to
      simplify things with the patch below.
      
      In order to simplify things, I had to come to terms with the
      meaning of two terms ARCH_HAS_NMI_WATCHDOG and
      CONFIG_HARDLOCKUP_DETECTOR.  Basically they mean the same thing,
      the former on a local level and the latter on a global level.
      
      With the old x86 nmi watchdog gone, there is no need to rely on
      defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't
      make sense any more.  x86 will now use the global
      implementation.
      
      The changes below do a few things.  First it changes the few
      places that relied on ARCH_HAS_NMI_WATCHDOG to use
      CONFIG_X86_LOCAL_APIC (the former was an alias for the latter
      anyway, so nothing unusual here).  Those pieces of code were
      relying more on local apic functionality than on nmi watchdog
      functionality, so the change should make sense.

      Second, I removed the x86 implementation of
      touch_nmi_watchdog().  It isn't needed now; instead x86 will rely
      on kernel/watchdog.c's implementation.

      Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from
      x86, and tweaked the include/linux/nmi.h file to tell users to
      look for an externally defined touch_nmi_watchdog in the case of
      ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR. This
      change removes some of the ugliness in that file.
      
      Finally, I added a Kconfig dependency for
      CONFIG_HARDLOCKUP_DETECTOR that said you can't have
      ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR.  You can
      only have one nmi_watchdog.
      
      Tested with
      ARCH=i386: allnoconfig, defconfig, allyesconfig, (various broken configs)
      ARCH=x86_64: allnoconfig, defconfig, allyesconfig, (various broken configs)
      
      Hopefully, after this patch I won't get any more compile broken
      emails. :-)
      
      v3:
        changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function
        prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1293044403-14117-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 22 Dec, 2010 2 commits