1. 16 9月, 2010 1 次提交
  2. 06 8月, 2010 1 次提交
    • E
      x86, apic: Map the local apic when parsing the MP table. · 5989cd6a
      Eric W. Biederman 提交于
      This fixes a regression in 2.6.35 from 2.6.34, that is
      present for select models of Intel cpus when people are
      using an MP table.
      
      The commit cf7500c0
      "x86, ioapic: In mpparse use mp_register_ioapic" started
      calling mp_register_ioapic from MP_ioapic_info.  An extremely
      simple change that was obviously correct.  Unfortunately
      mp_register_ioapic did just a little more than the previous
      hand crafted code and so we gained this call path.
      
      The problem call path is:
      MP_ioapic_info()
        mp_register_ioapic()
         io_apic_unique_id()
           io_apic_get_unique_id()
             get_physical_broadcast()
               modern_apic()
                 lapic_get_version()
                   apic_read(APIC_LVR)
      
      Which turned out to be a problem because the local apic
      was not mapped, at that point, unlike the similar point
      in the ACPI parsing code.
      
      This problem is fixed by mapping the local apic when
      parsing the mptable as soon as we reasonably can.
      
      Looking at the number of places we setup the fixmap for
      the local apic, I see some serious simplification opportunities.
      For the moment except for not duplicating the setting up of the
      fixmap in init_apic_mappings, I have not acted on them.
      
      The regression from 2.6.34 is tracked in bug
      https://bugzilla.kernel.org/show_bug.cgi?id=16173
      
      Cc: <stable@kernel.org> 2.6.35
      Reported-by: NDavid Hill <hilld@binarystorm.net>
      Reported-by: NTvrtko Ursulin <tvrtko.ursulin@sophos.com>
      Tested-by: NTvrtko Ursulin <tvrtko.ursulin@sophos.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <m1eiee86jg.fsf_-_@fess.ebiederm.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      5989cd6a
  3. 17 7月, 2010 1 次提交
    • Y
      x86: Fix x2apic preenabled system with kexec · fd19dce7
      Yinghai Lu 提交于
      Found one x2apic system kexec loop test failed
      when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip)
      
      first kernel can kexec second kernel, but second kernel can not kexec third one.
      
      it can be duplicated on another system with BIOS preenabled x2apic.
      First kernel can not kexec second kernel.
      
      It turns out, when kernel boot with pre-enabled x2apic, it will not execute
      disable_local_APIC on shutdown path.
      
      when init_apic_mappings() is called in setup_arch, it will skip setting of
      apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic())
      Then later, disable_local_APIC() will bail out early because !apic_phys.
      
      So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys.
      
      another solution could be updating init_apic_mappings() to set apic_phys even
      for preenabled x2apic system. Actually even for x2apic system, that lapic
      address is mapped already in early stage.
      
      BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4C3EB22B.3000701@kernel.org>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: stable@kernel.org
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      fd19dce7
  4. 17 6月, 2010 1 次提交
  5. 25 5月, 2010 1 次提交
    • K
      x86, apic: ack all pending irqs when crashed/on kexec · 8c3ba8d0
      Kerstin Jonsson 提交于
      When the SMP kernel decides to crash_kexec() the local APICs may have
      pending interrupts in their vector tables.
      
      The setup routine for the local APIC has a deficient mechanism for
      clearing these interrupts, it only handles interrupts that has already
      been dispatched to the local core for servicing (the ISR register) safely,
      it doesn't consider lower prioritized queued interrupts stored in the IRR
      register.
      
      If you have more than one pending interrupt within the same 32 bit word in
      the LAPIC vector table registers you may find yourself entering the IO
      APIC setup with pending interrupts left in the LAPIC.  This is a situation
      for wich the IO APIC setup is not prepared.  Depending of what/which
      interrupt vector/vectors are stuck in the APIC tables your system may show
      various degrees of malfunctioning.  That was the reason why the
      check_timer() failed in our system, the timer interrupts was blocked by
      pending interrupts from the old kernel when routed trough the IO APIC.
      
      Additional comment from Jiri Bohac:
      ==============
      If this should go into stable release,
      I'd add some kind of limit on the number of iterations, just to be safe from
      hard to debug lock-ups:
      
      +if (loops++  > MAX_LOOPS) {
      +        printk("LAPIC pending clean-up")
      +        break;
      +}
       while (queued);
      
      with MAX_LOOPS something like 1E9 this would leave plenty of time for the
      pending IRQs to be cleared and would and still cause at most a second of delay
      if the loop were to lock-up for whatever reason.
      
      [trenn@suse.de:
      
      V2: Use tsc if avail to bail out after 1 sec due to possible virtual
          apic_read calls which may take rather long (suggested by: Avi Kivity
          <avi@redhat.com>) If no tsc is available bail out quickly after
          cpu_khz, if we broke out too early and still have irqs pending (which
          should never happen?) we still get a WARN_ON...
      
      V3: - Fixed indentation -> checkpatch clean
          - max_loops must be signed
      
      V4: - Fix typo, mixed up tsc and ntsc in first rdtscll() call
      
      V5: Adjust WARN_ON() condition to also catch error in cpu_has_tsc case]
      
      Cc: <jbohac@novell.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Tested-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NThomas Renninger <trenn@suse.de>
      LKML-Reference: <201005241913.o4OJDGWM010865@imap1.linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      8c3ba8d0
  6. 03 4月, 2010 1 次提交
    • S
      x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards · 472a474c
      Suresh Siddha 提交于
      Jan Grossmann reported kernel boot panic while booting SMP
      kernel on his system with a single core cpu. SMP kernels call
      enable_IR_x2apic() from native_smp_prepare_cpus() and on
      platforms where the kernel doesn't find SMP configuration we
      ended up again calling enable_IR_x2apic() from the
      APIC_init_uniprocessor() call in the smp_sanity_check(). Thus
      leading to kernel panic.
      
      Don't call enable_IR_x2apic() and default_setup_apic_routing()
      from APIC_init_uniprocessor() in CONFIG_SMP case.
      
      NOTE: this kind of non-idempotent and assymetric initialization
      sequence is rather fragile and unclean, we'll clean that up
      in v2.6.35. This is the minimal fix for v2.6.34.
      
      Reported-by: Jan.Grossmann@kielnet.net
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: <jbarnes@virtuousgeek.org>
      Cc: <david.woodhouse@intel.com>
      Cc: <weidong.han@intel.com>
      Cc: <youquan.song@intel.com>
      Cc: <Jan.Grossmann@kielnet.net>
      Cc: <stable@kernel.org> # [v2.6.32.x, v2.6.33.x]
      LKML-Reference: <1270083887.7835.78.camel@sbs-t61.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      472a474c
  7. 20 2月, 2010 1 次提交
  8. 10 2月, 2010 1 次提交
    • S
      x86, apic: Don't use logical-flat mode when CPU hotplug may exceed 8 CPUs · 681ee44d
      Suresh Siddha 提交于
      We need to fall back from logical-flat APIC mode to physical-flat mode
      when we have more than 8 CPUs.  However, in the presence of CPU
      hotplug(with bios listing not enabled but possible cpus as disabled cpus in
      MADT), we have to consider the number of possible CPUs rather than
      the number of current CPUs; otherwise we may cross the 8-CPU boundary
      when CPUs are added later.
      
      32bit apic code can use more cleanups (like the removal of vendor checks in
      32bit default_setup_apic_routing()) and more unifications with 64bit code.
      Yinghai has some patches in works already. This patch addresses the boot issue
      that is reported in the virtualization guest context.
      
      [ hpa: incorporated function annotation feedback from Yinghai Lu ]
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1265767304.2833.19.camel@sbs-t61.sc.intel.com>
      Acked-by: NShaohui Zheng <shaohui.zheng@intel.com>
      Reviewed-by: NYinghai Lu <yinghai@kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      681ee44d
  9. 08 2月, 2010 1 次提交
  10. 19 1月, 2010 1 次提交
  11. 12 1月, 2010 1 次提交
  12. 12 12月, 2009 1 次提交
  13. 23 11月, 2009 1 次提交
  14. 16 11月, 2009 1 次提交
  15. 27 10月, 2009 1 次提交
  16. 15 10月, 2009 1 次提交
    • C
      x86: apic: Allow noop operations to be called almost at any time · f88f2b4f
      Cyrill Gorcunov 提交于
      As only apic noop is used we allow to use almost any operation
      caller wants (and which of them noop driver supports of
      course).
      
      Initially it was reported by Ingo Molnar that apic noop
      issue a warning for pkg id (which is actually false positive
      and should be eliminated).
      
      So we save checking (and warning issue) for read/write
      operations while allow any other ops to be freely used.
      
      Also:
       - fix noop_cpu_to_logical_apicid, it should be 0.
       - rename noop_default_phys_pkg_id to noop_phys_pkg_id
         (we use default_ prefix for more general routines
          in apic subsystem).
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      LKML-Reference: <20091015150416.GC5331@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f88f2b4f
  17. 14 10月, 2009 1 次提交
  18. 21 9月, 2009 2 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
    • C
      x86, apic: Fix missed handling of discrete apics · 8312136f
      Cyrill Gorcunov 提交于
      In case of discrete (pretty old) apics we may have cpu_has_apic bit
      not set but have to check if smp_found_config (MP spec) is there
      and apic was not disabled.
      
      Also don't forget to print apic/io-apic for such case as well.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20090915071230.GA10604@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8312136f
  19. 19 9月, 2009 1 次提交
    • S
      x86, apic: Use logical flat on intel with <= 8 logical cpus · 2fbd07a5
      Suresh Siddha 提交于
      On Intel platforms, we can use logical flat mode if there are <= 8
      logical cpu's (irrespective of physical apic id values). This will
      enable simplified and efficient IPI and device interrupt routing on
      such platforms.
      
      Fix the relevant comments while we are at it.
      
      We can clean up default_setup_apic_routing() by using apic->probe()
      but that is a different item.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: "yinghai@kernel.org" <yinghai@kernel.org>
      LKML-Reference: <1253327399.3948.747.camel@sbs-t61.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2fbd07a5
  20. 18 9月, 2009 1 次提交
  21. 31 8月, 2009 1 次提交
  22. 27 8月, 2009 1 次提交
  23. 18 8月, 2009 1 次提交
    • Y
      x86, apic: Move dmar_table_init() out of enable_IR() · b7f42ab2
      Yinghai Lu 提交于
      On an x2apic system, we got:
      
      [    1.818072] ------------[ cut here ]------------
      [    1.820376] WARNING: at kernel/lockdep.c:2461 lockdep_trace_alloc+0xa5/0xe9()
      [    1.835282] Hardware name: ASSY,
      [    1.839006] Modules linked in:
      [    1.841253] Pid: 1, comm: swapper Not tainted 2.6.31-rc5-tip-03926-g39aaa80-dirty #510
      [    1.858056] Call Trace:
      [    1.859913]  [<ffffffff810d13aa>] ? lockdep_trace_alloc+0xa5/0xe9
      [    1.876270]  [<ffffffff81093f37>] warn_slowpath_common+0x8d/0xd0
      [    1.879132]  [<ffffffff81093fa1>] warn_slowpath_null+0x27/0x3d
      [    1.896823]  [<ffffffff810d13aa>] lockdep_trace_alloc+0xa5/0xe9
      [    1.900659]  [<ffffffff810cf5a0>] ? lock_release_holdtime+0x2f/0x199
      [    1.917188]  [<ffffffff81167a3c>] kmem_cache_alloc_notrace+0x42/0x111
      [    1.922320]  [<ffffffff8106fe8c>] ? reserve_memtype+0x152/0x518
      [    1.938137]  [<ffffffff8106f8b1>] ? pat_pagerange_is_ram+0x4a/0x91
      [    1.941730]  [<ffffffff8106fe8c>] reserve_memtype+0x152/0x518
      [    1.958115]  [<ffffffff8106ce62>] __ioremap_caller+0x1dd/0x30f
      [    1.975507]  [<ffffffff81ce2c5c>] ? acpi_os_map_memory+0x2a/0x47
      [    1.978987]  [<ffffffff8106d0fd>] ioremap_nocache+0x2a/0x40
      [    2.031400]  [<ffffffff810d0364>] ? trace_hardirqs_off+0x20/0x36
      [    2.036096]  [<ffffffff81ce2c5c>] acpi_os_map_memory+0x2a/0x47
      [    2.046263]  [<ffffffff815cd642>] acpi_tb_verify_table+0x3d/0x85
      [    2.050349]  [<ffffffff81d34af7>] ? _spin_unlock_irqrestore+0x50/0x76
      [    2.067327]  [<ffffffff815ccad6>] acpi_get_table_with_size+0x64/0xd9
      [    2.070860]  [<ffffffff81d34af7>] ? _spin_unlock_irqrestore+0x50/0x76
      [    2.088000]  [<ffffffff825c88d5>] dmar_table_detect+0x33/0x70
      [    2.092047]  [<ffffffff825c8a01>] dmar_table_init+0x43/0x428
      [    2.106854]  [<ffffffff825a7537>] enable_IR+0x1c/0x8d
      [    2.110256]  [<ffffffff825a7624>] enable_IR_x2apic+0x7c/0x19e
      [    2.127139]  [<ffffffff825a4876>] native_smp_prepare_cpus+0x139/0x3b8
      [    2.145175]  [<ffffffff8259678d>] kernel_init+0x71/0x1da
      [    2.148913]  [<ffffffff8104305a>] child_rip+0xa/0x20
      [    2.152349]  [<ffffffff810429fc>] ? restore_args+0x0/0x30
      [    2.167931]  [<ffffffff8259671c>] ? kernel_init+0x0/0x1da
      [    2.171671]  [<ffffffff81043050>] ? child_rip+0x0/0x20
      [    2.187607] ---[ end trace a7919e7f17c0a725 ]---
      
      Venkatesh Pallipadi said:
      
      | Looks like the problem started with this commit
      |
      | commit ce69a784
      | Author: Gleb Natapov <gleb@redhat.com>
      | Date:   Mon Jul 20 15:24:17 2009 +0300
      |
      | x86/apic: Enable x2APIC without interrupt remapping under KVM
      |
      | Before this commit, dmar_table_init() was getting called
      | with interrupts enabled and after this commit, it is getting
      | called with interrupts disabled.
      
      so try to move out dmar_table_init out of that function.
      Analyzed-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
      LKML-Reference: <4A899F3C.2050104@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b7f42ab2
  24. 05 8月, 2009 2 次提交
    • G
      x86/apic: Enable x2APIC without interrupt remapping under KVM · ce69a784
      Gleb Natapov 提交于
      KVM would like to provide x2APIC interface to a guest without emulating
      interrupt remapping device. The reason KVM prefers guest to use x2APIC
      is that x2APIC interface is better virtualizable and provides better
      performance than mmio xAPIC interface:
      
       - msr exits are faster than mmio (no page table walk, emulation)
       - no need to read back ICR to look at the busy bit
       - one 64 bit ICR write instead of two 32 bit writes
       - shared code with the Hyper-V paravirt interface
      
      Included patch changes x2APIC enabling logic to enable it even if IR
      initialization failed, but kernel runs under KVM and no apic id is
      greater than 255 (if there is one spec requires BIOS to move to x2apic
      mode before starting an OS).
      
      -v2: fix build
      -v3: fix bug causing compiler warning
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Sheng Yang <sheng@linux.intel.com>
      Cc: "avi@redhat.com" <avi@redhat.com>
      LKML-Reference: <20090720122417.GR5638@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ce69a784
    • C
      x86, apic: Drop redundant bit assignment · 9910887a
      Cyrill Gorcunov 提交于
      cpu_has_apic has already investigated boot_cpu_data
      X86_FEATURE_APIC bit for being clear if condition is
      triggered.
      
      So there is no need to clear this bit second time.
      Signed-off-by: NCyrill Gorcuno v <gorcunov@openvz.org>
      Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
      LKML-Reference: <20090722205259.GE15805@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9910887a
  25. 03 7月, 2009 1 次提交
  26. 07 6月, 2009 1 次提交
    • C
      x86, apic: Fix dummy apic read operation together with broken MP handling · 103428e5
      Cyrill Gorcunov 提交于
      Ingo Molnar reported that read_apic is buggy novadays:
      
      [    0.000000] Using APIC driver default
      [    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
      [    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
      [    0.000000] APIC: disable apic facility
      [    0.000000] ------------[ cut here ]------------
      [    0.000000] WARNING: at arch/x86/kernel/apic/apic.c:254 native_apic_read_dummy+0x2d/0x3b()
      [    0.000000] Hardware name: HP OmniBook PC
      
      Indeed we still rely on apic->read operation for SMP compiled
      kernel. And instead of disfigure the SMP code with #ifdef we
      allow to call apic->read. To capture any unexpected results
      we check for apic->read being called for sane reason via
      WARN_ON_ONCE but(!) instead of OR we should use AND logical
      operation (thanks Yinghai for spotting the root of the problem).
      
      Along with that we could be have bad MP table and we are
      to fix it that way no SMP started and no complains about
      BIOS bug if apic was just disabled via command line.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20090607124840.GD4547@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      103428e5
  27. 02 6月, 2009 1 次提交
  28. 29 5月, 2009 2 次提交
    • Y
      perf_counter/x86: Always use NMI for performance-monitoring interrupt · c323d95f
      Yong Wang 提交于
      Always use NMI for performance-monitoring interrupt as there could be
      racy situations if we switch between irq and nmi mode frequently.
      Signed-off-by: NYong Wang <yong.y.wang@intel.com>
      LKML-Reference: <20090529052835.GA13657@ywang-moblin2.bj.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c323d95f
    • A
      x86, mce: use 64bit machine check code on 32bit · 4efc0670
      Andi Kleen 提交于
      The 64bit machine check code is in many ways much better than
      the 32bit machine check code: it is more specification compliant,
      is cleaner, only has a single code base versus one per CPU,
      has better infrastructure for recovery, has a cleaner way to communicate
      with user space etc. etc.
      
      Use the 64bit code for 32bit too.
      
      This is the second attempt to do this. There was one a couple of years
      ago to unify this code for 32bit and 64bit.  Back then this ran into some
      trouble with K7s and was reverted.
      
      I believe this time the K7 problems (and some others) are addressed.
      I went over the old handlers and was very careful to retain
      all quirks.
      
      But of course this needs a lot of testing on old systems. On newer
      64bit capable systems I don't expect much problems because they have been
      already tested with the 64bit kernel.
      
      I made this a CONFIG for now that still allows to select the old
      machine check code. This is mostly to make testing easier,
      if someone runs into a problem we can ask them to try
      with the CONFIG switched.
      
      The new code is default y for more coverage.
      
      Once there is confidence the 64bit code works well on older hardware
      too the CONFIG_X86_OLD_MCE and the associated code can be easily
      removed.
      
      This causes a behaviour change for 32bit installations. They now
      have to install the mcelog package to be able to log
      corrected machine checks.
      
      The 64bit machine check code only handles CPUs which support the
      standard Intel machine check architecture described in the IA32 SDM.
      The 32bit code has special support for some older CPUs which
      have non standard machine check architectures, in particular
      WinChip C3 and Intel P5.  I made those a separate CONFIG option
      and kept them for now. The WinChip variant could be probably
      removed without too much pain, it doesn't really do anything
      interesting. P5 is also disabled by default (like it
      was before) because many motherboards have it miswired, but
      according to Alan Cox a few embedded setups use that one.
      
      Forward ported/heavily changed version of old patch, original patch
      included review/fixes from Thomas Gleixner, Bert Wesarg.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      4efc0670
  29. 26 5月, 2009 1 次提交
  30. 22 5月, 2009 1 次提交
    • P
      perf_counter: Dynamically allocate tasks' perf_counter_context struct · a63eaf34
      Paul Mackerras 提交于
      This replaces the struct perf_counter_context in the task_struct with
      a pointer to a dynamically allocated perf_counter_context struct.  The
      main reason for doing is this is to allow us to transfer a
      perf_counter_context from one task to another when we do lazy PMU
      switching in a later patch.
      
      This has a few side-benefits: the task_struct becomes a little smaller,
      we save some memory because only tasks that have perf_counters attached
      get a perf_counter_context allocated for them, and we can remove the
      inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't
      end up recompiling nearly everything whenever perf_counter.h changes.
      
      The perf_counter_context structures are reference-counted and freed
      when the last reference is dropped.  A context can have references
      from its task and the counters on its task.  Counters can outlive the
      task so it is possible that a context will be freed well after its
      task has exited.
      
      Contexts are allocated on fork if the parent had a context, or
      otherwise the first time that a per-task counter is created on a task.
      In the latter case, we set the context pointer in the task struct
      locklessly using an atomic compare-and-exchange operation in case we
      raced with some other task in creating a context for the subject task.
      
      This also removes the task pointer from the perf_counter struct.  The
      task pointer was not used anywhere and would make it harder to move a
      context from one task to another.  Anything that needed to know which
      task a counter was attached to was already using counter->ctx->task.
      
      The __perf_counter_init_context function moves up in perf_counter.c
      so that it can be called from find_get_context, and now initializes
      the refcount, but is otherwise unchanged.
      
      We were potentially calling list_del_counter twice: once from
      __perf_counter_exit_task when the task exits and once from
      __perf_counter_remove_from_context when the counter's fd gets closed.
      This adds a check in list_del_counter so it doesn't do anything if
      the counter has already been removed from the lists.
      
      Since perf_counter_task_sched_in doesn't do anything if the task doesn't
      have a context, and leaves cpuctx->task_ctx = NULL, this adds code to
      __perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in
      the case where the current task adds the first counter to itself and
      thus creates a context for itself.
      
      This also adds similar code to __perf_counter_enable to handle a
      similar situation which can arise when the counters have been disabled
      using prctl; that also leaves cpuctx->task_ctx = NULL.
      
      [ Impact: refactor counter context management to prepare for new feature ]
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a63eaf34
  31. 12 5月, 2009 1 次提交
    • Y
      x86: read apic ID in the !acpi_lapic case · 4797f6b0
      Yinghai Lu 提交于
      Ed found that on 32-bit, boot_cpu_physical_apicid is not read right,
      when the mptable is broken.
      
      Interestingly, actually three paths use/set it:
      
       1. acpi: at that time that is already read from reg
       2. mptable: only read from mptable
       3. no madt, and no mptable, that use default apic id 0 for 64-bit, -1 for 32-bit
      
      so we could read the apic id for the 2/3 path. We trust the hardware
      register more than we trust a BIOS data structure (the mptable).
      
      We can also avoid the double set_fixmap() when acpi_lapic
      is used, and also need to move cpu_has_apic earlier and
      call apic_disable().
      
      Also when need to update the apic id, we'd better read and
      set the apic version as well - so that quirks are applied precisely.
      
      v2: make path 3 with 64bit, use -1 as apic id, so could read it later.
      v3: fix whitespace problem pointed out by Ed Swierk
      v5: fix boot crash
      
      [ Impact: get correct apic id for bsp other than acpi path ]
      Reported-by: NEd Swierk <eswierk@aristanetworks.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <49FC85A9.2070702@kernel.org>
      [ v4: sanity-check in the ACPI case too ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4797f6b0
  32. 11 5月, 2009 3 次提交
    • C
      x86: apic: Fixmap apic address even if apic disabled · cec6be6d
      Cyrill Gorcunov 提交于
      In case if apic were disabled by boot option
      we still need read_apic operation. So fixmap
      a fake apic area if needed.
      
      [ Impact: fix boot crash ]
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: yinghai@kernel.org
      Cc: eswierk@aristanetworks.com
      LKML-Reference: <20090511134140.GH4624@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cec6be6d
    • A
      x86: display extended apic registers with print_local_APIC and cpu_debug code · 97a52714
      Andreas Herrmann 提交于
      Both print_local_APIC (used when apic=debug kernel param is set) and
      cpu_debug code missed support for some extended APIC registers that
      I'd like to see.
      
      This adds support to show:
      
       - extended APIC feature register
       - extended APIC control register
       - extended LVT registers
      
      [ Impact: print more debug info ]
      Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <20090508162350.GO29045@alberich.amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      97a52714
    • Y
      x86: read apic ID in the !acpi_lapic case · 4401da61
      Yinghai Lu 提交于
      Ed found that on 32-bit, boot_cpu_physical_apicid is not read right,
      when the mptable is broken.
      
      Interestingly, actually three paths use/set it:
      
       1. acpi: at that time that is already read from reg
       2. mptable: only read from mptable
       3. no madt, and no mptable, that use default apic id 0 for 64-bit, -1 for 32-bit
      
      so we could read the apic id for the 2/3 path. We trust the hardware
      register more than we trust a BIOS data structure (the mptable).
      
      We can also avoid the double set_fixmap() when acpi_lapic
      is used, and also need to move cpu_has_apic earlier and
      call apic_disable().
      
      Also when need to update the apic id, we'd better read and
      set the apic version as well - so that quirks are applied precisely.
      
      v2: make path 3 with 64bit, use -1 as apic id, so could read it later.
      v3: fix whitespace problem pointed out by Ed Swierk
      
      [ Impact: get correct apic id for bsp other than acpi path ]
      Reported-by: NEd Swierk <eswierk@aristanetworks.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <49FC85A9.2070702@kernel.org>
      [ v4: sanity-check in the ACPI case too ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4401da61
  33. 02 5月, 2009 1 次提交
  34. 27 4月, 2009 1 次提交
  35. 22 4月, 2009 1 次提交
    • S
      x86: x2apic, IR: remove reinit_intr_remapped_IO_APIC() · ff166cb5
      Suresh Siddha 提交于
      When interrupt-remapping is enabled, we are relying on
      setup_IO_APIC_irqs() to configure remapped entries in the
      IO-APIC, which comes little bit later after enabling
      interrupt-remapping.
      
      Meanwhile, restoration of old io-apic entries after enabling
      interrupt-remapping will not make the interrupts through
      io-apic functional anyway.
      
      So remove the unnecessary reinit_intr_remapped_IO_APIC() step.
      
      The longer story:
      
      When interrupt-remapping is enabled, IO-APIC entries need to be
      setup in the re-mappable format (pointing to
      interrupt-remapping table entries setup by the OS). This
      remapping configuration is happening in the same place where we
      traditionally configure IO-APIC (i.e., in
      setup_IO_APIC_irqs()).
      
      So when we enable interrupt-remapping successfully, there is no
      need to restore old io-apic RTE entries before we actually do a
      complete configuration shortly in setup_IO_APIC_irqs(). Old
      IO-APIC RTE's may be in traditional format (non re-mappable) or
      in re-mappable format pointing to interrupt-remapping table
      entries setup by BIOS. Restoring both of these will not make
      IO-APIC functional. We have to rely on setup_IO_APIC_irqs() for
      proper configuration by OS.
      
      So I am removing this unnecessary and broken step.
      
      [ Impact: remove unnecessary/broken IO-APIC setup step ]
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Acked-by: NWeidong Han <weidong.han@intel.com>
      Cc: dwmw2@infradead.org
      LKML-Reference: <20090420200450.552359000@linux-os.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ff166cb5