1. 08 8月, 2009 1 次提交
    • Y
      x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus · 087d7e56
      Yinghai Lu 提交于
      found a system where x2apic reports an MSI-X irq initialization
      failure:
      
      [  302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
      [  302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
      [  302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
      [  302.894386] igbvf 0000:81:10.4: enabling bus mastering
      [  302.898171] igbvf 0000:81:10.4: setting latency timer to 64
      [  302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.940367]   alloc irq_desc for 265 on node 4
      [  302.956874]   alloc kstat_irqs on node 4
      [  302.959452] alloc irq_2_iommu on node 0
      [  302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
      [  302.977778]   alloc irq_desc for 266 on node 4
      [  302.980347]   alloc kstat_irqs on node 4
      [  302.995312] free_memtype request 0xefb28000-0xefb29000
      [  302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.
      
      ... it turns out that when trying to enable MSI-X,
      __assign_irq_vector(new, cfg_new, apic->target_cpus()) can not
      get vector because for x2apic target-cpus returns cpumask_of(0)
      
      Update that to online_mask like xapic.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <4A785AFF.3050902@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      087d7e56
  2. 04 8月, 2009 5 次提交
    • J
      x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq() · 2a5ef416
      Jack Steiner 提交于
      In uv_setup_irq(), the call to create_irq() initially assigns
      IRQ vectors to cpu 0. The subsequent call to
      assign_irq_vector() in arch_enable_uv_irq() migrates the IRQ to
      another cpu and frees the cpu 0 vector - at least it will be
      freed as soon as the "IRQ move" completes.
      
      arch_enable_uv_irq() needs to send a cleanup IPI to complete
      the IRQ move. Otherwise, assignment of GRU interrupts on large
      systems (>200 cpus) will exhaust the cpu 0 interrupt vectors
      and initialization of the GRU driver will fail.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090720142840.GA8885@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2a5ef416
    • Y
      x86: Don't use current_cpu_data in x2apic phys_pkg_id · d8c7eb34
      Yinghai Lu 提交于
      One system has socket 1 come up as BSP.
      
      kexeced kernel reports BSP as:
      
      [    1.524550] Initializing cgroup subsys cpuacct
      [    1.536064] initial_apicid:20
      [    1.537135] ht_mask_width:1
      [    1.538128] core_select_mask:f
      [    1.539126] core_plus_mask_width:5
      [    1.558479] CPU: Physical Processor ID: 0
      [    1.559501] CPU: Processor Core ID: 0
      [    1.560539] CPU: L1 I cache: 32K, L1 D cache: 32K
      [    1.579098] CPU: L2 cache: 256K
      [    1.580085] CPU: L3 cache: 24576K
      [    1.581108] CPU 0/0x20 -> Node 0
      [    1.596193] CPU 0 microcode level: 0xffff0008
      
      It doesn't have correct physical processor id and will get an
      error:
      
      [   38.840859] CPU0 attaching sched-domain:
      [   38.848287]  domain 0: span 0,8,72 level SIBLING
      [   38.851151]   groups: 0 8 72
      [   38.858137]   domain 1: span 0,8-15,72-79 level MC
      [   38.868944]    groups: 0,8,72 9,73 10,74 11,75 12,76 13,77 14,78 15,79
      [   38.881383] ERROR: parent span is not a superset of domain->span
      [   38.890724]    domain 2: span 0-7,64-71 level CPU
      [   38.899237] ERROR: domain->groups does not contain CPU0
      [   38.909229]     groups: 8-15,72-79
      [   38.912547] ERROR: groups don't span domain->span
      [   38.919665]     domain 3: span 0-127 level NODE
      [   38.930739]      groups: 0-7,64-71 8-15,72-79 16-23,80-87 24-31,88-95 32-39,96-103 40-47,104-111 48-55,112-119 56-63,120-127
      
      it turns out: we can not use current_cpu_data in phys_pgd_id
      for x2apic.
      
      identify_boot_cpu() is called by check_bugs() before
      smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data
      for bsp is assigned with boot_cpu_data.
      
      Just make phys_pkg_id for x2apic is aligned to xapic.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A6ADD0D.10002@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d8c7eb34
    • J
      x86, UV: Fix UV apic mode · c5997fa8
      Jack Steiner 提交于
      Change SGI UV default apicid mode to "physical". This is
      required to match settings in the UV hub chip.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090727143856.GA8905@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c5997fa8
    • J
      x86, UV: Delete mapping of MMR rangs mapped by BIOS · cc5e4fa1
      Jack Steiner 提交于
      The UV BIOS has added additional MMR ranges that are mapped via
      EFI virtual mode mappings. These ranges should be deleted from
      ranges mapped by uv_system_init().
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143656.GA7698@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cc5e4fa1
    • J
      x86, UV: Handle missing blade-local memory correctly · 6c7184b7
      Jack Steiner 提交于
      UV blades may not have any blade-local memory. Add a field
      (nid) to the UV blade structure to indicates whether the node
      has local memory. This is needed by the GRU driver (pushed
      separately).
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143507.GA7006@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6c7184b7
  3. 13 7月, 2009 2 次提交
    • R
      x86, apic: Fix false positive section mismatch in numaq_32.c · 7473727b
      Rakib Mullick 提交于
      The variable apic_numaq placed in noninit section references the
      function wakeup_secondary_cpu_via_nmi(), which is in __cpuinit
      section. Thus causes a section mismatch warning. To avoid such
      mismatch we mark apic_numaq as __refdata.
      
      We were warned by the following warning:
      
        WARNING: arch/x86/kernel/built-in.o(.data+0x932c): Section mismatch in
        reference from the variable apic_numaq to the function
        .cpuinit.text:wakeup_secondary_cpu_via_nmi()
      Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com>
      LKML-Reference: <b9df5fa10907120407p6b4f67dtf4d563155488188a@mail.gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7473727b
    • R
      x86: Fix false positive section mismatch in es7000_32.c · 151586d0
      Rakib Mullick 提交于
      The variable apic_es7000_cluster references the function __cpuinit
      wakeup_secondary_cpu_via_mip() from a noninit section. So we've been
      warned by the following warning. To avoid possible collision between
      init/noninit, its best to mark the variable as __refdata.
      
      We were warned by the following warning:
      
        LD      arch/x86/kernel/apic/built-in.o
        WARNING: arch/x86/kernel/apic/built-in.o(.data+0x198c): Section
        mismatch in reference from the variable apic_es7000_cluster to the
        function .cpuinit.text:wakeup_secondary_cpu_via_mip()
      Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com>
      LKML-Reference: <b9df5fa10907120404k6279a10ch5e9682432272706f@mail.gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      151586d0
  4. 11 7月, 2009 1 次提交
    • Y
      x86/pci: insert ioapic resource before assigning unassigned resources · 857fdc53
      Yinghai Lu 提交于
      Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?)
      
      Dann bisected it down to:
        commit 30a18d6c
        Date:   Tue Feb 19 03:21:20 2008 -0800
      
            x86: multi pci root bus with different io resource range, on
            64-bit
      
      It turns out that:
        1. that AMD-based systems have two HT chains.
        2. BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc
        3. that multi-peer-root patch will try to split root resources to peer
           root resources according to PCI conf of NB
        4. PCI core assigns unassigned resources, but they overlap with BARs
           that are used by ioapic addr of io4 and 8132.
      
      The reason: at that point ioapic address are not inserted yet.  Solution
      is to insert ioapic resources into the tree a bit earlier.
      Reported-by: NStephen Frost <sfrost@snowman.net>
      Reported-and-Tested-by: Ndann frazier <dannf@hp.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: stable@kernel.org
      Signed-off-by: NJesse Barnes <jbarnes@jbarnes-g45.(none)>
      857fdc53
  5. 09 7月, 2009 1 次提交
  6. 03 7月, 2009 1 次提交
  7. 02 7月, 2009 1 次提交
    • I
      x86: Fix printk call in print_local_apic() · 251e1e44
      Ingo Molnar 提交于
      Instead of this:
      
      [   75.690022] <7>printing local APIC contents on CPU#0/0:
      [   75.704406] ... APIC ID:      00000000 (0)
      [   75.707905] ... APIC VERSION: 00060015
      [   75.722551] ... APIC TASKPRI: 00000000 (00)
      [   75.725473] ... APIC PROCPRI: 00000000
      [   75.728592] ... APIC LDR: 00000001
      [   75.742137] ... APIC SPIV: 000001ff
      [   75.744101] ... APIC ISR field:
      [   75.746648] 0123456789abcdef0123456789abcdef
      [   75.746649] <7>00000000000000000000000000000000
      
      Improve the code to be saner and simpler and just print out
      the bitfield in a single line using hexa values - not as a
      (rather pointless) binary bitfield.
      
      Partially reused Linus's initial fix for this.
      Reported-and-Tested-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A4C43BC.90506@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      251e1e44
  8. 24 6月, 2009 1 次提交
    • W
      Intel-IOMMU, intr-remap: source-id checking · f007e99c
      Weidong Han 提交于
      To support domain-isolation usages, the platform hardware must be
      capable of uniquely identifying the requestor (source-id) for each
      interrupt message. Without source-id checking for interrupt remapping
      , a rouge guest/VM with assigned devices can launch interrupt attacks
      to bring down anothe guest/VM or the VMM itself.
      
      This patch adds source-id checking for interrupt remapping, and then
      really isolates interrupts for guests/VMs with assigned devices.
      
      Because PCI subsystem is not initialized yet when set up IOAPIC
      entries, use read_pci_config_byte to access PCI config space directly.
      Signed-off-by: NWeidong Han <weidong.han@intel.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      f007e99c
  9. 18 6月, 2009 2 次提交
    • C
      x86, ioapic: Don't call disconnect_bsp_APIC if no APIC present · 3f4c3955
      Cyrill Gorcunov 提交于
      Vegard Nossum reported:
      
      [  503.576724] ACPI: Preparing to enter system sleep state S5
      [  503.710857] Disabling non-boot CPUs ...
      [  503.716853] Power down.
      [  503.717770] ------------[ cut here ]------------
      [  503.717770] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_du)
      [  503.717770] Hardware name: OptiPlex GX100
      [  503.717770] Modules linked in:
      [  503.717770] Pid: 2136, comm: halt Not tainted 2.6.30 #443
      [  503.717770] Call Trace:
      [  503.717770]  [<c154d327>] ? printk+0x18/0x1a
      [  503.717770]  [<c1017358>] ? native_apic_write_dummy+0x38/0x50
      [  503.717770]  [<c10360fc>] warn_slowpath_common+0x6c/0xc0
      [  503.717770]  [<c1017358>] ? native_apic_write_dummy+0x38/0x50
      [  503.717770]  [<c1036165>] warn_slowpath_null+0x15/0x20
      [  503.717770]  [<c1017358>] native_apic_write_dummy+0x38/0x50
      [  503.717770]  [<c1017173>] disconnect_bsp_APIC+0x63/0x100
      [  503.717770]  [<c1019e48>] disable_IO_APIC+0xb8/0xc0
      [  503.717770]  [<c1214231>] ? acpi_power_off+0x0/0x29
      [  503.717770]  [<c1015e55>] native_machine_shutdown+0x65/0x80
      [  503.717770]  [<c1015c36>] native_machine_power_off+0x26/0x30
      [  503.717770]  [<c1015c49>] machine_power_off+0x9/0x10
      [  503.717770]  [<c1046596>] kernel_power_off+0x36/0x40
      [  503.717770]  [<c104680d>] sys_reboot+0xfd/0x1f0
      [  503.717770]  [<c109daa0>] ? perf_swcounter_event+0xb0/0x130
      [  503.717770]  [<c109db7d>] ? perf_counter_task_sched_out+0x5d/0x120
      [  503.717770]  [<c102dfc6>] ? finish_task_switch+0x56/0xd0
      [  503.717770]  [<c154da1e>] ? schedule+0x49e/0xb40
      [  503.717770]  [<c10444b0>] ? sys_kill+0x70/0x160
      [  503.717770]  [<c119d9db>] ? selinux_file_ioctl+0x3b/0x50
      [  503.717770]  [<c10dd443>] ? sys_ioctl+0x63/0x70
      [  503.717770]  [<c1003024>] sysenter_do_call+0x12/0x22
      [  503.717770] ---[ end trace 8157b5d0ed378f15 ]---
      
      |
      | That's including this commit:
      |
      | commit 103428e5
      |Author: Cyrill Gorcunov <gorcunov@openvz.org>
      |Date:   Sun Jun 7 16:48:40 2009 +0400
      |
      |    x86, apic: Fix dummy apic read operation together with broken MP handling
      |
      
      If we have apic disabled we don't even switch to APIC mode and do not
      calling for connect_bsp_APIC. Though on SMP compiled kernel the
      native_machine_shutdown does try to write the apic register anyway.
      
      Fix it with explicit check if we really should touch apic registers.
      Reported-by: NVegard Nossum <vegard.nossum@gmail.com>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20090617181322.GG10822@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3f4c3955
    • H
      x86: Remove duplicated #include's · 8653f88f
      Huang Weiyi 提交于
      Signed-off-by: NHuang Weiyi <weiyi.huang@gmail.com>
      LKML-Reference: <1244895686-2348-1-git-send-email-weiyi.huang@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8653f88f
  10. 17 6月, 2009 2 次提交
  11. 12 6月, 2009 3 次提交
  12. 09 6月, 2009 1 次提交
    • J
      x86, UV: Fix macros for multiple coherency domains · c4ed3f04
      Jack Steiner 提交于
      Fix bug in the SGI UV macros that support systems with multiple
      coherency domains.  The macros used for referencing global MMR
      (chipset registers) are failing to correctly "or" the NASID
      (node identifier) bits that reside above M+N. These high bits
      are supplied automatically by the chipset for memory accesses
      coming from the processor socket.
      
      However, the bits must be present for references to the special
      global MMR space used to map chipset registers. (See uv_hub.h
      for more details ...)
      
      The bug results in references to invalid/incorrect nodes.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <20090608154405.GA16395@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c4ed3f04
  13. 07 6月, 2009 1 次提交
    • C
      x86, apic: Fix dummy apic read operation together with broken MP handling · 103428e5
      Cyrill Gorcunov 提交于
      Ingo Molnar reported that read_apic is buggy novadays:
      
      [    0.000000] Using APIC driver default
      [    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
      [    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
      [    0.000000] APIC: disable apic facility
      [    0.000000] ------------[ cut here ]------------
      [    0.000000] WARNING: at arch/x86/kernel/apic/apic.c:254 native_apic_read_dummy+0x2d/0x3b()
      [    0.000000] Hardware name: HP OmniBook PC
      
      Indeed we still rely on apic->read operation for SMP compiled
      kernel. And instead of disfigure the SMP code with #ifdef we
      allow to call apic->read. To capture any unexpected results
      we check for apic->read being called for sane reason via
      WARN_ON_ONCE but(!) instead of OR we should use AND logical
      operation (thanks Yinghai for spotting the root of the problem).
      
      Along with that we could be have bad MP table and we are
      to fix it that way no SMP started and no complains about
      BIOS bug if apic was just disabled via command line.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20090607124840.GD4547@lenovo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      103428e5
  14. 02 6月, 2009 2 次提交
    • J
      x86, apic: Restore irqs on fail paths · 3d58829b
      Jiri Slaby 提交于
      lapic_resume forgets to restore interrupts on fail paths.
      Fix that.
      Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
      Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <1243497289-18591-1-git-send-email-jirislaby@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      3d58829b
    • N
      x86: Print real IOAPIC version for x86-64 · 58f892e0
      Naga Chumbalkar 提交于
      Fix the fact that the IOAPIC version number in the x86_64 code path always
      gets assigned to 0, instead of the correct value.
      
      Before the patch: (from "dmesg" output):
      
       ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
       IOAPIC[0]: apic_id 8, version 0, address 0xfec00000, GSI 0-23     <---
      
       After the patch:
       ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
       IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23    <---
      
      History:
      
      io_apic_get_version() was compiled out of the x86_64 code path in the commit
      f2c2cca3:
      
      Author: Andi Kleen <ak@suse.de>
      Date:   Tue Sep 26 10:52:37 2006 +0200
      
          [PATCH] Remove APIC version/cpu capability mpparse checking/printing
      
          ACPI went to great trouble to get the APIC version and CPU capabilities
          of different CPUs before passing them to the mpparser. But all
          that data was used was to print it out.  Actually it even faked some data
          based on the boot cpu, not on the actual CPU being booted.
      
          Remove all this code because it's not needed.
      
          Cc: len.brown@intel.com
      
      At the time, the IOAPIC version number was deliberately not printed
      in the x86_64 code path. However, after the x86 and x86_64 files were
      merged, the net result is that the IOAPIC version is printed incorrectly
      in the x86_64 code path.
      
      The patch below provides a fix. I have tested it with acpi, and with
      acpi=off, and did not see any problems.
      Signed-off-by: NNaga Chumbalkar <nagananda.chumbalkar@hp.com>
      Acked-by: NYinghai Lu <yhlu.kernel@gmail.com>
      LKML-Reference: <20090416014230.4885.94926.sendpatchset@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      *************************
      58f892e0
  15. 29 5月, 2009 2 次提交
    • Y
      perf_counter/x86: Always use NMI for performance-monitoring interrupt · c323d95f
      Yong Wang 提交于
      Always use NMI for performance-monitoring interrupt as there could be
      racy situations if we switch between irq and nmi mode frequently.
      Signed-off-by: NYong Wang <yong.y.wang@intel.com>
      LKML-Reference: <20090529052835.GA13657@ywang-moblin2.bj.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c323d95f
    • A
      x86, mce: use 64bit machine check code on 32bit · 4efc0670
      Andi Kleen 提交于
      The 64bit machine check code is in many ways much better than
      the 32bit machine check code: it is more specification compliant,
      is cleaner, only has a single code base versus one per CPU,
      has better infrastructure for recovery, has a cleaner way to communicate
      with user space etc. etc.
      
      Use the 64bit code for 32bit too.
      
      This is the second attempt to do this. There was one a couple of years
      ago to unify this code for 32bit and 64bit.  Back then this ran into some
      trouble with K7s and was reverted.
      
      I believe this time the K7 problems (and some others) are addressed.
      I went over the old handlers and was very careful to retain
      all quirks.
      
      But of course this needs a lot of testing on old systems. On newer
      64bit capable systems I don't expect much problems because they have been
      already tested with the 64bit kernel.
      
      I made this a CONFIG for now that still allows to select the old
      machine check code. This is mostly to make testing easier,
      if someone runs into a problem we can ask them to try
      with the CONFIG switched.
      
      The new code is default y for more coverage.
      
      Once there is confidence the 64bit code works well on older hardware
      too the CONFIG_X86_OLD_MCE and the associated code can be easily
      removed.
      
      This causes a behaviour change for 32bit installations. They now
      have to install the mcelog package to be able to log
      corrected machine checks.
      
      The 64bit machine check code only handles CPUs which support the
      standard Intel machine check architecture described in the IA32 SDM.
      The 32bit code has special support for some older CPUs which
      have non standard machine check architectures, in particular
      WinChip C3 and Intel P5.  I made those a separate CONFIG option
      and kept them for now. The WinChip variant could be probably
      removed without too much pain, it doesn't really do anything
      interesting. P5 is also disabled by default (like it
      was before) because many motherboards have it miswired, but
      according to Alan Cox a few embedded setups use that one.
      
      Forward ported/heavily changed version of old patch, original patch
      included review/fixes from Thomas Gleixner, Bert Wesarg.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      4efc0670
  16. 26 5月, 2009 1 次提交
  17. 22 5月, 2009 1 次提交
    • P
      perf_counter: Dynamically allocate tasks' perf_counter_context struct · a63eaf34
      Paul Mackerras 提交于
      This replaces the struct perf_counter_context in the task_struct with
      a pointer to a dynamically allocated perf_counter_context struct.  The
      main reason for doing is this is to allow us to transfer a
      perf_counter_context from one task to another when we do lazy PMU
      switching in a later patch.
      
      This has a few side-benefits: the task_struct becomes a little smaller,
      we save some memory because only tasks that have perf_counters attached
      get a perf_counter_context allocated for them, and we can remove the
      inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't
      end up recompiling nearly everything whenever perf_counter.h changes.
      
      The perf_counter_context structures are reference-counted and freed
      when the last reference is dropped.  A context can have references
      from its task and the counters on its task.  Counters can outlive the
      task so it is possible that a context will be freed well after its
      task has exited.
      
      Contexts are allocated on fork if the parent had a context, or
      otherwise the first time that a per-task counter is created on a task.
      In the latter case, we set the context pointer in the task struct
      locklessly using an atomic compare-and-exchange operation in case we
      raced with some other task in creating a context for the subject task.
      
      This also removes the task pointer from the perf_counter struct.  The
      task pointer was not used anywhere and would make it harder to move a
      context from one task to another.  Anything that needed to know which
      task a counter was attached to was already using counter->ctx->task.
      
      The __perf_counter_init_context function moves up in perf_counter.c
      so that it can be called from find_get_context, and now initializes
      the refcount, but is otherwise unchanged.
      
      We were potentially calling list_del_counter twice: once from
      __perf_counter_exit_task when the task exits and once from
      __perf_counter_remove_from_context when the counter's fd gets closed.
      This adds a check in list_del_counter so it doesn't do anything if
      the counter has already been removed from the lists.
      
      Since perf_counter_task_sched_in doesn't do anything if the task doesn't
      have a context, and leaves cpuctx->task_ctx = NULL, this adds code to
      __perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in
      the case where the current task adds the first counter to itself and
      thus creates a context for itself.
      
      This also adds similar code to __perf_counter_enable to handle a
      similar situation which can arise when the counters have been disabled
      using prctl; that also leaves cpuctx->task_ctx = NULL.
      
      [ Impact: refactor counter context management to prepare for new feature ]
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a63eaf34
  18. 19 5月, 2009 1 次提交
    • Y
      x86, io-apic: Don't mark pin_programmed early · 4c6f18fc
      Yinghai Lu 提交于
      Peter bisected that:
      
      | commit b9c61b70
      | Date:   Wed May 6 10:10:06 2009 -0700
      |
      |     x86/pci: update pirq_enable_irq() to setup io apic routing
      |
      |     So we can set io apic routing only when enabling the device irq.
      
      wrecked his opteron box, ata1 interrupts fail to get through.
      
      ata1 is using irq 11:
      
      [    1.451839] sata_svw 0000:01:0e.0: version 2.3
      [    1.456333] sata_svw 0000:01:0e.0: PCI INT A -> GSI 11 (level, low) -> IRQ 11
      [    1.463639] scsi0 : sata_svw
      [    1.466949] scsi1 : sata_svw
      [    1.470022] scsi2 : sata_svw
      [    1.473090] scsi3 : sata_svw
      [    1.476112] ata1: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe000 irq 11
      [    1.483490] ata2: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe100 irq 11
      [    1.490870] ata3: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe200 irq 11
      [    1.498247] ata4: SATA max UDMA/133 mmio m8192@0xff3fe000 port 0xff3fe300 irq 11
      
      that pin is overlapped with pin with legacy ones.
      
      We should not set bits in pin_programmed here, so that those bit could
      be set later via io_apic_set_pci_routing().
      
      [ Impact: fix boot hang on certain systems ]
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NYinghai Lu <yinghai.lu@kernel.org>
      Tested-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <4A119990.9020606@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4c6f18fc
  19. 18 5月, 2009 2 次提交
  20. 13 5月, 2009 1 次提交
  21. 12 5月, 2009 1 次提交
    • Y
      x86: read apic ID in the !acpi_lapic case · 4797f6b0
      Yinghai Lu 提交于
      Ed found that on 32-bit, boot_cpu_physical_apicid is not read right,
      when the mptable is broken.
      
      Interestingly, actually three paths use/set it:
      
       1. acpi: at that time that is already read from reg
       2. mptable: only read from mptable
       3. no madt, and no mptable, that use default apic id 0 for 64-bit, -1 for 32-bit
      
      so we could read the apic id for the 2/3 path. We trust the hardware
      register more than we trust a BIOS data structure (the mptable).
      
      We can also avoid the double set_fixmap() when acpi_lapic
      is used, and also need to move cpu_has_apic earlier and
      call apic_disable().
      
      Also when need to update the apic id, we'd better read and
      set the apic version as well - so that quirks are applied precisely.
      
      v2: make path 3 with 64bit, use -1 as apic id, so could read it later.
      v3: fix whitespace problem pointed out by Ed Swierk
      v5: fix boot crash
      
      [ Impact: get correct apic id for bsp other than acpi path ]
      Reported-by: NEd Swierk <eswierk@aristanetworks.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <49FC85A9.2070702@kernel.org>
      [ v4: sanity-check in the ACPI case too ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4797f6b0
  22. 11 5月, 2009 7 次提交