1. 05 May, 2010 13 commits
    • E
      x86, acpi/irq: Handle isa irqs that are not identity mapped to gsi's. · 988856ee
      Eric W. Biederman committed
      ACPI irq source overrides are allowed for the 16 isa irqs and are
      allowed to map any gsi to any isa irq.  A few motherboards have been
      seen to take advantage of this and put the isa irqs on the 2nd or
      3rd ioapic.  This causes some problems, most notably the fact
      that we can not use any gsi < 16.
      
      To correct this move the gsis that are not isa irqs and have
      a gsi number < 16 into the linux irq space just past gsi_end.
      This is what the es7000 platform is doing today.  Moving only the
      low 16 gsis above the rest of the gsi's only penalizes weird
      platforms, leaving sane acpi implementations with a 1-1 mapping
      of gsis and irqs.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-14-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      988856ee
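The remapping rule described in this commit can be sketched as follows. This is a minimal illustration, not the kernel's code: `struct irq_map` and its fields are hypothetical names, and `gsi_end` is assumed to hold the highest valid gsi in the system.

```c
#include <stdint.h>

#define NR_ISA_IRQS 16u

/* Hypothetical state: which gsi each ISA irq is routed to, plus the
 * highest gsi in the system (assumed meaning of gsi_end). */
struct irq_map {
    uint32_t isa_irq_to_gsi[NR_ISA_IRQS];
    uint32_t gsi_end;
};

/* Map a gsi to a Linux irq number under the rule from the commit text. */
static uint32_t gsi_to_irq(const struct irq_map *map, uint32_t gsi)
{
    /* A gsi that backs an ISA irq always gets that ISA irq number. */
    for (uint32_t i = 0; i < NR_ISA_IRQS; i++) {
        if (map->isa_irq_to_gsi[i] == gsi)
            return i;
    }

    /* Sane platforms: everything else keeps the 1-1 mapping. */
    if (gsi >= NR_ISA_IRQS)
        return gsi;

    /* A non-ISA gsi below 16 would collide with the ISA irq range,
     * so move it just past gsi_end. */
    return map->gsi_end + 1 + gsi;
}
```

On a platform with the usual identity mapping the last branch is never reached, which is why only unusual platforms pay for the relocation.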
    • E
      x86, ioapic: Simplify probe_nr_irqs_gsi. · 4afc51a8
      Eric W. Biederman committed
      Use the global gsi_end value now that all ioapics have
      valid gsi numbers, instead of a combination of acpi_probe_gsi
      and walking all of the ioapics and counting their number of
      entries by hand if acpi_probe_gsi gave us an answer we did
      not like.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-13-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      4afc51a8
    • E
      x86, ioapic: Optimize pin_2_irq · d464207c
      Eric W. Biederman committed
      Now that all ioapics have valid gsi_base values, use this to
      accelerate pin_2_irq.  In the case of acpi this also ensures
      that pin_2_irq will compute the same irq value for an ioapic
      pin as acpi will.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-12-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      d464207c
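To illustrate why a stored gsi_base is both faster and more accurate, here is a small sketch comparing the two ways of turning an (ioapic, pin) pair into a gsi. The `struct ioapic` layout and both function names are assumptions made for this example, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical per-ioapic record for this sketch. */
struct ioapic {
    uint32_t gsi_base;          /* first gsi served by this ioapic */
    unsigned int nr_registers;  /* number of redirection entries */
};

/* Older style: add up the entries of every preceding ioapic.
 * Only correct if the gsi ranges are contiguous and start at 0. */
static uint32_t pin_to_gsi_by_counting(const struct ioapic *ioapics,
                                       unsigned int apic, unsigned int pin)
{
    uint32_t gsi = 0;

    for (unsigned int i = 0; i < apic; i++)
        gsi += ioapics[i].nr_registers;
    return gsi + pin;
}

/* With a valid gsi_base the answer is a single addition, and it
 * agrees with firmware even when the ranges have gaps. */
static uint32_t pin_to_gsi(const struct ioapic *ioapics,
                           unsigned int apic, unsigned int pin)
{
    return ioapics[apic].gsi_base + pin;
}
```

The two functions agree when the ranges are contiguous; they diverge as soon as firmware leaves a gap between ioapic ranges, and in that case only the gsi_base version matches what acpi reports.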
    • E
      x86, ioapic: Move nr_ioapic_registers calculation to mp_register_ioapic. · 7716a5c4
      Eric W. Biederman committed
      Now that all ioapic registration happens in mp_register_ioapic we can
      move the calculation of nr_ioapic_registers there from enable_IO_APIC.
      The number of ioapic registers is already calculated in mp_register_ioapic
      so all that really needs to be done is to save the calculated value
      in nr_ioapic_registers.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-11-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      7716a5c4
    • E
      x86, ioapic: In mpparse use mp_register_ioapic · cf7500c0
      Eric W. Biederman committed
      Long ago MP_ioapic_info was the primary way of setting up our
      ioapic data structures and mp_register_ioapic was a compatibility
      shim for acpi code.  Now the situation is reversed
      and mp_register_ioapic is the primary way of setting up our
      ioapic data structures.
      
      Keep the setting up of ioapic data structures uniform by
      having MP_ioapic_info call mp_register_ioapic.
      
      This changes a few fields:
      
      - type: is now hard set to MP_IOAPIC, but type had to
        be MP_IOAPIC or MP_ioapic_info would not have been called.
      
      - flags: is now hard coded to MPC_APIC_USABLE.
        We require flags to contain at least MPC_APIC_USABLE in
        MP_ioapic_info and we don't ever examine flags, so dropping
        a few flags that might possibly exist that we have never
        used is harmless.
      
      - apicaddr: Unchanged
      
      - apicver: Read from the ioapic instead of using the cached
        hardware value in the MP table.  The real hardware value
        will be more accurate.
      
      - apicid: Now verified to be unique and changed if it is not.
        If the BIOS got this right this is a noop.  If the BIOS did
        not, fixing things appears to be the better solution.
      
      This adds gsi_base and gsi_end values to our ioapics defined with
      the mptable, which will make our lives simpler later since
      we can always assume gsi_base and gsi_end are valid.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-10-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      cf7500c0
    • E
      x86, ioapic: Teach mp_register_ioapic to compute a global gsi_end · 5777372a
      Eric W. Biederman committed
      Add the global variable gsi_end and teach mp_register_ioapic
      to keep it up to date as we add more ioapics into the system.
      
      ioapics can only be added early in boot so the code that
      runs later can treat gsi_end as a constant.
      
      Remove the hacks in sfi.c that second guess mp_register_ioapic
      by keeping its own running total of how many gsi's have been seen,
      and instead use gsi_end.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-9-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      5777372a
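A rough sketch of the bookkeeping this commit describes: each registration records the ioapic's own gsi range and raises a global high-water mark. The table type, the `MAX_IOAPICS` limit, and `register_ioapic` are hypothetical stand-ins for this example; `gsi_end` is assumed to be the highest gsi seen so far.

```c
#include <stdint.h>

#define MAX_IOAPICS 8u  /* arbitrary limit chosen for this sketch */

struct ioapic_gsi {
    uint32_t gsi_base;  /* first gsi of this ioapic */
    uint32_t gsi_end;   /* last gsi of this ioapic */
};

struct ioapic_table {
    struct ioapic_gsi gsi[MAX_IOAPICS];
    unsigned int count;
    uint32_t gsi_end;   /* highest gsi across all registered ioapics */
};

/* Register one ioapic; returns its index, or -1 if it cannot be added. */
static int register_ioapic(struct ioapic_table *t, uint32_t gsi_base,
                           unsigned int entries)
{
    unsigned int idx;

    if (t->count >= MAX_IOAPICS || entries == 0)
        return -1;

    idx = t->count++;
    t->gsi[idx].gsi_base = gsi_base;
    t->gsi[idx].gsi_end = gsi_base + entries - 1;

    /* Keep the global value current so later code can treat it as
     * a constant once registration has finished. */
    if (t->gsi[idx].gsi_end > t->gsi_end)
        t->gsi_end = t->gsi[idx].gsi_end;

    return (int)idx;
}
```

Because the table itself tracks the maximum, callers have no reason to keep a separate running total of gsis.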
    • E
      x86, ioapic: Fix the types of gsi values · eddb0c55
      Eric W. Biederman committed
      This patch fixes the types of the gsi_base and gsi_end values in
      struct mp_ioapic_gsi, and the gsi parameter of mp_find_ioapic
      and mp_find_ioapic_pin.
      
      A gsi is canonically a u32, not an int.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-8-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      eddb0c55
    • E
      x86, ioapic: Fix io_apic_redir_entries to return the number of entries. · 4b6b19a1
      Eric W. Biederman committed
      io_apic_redir_entries has a huge conceptual bug.  It returns the maximum
      redirection entry, not the number of redirection entries, which simply
      does not match the name of the function.  This just caught me,
      and it caught Feng Tang and Len Brown when they wrote sfi_parse_ioapic.
      
      Modify io_apic_redir_entries to actually return the number of redirection
      entries, and fix the callers so that they properly handle receiving the
      number of redirection table entries, instead of the
      number of redirection table entries less one.
      
      While the usage in sfi.c does not show up in this patch it is fixed
      by virtue of the fact that io_apic_redir_entries now has the semantics
      sfi_parse_ioapic most reasonably expects.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-7-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      4b6b19a1
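The off-by-one is easy to see in a sketch. The I/O APIC version register carries the maximum redirection entry index in bits 16 to 23, so the number of entries is that value plus one. The function names below are illustrative, and the register value is passed in rather than read from hardware.

```c
#include <stdint.h>

/* Highest valid redirection entry index, as the hardware reports it
 * in bits 16..23 of the version register. */
static unsigned int ioapic_max_redir_entry(uint32_t version_reg)
{
    return (version_reg >> 16) & 0xff;
}

/* Number of redirection entries: indices run from 0 to the maximum,
 * so the count is one more than the maximum index. */
static unsigned int ioapic_redir_entries(uint32_t version_reg)
{
    return ioapic_max_redir_entry(version_reg) + 1;
}
```

For a version register value of 0x00170011 the maximum index is 23 and the count is 24; a caller that treats the first value as a count silently loses the last pin.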
    • E
      x86, ioapic: Only export mp_find_ioapic and mp_find_ioapic_pin in io_apic.h · 9638fa52
      Eric W. Biederman committed
      Multiple declarations of the same function in different headers
      are a pain to maintain.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-6-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      9638fa52
    • E
      x86, acpi/irq: Generalize mp_config_acpi_legacy_irqs · 0fd52670
      Eric W. Biederman committed
      Remove the assumption that there is not an override for isa irq 0.
      Instead look up the gsi, and from that look up the ioapic and pin of each
      isa irq individually.
      
      In general this should not have any behavioural effect, but in
      perverse cases this gets all of the details correct, instead of
      doing something weird.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-5-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      0fd52670
    • E
      x86, acpi/irq: Fix acpi_sci_ioapic_setup so it has both bus_irq and gsi · 9d2062b8
      Eric W. Biederman committed
      Currently acpi_sci_ioapic_setup calls mp_override_legacy_irq with
      bus_irq == gsi, which is wrong if we are coming from an override.
      Instead pass the bus_irq into acpi_sci_ioapic_setup.
      
      This fix was inspired by a similar fix from:
      Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-4-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      9d2062b8
    • E
      x86, acpi/irq: Teach acpi_get_override_irq to take a gsi not an isa_irq · 9a0a91bb
      Eric W. Biederman committed
      In perverse acpi implementations the isa irqs are not identity mapped
      to the first 16 gsis.  Furthermore at least the extended interrupt
      resource capability may return gsi's and not isa irqs.  So since
      what we get from acpi is a gsi, teach acpi_get_override_irq to
      operate on a gsi instead of an isa_irq.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-2-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      9a0a91bb
    • E
      x86, acpi/irq: Introduce acpi_isa_irq_to_gsi · 2c2df841
      Eric W. Biederman committed
      There are a number of cases where the current code makes the assumption
      that isa irqs identity map to the first 16 acpi global system interrupts.
      In most instances that assumption is correct, as that is the required
      behaviour in dual i8259 mode and the default behavior in ioapic mode.
      
      However there are some systems out there that take advantage of ACPI's
      interrupt remapping for the isa irqs to have a completely different
      mapping of isa_irq to gsi.
      
      Introduce acpi_isa_irq_to_gsi to perform this mapping explicitly in the
      code that needs it.  Initially this will be just the current assumed
      identity mapping to ensure its introduction does not cause regressions.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1269936436-7039-1-git-send-email-ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      2c2df841
  2. 29 April, 2010 1 commit
  3. 27 April, 2010 1 commit
  4. 25 April, 2010 1 commit
    • D
      VMware Balloon driver · 453dc659
      Dmitry Torokhov committed
      This is a standalone version of the VMware Balloon driver.  Ballooning is a
      technique that allows the hypervisor to dynamically limit the amount of memory
      available to the guest (with guest cooperation).  In the overcommit
      scenario, when the hypervisor detects that it needs to shuffle some
      memory, it instructs the driver to allocate a certain number of pages, and
      the underlying memory gets returned to the hypervisor.  Later the hypervisor
      may return memory to the guest by reattaching memory to the pageframes and
      instructing the driver to "deflate" the balloon.
      
      We are submitting a standalone driver because KVM maintainer (Avi Kivity)
      expressed opinion (rightly) that our transport does not fit well into
      virtqueue paradigm and thus it does not make much sense to integrate with
      virtio.
      
      There were also some concerns whether current ballooning technique is the
      right thing.  If there appears a better framework to achieve this we are
      prepared to evaluate and switch to using it, but in the meantime we'd like
      to get this driver upstream.
      
      We want to get the driver accepted in distributions so that users do not
      have to deal with an out-of-tree module and many distributions have
      "upstream first" requirement.
      
      The driver has been shipping for a number of years and users running on
      VMware platform will have it installed as part of VMware Tools even if it
      will not come from a distribution, thus there should not be additional
      risk in pulling the driver into mainline.  The driver will only activate
      if host is VMware so everyone else should not be affected at all.
      Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      453dc659
  5. 24 April, 2010 2 commits
    • H
      x86: Disable large pages on CPUs with Atom erratum AAE44 · 7a0fc404
      H. Peter Anvin committed
      Atom erratum AAE44/AAF40/AAG38/AAH41:
      
      "If software clears the PS (page size) bit in a present PDE (page
      directory entry), that will cause linear addresses mapped through this
      PDE to use 4-KByte pages instead of using a large page after old TLB
      entries are invalidated. Due to this erratum, if a code fetch uses
      this PDE before the TLB entry for the large page is invalidated then
      it may fetch from a different physical address than specified by
      either the old large page translation or the new 4-KByte page
      translation. This erratum may also cause speculative code fetches from
      incorrect addresses."
      
      [http://download.intel.com/design/processor/specupdt/319536.pdf]
      
      Whereas commit 211b3d03 seems to
      work around erratum AAH41 (mixed 4K TLBs), it only reduces the window of
      opportunity for the bug to occur and does not totally remove it.  This
      patch disables mixed 4K/4MB page tables totally, avoiding the page
      splitting and not tripping this processor issue.
      
      This is based on an original patch by Colin King.
      Originally-by: Colin Ian King <colin.king@canonical.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com>
      Cc: <stable@kernel.org>
      7a0fc404
    • H
      x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero · 7ce5a2b9
      H. Peter Anvin committed
      When we do a thread switch, we clear the outgoing FS/GS base if the
      corresponding selector is nonzero.  This is taken by __switch_to() as
      an entry invariant; it does not verify that it is true on entry.
      However, copy_thread() doesn't enforce this constraint, which can
      result in inconsistent results after fork().
      
      Make copy_thread() match the behavior of __switch_to().
      Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4BD1E061.8030605@zytor.com>
      Cc: <stable@kernel.org>
      7ce5a2b9
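The invariant can be sketched as below: whenever a selector is nonzero, the saved 64-bit base for that segment is cleared, and fork applies the same rule as a thread switch. `struct seg_state` and `copy_seg_state` are hypothetical names, and the live selectors are passed in as parameters here instead of being read from the CPU.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a thread's saved FS/GS state. */
struct seg_state {
    uint16_t fsindex;  /* FS selector */
    uint16_t gsindex;  /* GS selector */
    uint64_t fs;       /* 64-bit FS base, kept only when fsindex == 0 */
    uint64_t gs;       /* 64-bit GS base, kept only when gsindex == 0 */
};

/* Build the child's state on fork so that it satisfies the invariant
 * the thread-switch path relies on: nonzero selector means zero base. */
static void copy_seg_state(struct seg_state *child,
                           const struct seg_state *parent,
                           uint16_t live_fs_sel, uint16_t live_gs_sel)
{
    child->fsindex = live_fs_sel;
    child->fs = live_fs_sel ? 0 : parent->fs;

    child->gsindex = live_gs_sel;
    child->gs = live_gs_sel ? 0 : parent->gs;
}
```

Copying the parent's base unconditionally would leave the child with a stale base next to a nonzero selector, which is the inconsistency the commit message describes.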
  6. 23 April, 2010 1 commit
  7. 21 April, 2010 3 commits
  8. 20 April, 2010 7 commits
  9. 14 April, 2010 1 commit
    • R
      lguest: stop using KVM hypercall mechanism · 091ebf07
      Rusty Russell committed
      This is a partial revert of 4cd8b5e2 "lguest: use KVM hypercalls";
      we revert to using (just as questionable but more reliable) int $15 for
      hypercalls.  I didn't revert the register mapping, so we still use the
      same calling convention as kvm.
      
      KVM in more recent incarnations stopped injecting a fault when a guest
      tried to use the VMCALL instruction from ring 1, so lguest under kvm
      fails to make hypercalls.  It was nice to share code with our KVM
      cousins, but this was overreach.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Matias Zabaljauregui <zabaljauregui@gmail.com>
      Cc: Avi Kivity <avi@redhat.com>
      091ebf07
  10. 09 April, 2010 2 commits
    • F
      perf: Fix unsafe frame rewinding with hot regs fetching · ab285f2b
      Frederic Weisbecker committed
      When we fetch the hot regs and rewind to the nth caller, it
      might happen that we dereference a frame pointer outside the
      kernel stack boundaries, like in this example:
      
              perf_trace_sched_switch+0xd5/0x120
              schedule+0x6b5/0x860
              retint_careful+0xd/0x21
      
      Since we directly dereference a userspace frame pointer here while
      rewinding behind retint_careful, this may end up in a crash.
      
      Fix this by simply using probe_kernel_address() when we rewind the
      frame pointer.
      
      This issue will have a much more proper fix in the next version of the
      perf_arch_fetch_caller_regs() API that will only need to rewind to the
      first caller.
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Archs <linux-arch@vger.kernel.org>
      ab285f2b
    • B
      x86/PCI: ignore Consumer/Producer bit in ACPI window descriptions · 73a0e614
      Bjorn Helgaas committed
      ACPI Address Space Descriptors (used in _CRS) have a Consumer/Producer
      bit that is supposed to distinguish regions that are consumed directly
      by a device from those that are forwarded ("produced") by a bridge.
      But BIOSes have apparently not used this consistently, and Windows
      seems to ignore it, so I think Linux should ignore it as well.
      
      I can't point to any of these supposed broken BIOSes, but since we
      now rely on _CRS by default, I think it's safer to ignore this bit
      from the start.
      
      Here are details of my experiments with how Windows handles it:
          https://bugzilla.kernel.org/show_bug.cgi?id=15701
      Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      73a0e614
  11. 07 April, 2010 5 commits
  12. 06 April, 2010 1 commit
    • V
      perf, x86: Enable Nehalem-EX support · 134fbadf
      Vince Weaver committed
      According to the Intel Software Developer's Manual, Volume 3B, the
      Nehalem-EX PMU is just like regular Nehalem (except for the
      uncore support, which is completely different).
      Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      134fbadf
  13. 03 April, 2010 2 commits
    • S
      x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards · 472a474c
      Suresh Siddha committed
      Jan Grossmann reported a kernel boot panic while booting an SMP
      kernel on his system with a single core cpu. SMP kernels call
      enable_IR_x2apic() from native_smp_prepare_cpus(), and on
      platforms where the kernel doesn't find an SMP configuration we
      ended up calling enable_IR_x2apic() again from the
      APIC_init_uniprocessor() call in smp_sanity_check(), thus
      leading to a kernel panic.
      
      Don't call enable_IR_x2apic() and default_setup_apic_routing()
      from APIC_init_uniprocessor() in CONFIG_SMP case.
      
      NOTE: this kind of non-idempotent and asymmetric initialization
      sequence is rather fragile and unclean, we'll clean that up
      in v2.6.35. This is the minimal fix for v2.6.34.
      
      Reported-by: Jan.Grossmann@kielnet.net
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: <jbarnes@virtuousgeek.org>
      Cc: <david.woodhouse@intel.com>
      Cc: <weidong.han@intel.com>
      Cc: <youquan.song@intel.com>
      Cc: <Jan.Grossmann@kielnet.net>
      Cc: <stable@kernel.org> # [v2.6.32.x, v2.6.33.x]
      LKML-Reference: <1270083887.7835.78.camel@sbs-t61.sc.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      472a474c
    • T
      perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernels · 257ef9d2
      Torok Edwin committed
      When profiling a 32-bit process on a 64-bit kernel, callgraph tracing
      stopped after the first function, because it saw a garbage memory
      address (it tried to interpret the frame pointer and return address as
      64-bit pointers).
      
      Fix this by using a struct stack_frame with 32-bit pointers when the
      TIF_IA32 flag is set.
      
      Note that TIF_IA32 flag must be used, and not is_compat_task(), because
      the latter is only set when the 32-bit process is executing a syscall,
      which may not always be the case (when tracing page fault events for
      example).
      Signed-off-by: Török Edwin <edwintorok@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: x86@kernel.org
      Cc: linux-kernel@vger.kernel.org
      LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      257ef9d2
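A userspace sketch of the idea, assuming a frame layout of two 32-bit fields (saved frame pointer, then return address). The user address space is simulated with a byte buffer and addresses are offsets into it; `read_frame32` and `walk_frames32` are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Frame layout of a 32-bit process: both fields are 4 bytes wide. */
struct stack_frame_ia32 {
    uint32_t next_frame;      /* caller's saved frame pointer */
    uint32_t return_address;  /* return address into the caller */
};

/* Copy one frame out of the simulated address space.
 * Returns 0 on success, -1 if the frame would fall outside it. */
static int read_frame32(const uint8_t *mem, size_t mem_size, uint32_t fp,
                        struct stack_frame_ia32 *out)
{
    if (fp > mem_size || mem_size - fp < sizeof(*out))
        return -1;
    memcpy(out, mem + fp, sizeof(*out));
    return 0;
}

/* Follow the frame-pointer chain, storing return addresses in ips.
 * Returns the number of entries recorded. */
static size_t walk_frames32(const uint8_t *mem, size_t mem_size, uint32_t fp,
                            uint32_t *ips, size_t max_ips)
{
    size_t n = 0;

    while (fp != 0 && n < max_ips) {
        struct stack_frame_ia32 frame;

        if (read_frame32(mem, mem_size, fp, &frame) != 0)
            break;
        ips[n++] = frame.return_address;

        /* The stack grows down, so a caller's frame must sit at a
         * higher address; anything else ends the walk. */
        if (frame.next_frame <= fp)
            break;
        fp = frame.next_frame;
    }
    return n;
}
```

Reading the same 8 bytes as one 64-bit pointer would fuse the saved frame pointer and the return address into a single bogus address, which is why the walk described above stopped after the first function.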