1. 24 2月, 2010 4 次提交
    • J
      x86/PCI: Moorestown PCI support · a712ffbc
      Jesse Barnes 提交于
      The Moorestown platform only has a few devices that actually support
      PCI config cycles.  The rest of the devices use an in-RAM MCFG space
      for the purposes of device enumeration and initialization.
      
      There are a few uglies in the fake support, like BAR sizes that aren't
      a power of two, sizing detection, and writes to the real devices, but
      other than that it's pretty straightforward.
      
      Another way to think of this is not really as PCI at all, but just a
      table in RAM describing which devices are present, their capabilities
      and their offsets in MMIO space.  This could have been done with a
      special new firmware table on this platform, but given that we do have
      some real PCI devices too, simply describing things in an MCFG type
      space was pretty simple.
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80D08@orsmsx508.amr.corp.intel.com>
      Signed-off-by: NJacob Pan <jacob.jun.pan@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      a712ffbc
    • J
      x86, ioapic: Add dummy ioapic functions · 4966e1af
      Jacob Pan 提交于
      Some ioapic extern functions are used when CONFIG_X86_IO_APIC is not
      defined.  We need the dummy functions to avoid a compile time error.
      Signed-off-by: NJacob Pan <jacob.jun.pan@intel.com>
      LKML-Reference: <43F901BD926A4E43B106BF17856F0755A318DA07@orsmsx508.amr.corp.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      4966e1af
    • J
      x86, ioapic: Early enable ioapic for timer irq · 05ddafb1
      Jacob Pan 提交于
      Moorestown platform needs apic ready early for the system timer irq
      which is delievered via ioapic.  Should not impact other platforms.
      
      In the longer term, once ioapic setup is moved before late time init,
      we will not need this patch to do early apic enabling.
      Signed-off-by: NJacob Pan <jacob.jun.pan@intel.com>
      LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80D07@orsmsx508.amr.corp.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      05ddafb1
    • J
      x86, pic: Fix section mismatch in legacy pic · 28a3c93d
      Jacob Pan 提交于
      Move legacy_pic chip dummy functions out of init section as they might
      be referenced at run time.
      Signed-off-by: NJacob Pan <jacob.jun.pan@intel.com>
      LKML-Reference: <43F901BD926A4E43B106BF17856F0755A318D3AA@orsmsx508.amr.corp.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      28a3c93d
  2. 20 2月, 2010 8 次提交
  3. 19 2月, 2010 1 次提交
    • B
      x86, irq: Keep chip_data in create_irq_nr and destroy_irq · eb5b3794
      Brandon Philips 提交于
      Version 4: use get_irq_chip_data() in destroy_irq() to get rid of some
      local vars.
      
      When two drivers are setting up MSI-X at the same time via
      pci_enable_msix() there is a race.  See this dmesg excerpt:
      
      [   85.170610] ixgbe 0000:02:00.1: irq 97 for MSI/MSI-X
      [   85.170611]   alloc irq_desc for 99 on node -1
      [   85.170613] igb 0000:08:00.1: irq 98 for MSI/MSI-X
      [   85.170614]   alloc kstat_irqs on node -1
      [   85.170616] alloc irq_2_iommu on node -1
      [   85.170617]   alloc irq_desc for 100 on node -1
      [   85.170619]   alloc kstat_irqs on node -1
      [   85.170621] alloc irq_2_iommu on node -1
      [   85.170625] ixgbe 0000:02:00.1: irq 99 for MSI/MSI-X
      [   85.170626]   alloc irq_desc for 101 on node -1
      [   85.170628] igb 0000:08:00.1: irq 100 for MSI/MSI-X
      [   85.170630]   alloc kstat_irqs on node -1
      [   85.170631] alloc irq_2_iommu on node -1
      [   85.170635]   alloc irq_desc for 102 on node -1
      [   85.170636]   alloc kstat_irqs on node -1
      [   85.170639] alloc irq_2_iommu on node -1
      [   85.170646] BUG: unable to handle kernel NULL pointer dereference
      at 0000000000000088
      
      As you can see igb and ixgbe are both alternating on create_irq_nr()
      via pci_enable_msix() in their probe function.
      
      ixgbe: While looping through irq_desc_ptrs[] via create_irq_nr() ixgbe
      choses irq_desc_ptrs[102] and exits the loop, drops vector_lock and
      calls dynamic_irq_init. Then it sets irq_desc_ptrs[102]->chip_data =
      NULL via dynamic_irq_init().
      
      igb: Grabs the vector_lock now and starts looping over irq_desc_ptrs[]
      via create_irq_nr(). It gets to irq_desc_ptrs[102] and does this:
      
      	cfg_new = irq_desc_ptrs[102]->chip_data;
      	if (cfg_new->vector != 0)
      		continue;
      
      This hits the NULL deref.
      
      Another possible race exists via pci_disable_msix() in a driver or in
      the number of error paths that call free_msi_irqs():
      
      destroy_irq()
      dynamic_irq_cleanup() which sets desc->chip_data = NULL
      ...race window...
      desc->chip_data = cfg;
      
      Remove the save and restore code for cfg in create_irq_nr() and
      destroy_irq() and take the desc->lock when checking the irq_cfg.
      Reported-and-analyzed-by: NBrandon Philips <bphilips@suse.de>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20100207210250.GB8256@jenkins.home.ifup.org>
      Signed-off-by: NBrandon Phiilps <bphilips@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      eb5b3794
  4. 18 2月, 2010 2 次提交
  5. 17 2月, 2010 4 次提交
  6. 11 2月, 2010 3 次提交
    • B
      x86: Avoid race condition in pci_enable_msix() · ced5b697
      Brandon Phiilps 提交于
      Keep chip_data in create_irq_nr and destroy_irq.
      
      When two drivers are setting up MSI-X at the same time via
      pci_enable_msix() there is a race.  See this dmesg excerpt:
      
      [   85.170610] ixgbe 0000:02:00.1: irq 97 for MSI/MSI-X
      [   85.170611]   alloc irq_desc for 99 on node -1
      [   85.170613] igb 0000:08:00.1: irq 98 for MSI/MSI-X
      [   85.170614]   alloc kstat_irqs on node -1
      [   85.170616] alloc irq_2_iommu on node -1
      [   85.170617]   alloc irq_desc for 100 on node -1
      [   85.170619]   alloc kstat_irqs on node -1
      [   85.170621] alloc irq_2_iommu on node -1
      [   85.170625] ixgbe 0000:02:00.1: irq 99 for MSI/MSI-X
      [   85.170626]   alloc irq_desc for 101 on node -1
      [   85.170628] igb 0000:08:00.1: irq 100 for MSI/MSI-X
      [   85.170630]   alloc kstat_irqs on node -1
      [   85.170631] alloc irq_2_iommu on node -1
      [   85.170635]   alloc irq_desc for 102 on node -1
      [   85.170636]   alloc kstat_irqs on node -1
      [   85.170639] alloc irq_2_iommu on node -1
      [   85.170646] BUG: unable to handle kernel NULL pointer dereference
      at 0000000000000088
      
      As you can see igb and ixgbe are both alternating on create_irq_nr()
      via pci_enable_msix() in their probe function.
      
      ixgbe: While looping through irq_desc_ptrs[] via create_irq_nr() ixgbe
      choses irq_desc_ptrs[102] and exits the loop, drops vector_lock and
      calls dynamic_irq_init. Then it sets irq_desc_ptrs[102]->chip_data =
      NULL via dynamic_irq_init().
      
      igb: Grabs the vector_lock now and starts looping over irq_desc_ptrs[]
      via create_irq_nr(). It gets to irq_desc_ptrs[102] and does this:
      
      	cfg_new = irq_desc_ptrs[102]->chip_data;
      	if (cfg_new->vector != 0)
      		continue;
      
      This hits the NULL deref.
      
      Another possible race exists via pci_disable_msix() in a driver or in
      the number of error paths that call free_msi_irqs():
      
      destroy_irq()
      dynamic_irq_cleanup() which sets desc->chip_data = NULL
      ...race window...
      desc->chip_data = cfg;
      
      Remove the save and restore code for cfg in create_irq_nr() and
      destroy_irq() and take the desc->lock when checking the irq_cfg.
      Reported-and-analyzed-by: NBrandon Philips <bphilips@suse.de>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1265793639-15071-3-git-send-email-yinghai@kernel.org>
      Signed-off-by: NBrandon Phililps <bphilips@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      ced5b697
    • Y
      x86: Fix SCI on IOAPIC != 0 · 18dce6ba
      Yinghai Lu 提交于
      Thomas Renninger <trenn@suse.de> reported on IBM x3330
      
      booting a latest kernel on this machine results in:
      
      PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
      PCI: Using configuration type 1 for base access bio: create slab <bio-0> at 0
      ACPI: SCI (IRQ30) allocation failed
      ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
      ACPI: Unable to start the ACPI Interpreter
      
      Later all kind of devices fail...
      
      and bisect it down to this commit:
      commit b9c61b70
      
          x86/pci: update pirq_enable_irq() to setup io apic routing
      
      it turns out we need to set irq routing for the sci on ioapic1 early.
      
      -v2: make it work without sparseirq too.
      -v3: fix checkpatch.pl warning, and cc to stable
      Reported-by: NThomas Renninger <trenn@suse.de>
      Bisected-by: NThomas Renninger <trenn@suse.de>
      Tested-by: NThomas Renninger <trenn@suse.de>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1265793639-15071-2-git-send-email-yinghai@kernel.org>
      Cc: stable@kernel.org
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      18dce6ba
    • J
      x86, ia32_aout: do not kill argument mapping · 318f6b22
      Jiri Slaby 提交于
      Do not set current->mm->mmap to NULL in 32-bit emulation on 64-bit
      load_aout_binary after flush_old_exec as it would destroy already
      set brpm mapping with arguments.
      
      Introduced by b6a2fea3
      mm: variable length argument support
      where the argument mapping in bprm was added.
      
      [ hpa: this is a regression from 2.6.22... time to kill a.out? ]
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      LKML-Reference: <1265831716-7668-1-git-send-email-jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ollie Wild <aaw@google.com>
      Cc: x86@kernel.org
      Cc: <stable@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      318f6b22
  7. 10 2月, 2010 4 次提交
  8. 03 2月, 2010 2 次提交
  9. 30 1月, 2010 7 次提交
    • J
      perf, hw_breakpoint, kgdb: Do not take mutex for kernel debugger · 5352ae63
      Jason Wessel 提交于
      This patch fixes the regression in functionality where the
      kernel debugger and the perf API do not nicely share hw
      breakpoint reservations.
      
      The kernel debugger cannot use any mutex_lock() calls because it
      can start the kernel running from an invalid context.
      
      A mutex free version of the reservation API needed to get
      created for the kernel debugger to safely update hw breakpoint
      reservations.
      
      The possibility for a breakpoint reservation to be concurrently
      processed at the time that kgdb interrupts the system is
      improbable. Should this corner case occur the end user is
      warned, and the kernel debugger will prohibit updating the
      hardware breakpoint reservations.
      
      Any time the kernel debugger reserves a hardware breakpoint it
      will be a system wide reservation.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: torvalds@linux-foundation.org
      LKML-Reference: <1264719883-7285-3-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5352ae63
    • J
      x86, hw_breakpoints, kgdb: Fix kgdb to use hw_breakpoint API · cc096749
      Jason Wessel 提交于
      In the 2.6.33 kernel, the hw_breakpoint API is now used for the
      performance event counters.  The hw_breakpoint_handler() now
      consumes the hw breakpoints that were previously set by kgdb
      arch specific code.  In order for kgdb to work in conjunction
      with this core API change, kgdb must use some of the low level
      functions of the hw_breakpoint API to install, uninstall, and
      deal with hw breakpoint reservations.
      
      The kgdb core required a change to call kgdb_disable_hw_debug
      anytime a slave cpu enters kgdb_wait() in order to keep all the
      hw breakpoints in sync as well as to prevent hitting a hw
      breakpoint while kgdb is active.
      
      During the architecture specific initialization of kgdb, it will
      pre-allocate 4 disabled (struct perf event **) structures.  Kgdb
      will use these to manage the capabilities for the 4 hw
      breakpoint registers, per cpu.  Right now the hw_breakpoint API
      does not have a way to ask how many breakpoints are available,
      on each CPU so it is possible that the install of a breakpoint
      might fail when kgdb restores the system to the run state.  The
      intent of this patch is to first get the basic functionality of
      hw breakpoints working and leave it to the person debugging the
      kernel to understand what hw breakpoints are in use and what
      restrictions have been imposed as a result.  Breakpoint
      constraints will be dealt with in a future patch.
      
      While atomic, the x86 specific kgdb code will call
      arch_uninstall_hw_breakpoint() and arch_install_hw_breakpoint()
      to manage the cpu specific hw breakpoints.
      
      The net result of these changes allow kgdb to use the same pool
      of hw_breakpoints that are used by the perf event API, but
      neither knows about future reservations for the available hw
      breakpoint slots.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: torvalds@linux-foundation.org
      LKML-Reference: <1264719883-7285-2-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cc096749
    • D
      x86: Add quirk for Intel DG45FC board to avoid low memory corruption · 7c099ce1
      David Härdeman 提交于
      Commit 6aa542a6 added a quirk for the
      Intel DG45ID board due to low memory corruption. The Intel DG45FC
      shares the same BIOS (and the same bug) as noted in:
      
        http://bugzilla.kernel.org/show_bug.cgi?id=13736Signed-off-by: NDavid Härdeman <david@hardeman.nu>
      LKML-Reference: <20100128200254.GA9134@hardeman.nu>
      Cc: <stable@kernel.org>
      Cc: Alexey Fisher <bug-track@fisher-privat.net>
      Cc: ykzhao <yakui.zhao@intel.com>
      Cc: Tony Bones <aabonesml@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      7c099ce1
    • S
      x86, irq: Move __setup_vector_irq() before the first irq enable in cpu online path · 9d133e5d
      Suresh Siddha 提交于
      Lowest priority delivery of logical flat mode is broken on some systems,
      such that even when IO-APIC RTE says deliver the interrupt to a particular CPU,
      interrupt subsystem delivers the interrupt to totally different CPU.
      
      For example, this behavior was observed on a P4 based system with SiS chipset
      which was reported by Li Zefan. We have been handling this kind of behavior by
      making sure that in logical flat mode, we assign the same vector to irq
      mappings on all the 8 possible logical cpu's.
      
      But we have been doing this initial assignment (__setup_vector_irq()) a little
      late (before which interrupts were already enabled for a short duration).
      
      Move the __setup_vector_irq() before the first irq enable point in the
      cpu online path to avoid the issue of not handling some interrupts that
      wrongly hit the cpu which is still coming online.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100129194330.283696385@sbs-t61.sc.intel.com>
      Tested-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      9d133e5d
    • S
      x86, irq: Update the vector domain for legacy irqs handled by io-apic · 69c89efb
      Suresh Siddha 提交于
      In the recent change of not reserving IRQ0_VECTOR..IRQ15_VECTOR's on all
      cpu's, we start with irq 0..15 getting directed to (and handled on) cpu-0.
      
      In the logical flat mode, once the AP's are online (and before irqbalance
      comes into picture), kernel intends to handle these IRQ's on any cpu (as the
      logical flat mode allows to specify multiple cpu's for the irq destination and
      the chipset based routing can deliver to the interrupt to any one of
      the specified cpu's). This was broken with our recent change, which was ending
      up using only cpu 0 as the destination, even when the kernel was specifying to
      use all online cpu's for the logical flat mode case.
      
      Fix this by updating vector allocation domain (cfg->domain) for legacy irqs,
      when the IO-APIC handles them.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100129194330.207790269@sbs-t61.sc.intel.com>
      Tested-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      69c89efb
    • H
      x86: get rid of the insane TIF_ABI_PENDING bit · 05d43ed8
      H. Peter Anvin 提交于
      Now that the previous commit made it possible to do the personality
      setting at the point of no return, we do just that for ELF binaries.
      And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
      away, and we can just make SET_PERSONALITY() just do the obvious thing
      for a 32-bit compat process.
      
      Everything becomes much more straightforward this way.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      05d43ed8
    • L
      Split 'flush_old_exec' into two functions · 221af7f8
      Linus Torvalds 提交于
      'flush_old_exec()' is the point of no return when doing an execve(), and
      it is pretty badly misnamed.  It doesn't just flush the old executable
      environment, it also starts up the new one.
      
      Which is very inconvenient for things like setting up the new
      personality, because we want the new personality to affect the starting
      of the new environment, but at the same time we do _not_ want the new
      personality to take effect if flushing the old one fails.
      
      As a result, the x86-64 '32-bit' personality is actually done using this
      insane "I'm going to change the ABI, but I haven't done it yet" bit
      (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
      personality, but just the "pending" bit, so that "flush_thread()" can do
      the actual personality magic.
      
      This patch in no way changes any of that insanity, but it does split the
      'flush_old_exec()' function up into a preparatory part that can fail
      (still called flush_old_exec()), and a new part that will actually set
      up the new exec environment (setup_new_exec()).  All callers are changed
      to trivially comply with the new world order.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      221af7f8
  10. 29 1月, 2010 1 次提交
  11. 28 1月, 2010 1 次提交
  12. 27 1月, 2010 2 次提交
  13. 25 1月, 2010 1 次提交