1. 02 8月, 2011 1 次提交
    • J
      PCI: Set PCI-E Max Payload Size on fabric · b03e7495
      Jon Mason 提交于
      On a given PCI-E fabric, each device, bridge, and root port can have a
      different PCI-E maximum payload size.  There is a sizable performance
      boost for having the largest possible maximum payload size on each PCI-E
      device.  However, if improperly configured, fatal bus errors can occur.
      Thus, it is important to ensure that PCI-E payloads sends by a device
      are never larger than the MPS setting of all devices on the way to the
      destination.
      
      This can be achieved two ways:
      
      - A conservative approach is to use the smallest common denominator of
        the entire tree below a root complex for every device on that fabric.
      
      This means for example that having a 128 bytes MPS USB controller on one
      leg of a switch will dramatically reduce performances of a video card or
      10GE adapter on another leg of that same switch.
      
      It also means that any hierarchy supporting hotplug slots (including
      expresscard or thunderbolt I suppose, dbl check that) will have to be
      entirely clamped to 128 bytes since we cannot predict what will be
      plugged into those slots, and we cannot change the MPS on a "live"
      system.
      
      - A more optimal way is possible, if it falls within a couple of
        constraints:
      * The top-level host bridge will never generate packets larger than the
        smallest TLP (or if it can be controlled independently from its MPS at
        least)
      * The device will never generate packets larger than MPS (which can be
        configured via MRRS)
      * No support of direct PCI-E <-> PCI-E transfers between devices without
        some additional code to specifically deal with that case
      
      Then we can use an approach that basically ignores downstream requests
      and focuses exclusively on upstream requests. In that case, all we need
      to care about is that a device MPS is no larger than its parent MPS,
      which allows us to keep all switches/bridges to the max MPS supported by
      their parent and eventually the PHB.
      
      In this case, your USB controller would no longer "starve" your 10GE
      Ethernet and your hotplug slots won't affect your global MPS.
      Additionally, the hotplugged devices themselves can be configured to a
      larger MPS up to the value configured in the hotplug bridge.
      
      To choose between the two available options, two PCI kernel boot args
      have been added to the PCI calls.  "pcie_bus_safe" will provide the
      former behavior, while "pcie_bus_perf" will perform the latter behavior.
      By default, the latter behavior is used.
      
      NOTE: due to the location of the enablement, each arch will need to add
      calls to this function.  This patch only enables x86.
      
      This patch includes a number of changes recommended by Benjamin
      Herrenschmidt.
      
      Tested-by: Jordan_Hargrave@dell.com
      Signed-off-by: NJon Mason <mason@myri.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      b03e7495
  2. 22 7月, 2011 1 次提交
  3. 02 6月, 2011 1 次提交
    • M
      x86/PCI/ACPI: fix type mismatch · 6e33a852
      Márton Németh 提交于
      The flags field of struct resource from linux/ioport.h is "unsigned
      long". Change the "type" parameter of coalesce_windows() function to
      match that field. This fixes the following warning messages when
      compiling with "make C=1 W=1 bzImage modules":
      
      arch/x86/pci/acpi.c: In function ‘coalesce_windows’:
      arch/x86/pci/acpi.c:198: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result
      arch/x86/pci/acpi.c:203: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result
      Signed-off-by: NMárton Németh <nm127@freemail.hu>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      6e33a852
  4. 12 11月, 2010 1 次提交
    • B
      x86/PCI: coalesce overlapping host bridge windows · 4723d0f2
      Bjorn Helgaas 提交于
      Some BIOSes provide PCI host bridge windows that overlap, e.g.,
      
          pci_root PNP0A03:00: host bridge window [mem 0xb0000000-0xffffffff]
          pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xdfffffff]
          pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xffffffff]
      
      If we simply insert these as children of iomem_resource, the second window
      fails because it conflicts with the first, and the third is inserted as a
      child of the first, i.e.,
      
          b0000000-ffffffff PCI Bus 0000:00
            f0000000-ffffffff PCI Bus 0000:00
      
      When we claim PCI device resources, this can cause collisions like this
      if we put them in the first window:
      
          pci 0000:00:01.0: address space collision: [mem 0xff300000-0xff4fffff] conflicts with PCI Bus 0000:00 [mem 0xf0000000-0xffffffff]
      
      Host bridge windows are top-level resources by definition, so it doesn't
      make sense to make the third window a child of the first.  This patch
      coalesces any host bridge windows that overlap.  For the example above,
      the result is this single window:
      
          pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xffffffff]
      
      This fixes a 2.6.34 regression.
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=17011Reported-and-tested-by: NAnisse Astier <anisse@astier.eu>
      Reported-and-tested-by: NPramod Dematagoda <pmd.lotr.gandalf@gmail.com>
      Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      4723d0f2
  5. 31 7月, 2010 1 次提交
    • B
      x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN · 2491762c
      Bjorn Helgaas 提交于
      This DMI quirk turns on "pci=use_crs" for the ALiveSATA2-GLAN because
      amd_bus.c doesn't handle this system correctly.
      
      The system has a single HyperTransport I/O chain, but has two PCI host
      bridges to buses 00 and 80.  amd_bus.c learns the MMIO range associated
      with buses 00-ff and that this range is routed to the HT chain hosted at
      node 0, link 0:
      
          bus: [00, ff] on node 0 link 0
          bus: 00 index 1 [mem 0x80000000-0xfcffffffff]
      
      This includes the address space for both bus 00 and bus 80, and amd_bus.c
      assumes it's all routed to bus 00.
      
      We find device 80:01.0, which BIOS left in the middle of that space, but
      we don't find a bridge from bus 00 to bus 80, so we conclude that 80:01.0
      is unreachable from bus 00, and we move it from the original, working,
      address to something outside the bus 00 aperture, which does not work:
      
          pci 0000:80:01.0: reg 10: [mem 0xfebfc000-0xfebfffff 64bit]
          pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit]
      
      The BIOS told us everything we need to know to handle this correctly,
      so we're better off if we just pay attention, which lets us leave the
      80:01.0 device at the original, working, address:
      
          ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
          pci_root PNP0A03:00: host bridge window [mem 0x80000000-0xff37ffff]
          ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])
          pci_root PNP0A08:00: host bridge window [mem 0xfebfc000-0xfebfffff]
      
      This was a regression between 2.6.33 and 2.6.34.  In 2.6.33, amd_bus.c
      was used only when we found multiple HT chains.  3e3da00c, which
      enabled amd_bus.c even on systems with a single HT chain, caused this
      failure.
      
      This quirk was written by Graham.  If we ever enable "pci=use_crs" for
      machines from 2006 or earlir, this quirk should be removed.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16007
      
      Cc: stable@kernel.org
      Reported-by: NGraham Ramsey <ramsey.graham@ntlworld.com>
      Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      2491762c
  6. 25 5月, 2010 1 次提交
  7. 29 4月, 2010 1 次提交
  8. 23 4月, 2010 1 次提交
  9. 09 4月, 2010 1 次提交
  10. 04 4月, 2010 1 次提交
  11. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  12. 26 3月, 2010 2 次提交
  13. 24 2月, 2010 2 次提交
  14. 20 2月, 2010 2 次提交
    • T
      x86: Add pci_init_irq to x86_init · ab3b3793
      Thomas Gleixner 提交于
      Moorestown wants to reuse pcibios_init_irq but needs to provide its
      own implementation of pci_enable_irq. After we distangled the init we
      can move the init_irq call to x86_init and remove the pci_enable_irq
      != NULL check in pcibios_init_irq. pci_enable_irq is compile time
      initialized to pirq_enable_irq and the special cases which override it
      (visws and acpi) set the x86_init function pointer to noop. That
      allows MSRT to override pci_enable_irq and otherwise run
      pcibios_init_irq unmodified.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80CFF@orsmsx508.amr.corp.intel.com>
      Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NJacob Pan <jacob.jun.pan@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      ab3b3793
    • T
      x86: Move pci init function to x86_init · b72d0db9
      Thomas Gleixner 提交于
      The PCI initialization in pci_subsys_init() is a mess. pci_numaq_init,
      pci_acpi_init, pci_visws_init and pci_legacy_init are called and each
      implementation checks and eventually modifies the global variable
      pcibios_scanned.
      
      x86_init functions allow us to do this more elegant. The pci.init
      function pointer is preset to pci_legacy_init. numaq, acpi and visws
      can modify the pointer in their early setup functions. The functions
      return 0 when they did the full initialization including bus scan. A
      non zero return value indicates that pci_legacy_init needs to be
      called either because the selected function failed or wants the
      generic bus scan in pci_legacy_init to happen (e.g. visws).
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80CFE@orsmsx508.amr.corp.intel.com>
      Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NJacob Pan <jacob.jun.pan@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      b72d0db9
  15. 07 11月, 2009 1 次提交
  16. 05 11月, 2009 5 次提交
  17. 01 7月, 2009 2 次提交
  18. 25 6月, 2009 1 次提交
  19. 17 6月, 2009 1 次提交
    • G
      x86/ACPI: Correct maximum allowed _CRS returned resources and warn if exceeded · f9cde5ff
      Gary Hade 提交于
      Issue a warning if _CRS returns too many resource descriptors to be
      accommodated by the fixed size resource array instances.  If there is no
      transparent bridge on the root bus "too many" is the
      PCI_BUS_NUM_RESOURCES size of the resource array.  Otherwise, the last 3
      slots of the resource array must be excluded making the maximum
      (PCI_BUS_NUM_RESOURCES - 3).
      
      The current code:
       - is silent when _CRS returns too many resource descriptors and
       - incorrectly allows use of the last 3 slots of the resource array
         for a root bus with a transparent bridge
      Signed-off-by: NGary Hade <garyhade@us.ibm.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      f9cde5ff
  20. 12 6月, 2009 1 次提交
  21. 08 1月, 2009 2 次提交
  22. 30 12月, 2008 1 次提交
  23. 24 7月, 2008 1 次提交
    • M
      x86: PIC, L-APIC and I/O APIC debug information · 32f71aff
      Maciej W. Rozycki 提交于
       Dump all the PIC, local APIC and I/O APIC information at the
      fs_initcall() level, which is after ACPI (if used) has initialised PCI
      information, making the point of invocation consistent across MP-table and
      ACPI platforms.  Remove explicit calls to print_IO_APIC() from elsewhere.
      Make the interface of all the functions involved consistent between 32-bit
      and 64-bit versions and make them all static by default by the means of a
      New-and-Improved(TM) __apicdebuginit() macro.
      
       Note that like print_IO_APIC() all these only output anything if
      "apic=debug" has been passed to the kernel through the command line.
      Signed-off-by: NMaciej W. Rozycki <macro@linux-mips.org>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      32f71aff
  24. 17 7月, 2008 1 次提交
  25. 16 7月, 2008 1 次提交
    • A
      x86/PCI: ACPI based PCI gap calculation · 809d9a8f
      Alok Kataria 提交于
      Using ACPI to find free address space allows us to find a gap for the
      unallocated PCI resources or MMIO resources for hotplug devices within
      the BIOS allowed PCI regions.
      
      It works by evaluating the _CRS object under PCI0 looking for producer
      resources.  Then searches the e820 memory space for a gap within these
      producer resources.
      Signed-off-by: NAlok N Kataria <akataria@vmware.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      809d9a8f
  26. 09 7月, 2008 1 次提交
    • R
      x86/pci: removing subsys_initcall ordering dependencies · 8dd779b1
      Robert Richter 提交于
      So far subsys_initcalls has been executed in this order depending on
      the object order in the Makefile:
      
      arch/x86/pci/visws.c:subsys_initcall(pcibios_init);
      arch/x86/pci/numa.c:subsys_initcall(pci_numa_init);
      arch/x86/pci/acpi.c:subsys_initcall(pci_acpi_init);
      arch/x86/pci/legacy.c:subsys_initcall(pci_legacy_init);
      arch/x86/pci/irq.c:subsys_initcall(pcibios_irq_init);
      arch/x86/pci/common.c:subsys_initcall(pcibios_init);
      
      This patch removes the ordering dependency. There is now only one
      subsys_initcall function that contains subsystem initialization code
      with a defined order.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8dd779b1
  27. 08 7月, 2008 3 次提交
  28. 06 5月, 2008 1 次提交
  29. 27 4月, 2008 1 次提交
    • Y
      x86: get mp_bus_to_node early · 871d5f8d
      Yinghai Lu 提交于
      Currently, on an amd k8 system with multi ht chains, the numa_node of
      pci devices under /sys/devices/pci0000:80/* is always 0, even if that
      chain is on node 1 or 2 or 3.
      
      Workaround: pcibus_to_node(bus) is used when we want to get the node that
      pci_device is on.
      
      In struct device, we already have numa_node member, and we could use
      dev_to_node()/set_dev_node() to get and set numa_node in the device.
      set_dev_node is called in pci_device_add() with pcibus_to_node(bus),
      and pcibus_to_node uses bus->sysdata for nodeid.
      
      The problem is when pci_add_device is called, bus->sysdata is not assigned
      correct nodeid yet. The result is that numa_node will always be 0.
      
      pcibios_scan_root and pci_scan_root could take sysdata. So we need to get
      mp_bus_to_node mapping before these two are called, and thus
      get_mp_bus_to_node could get correct node for sysdata in root bus.
      
      In scanning of the root bus, all child busses will take parent bus sysdata.
      So all pci_device->dev.numa_node will be assigned correctly and automatically.
      
      Later we could use dev_to_node(&pci_dev->dev) to get numa_node, and we
      could also could make other bus specific device get the correct numa_node
      too.
      
      This is an updated version of pci_sysdata and Jeff's pci_domain patch.
      
      [ mingo@elte.hu: build fix ]
      Signed-off-by: NYinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      871d5f8d