1. 05 Feb, 2015 1 commit
    • x86/PCI: Refine the way to release PCI IRQ resources · b4b55cda
      Jiang Liu authored
      Some PCI device drivers assume that pci_dev->irq won't change after
      calling pci_disable_device() and pci_enable_device() during suspend and
      resume.
      
      Commit c03b3b07 ("x86, irq, mpparse: Release IOAPIC pin when
      PCI device is disabled") frees PCI IRQ resources when pci_disable_device()
      is called and reallocates IRQ resources when pci_enable_device() is
      called again. This breaks the above assumption. So commit 3eec5952
      ("x86, irq, PCI: Keep IRQ assignment for PCI devices during
      suspend/hibernation") and 9eabc99a ("x86, irq, PCI: Keep IRQ
      assignment for runtime power management") fix the issue by avoiding
      freeing/reallocating IRQ resources during PCI device suspend/resume.
      They achieve this by checking dev.power.is_prepared and
      dev.power.runtime_status.  PM maintainer, Rafael, then pointed out that
      it's really an ugly fix which leaks PM internal state information to
      the IRQ subsystem.
      
      Recently David Vrabel <david.vrabel@citrix.com> also reported a
      regression in the pciback driver caused by commit cffe0a2b ("x86, irq:
      Keep balance of IOAPIC pin reference count"). Please refer to:
      http://lkml.org/lkml/2015/1/14/546
      
      So this patch refines the way PCI IRQ resources are released. Instead of
      releasing PCI IRQ resources in pci_disable_device()/
      pcibios_disable_device(), we now release them at the driver unbinding
      notification BUS_NOTIFY_UNBOUND_DRIVER. In other words, we only release
      PCI IRQ resources when there's no driver bound to the PCI device, which
      keeps the assumption that pci_dev->irq won't change across multiple
      invocations of pci_enable_device()/pci_disable_device().
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
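The lifetime rule described in commit b4b55cda can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct model_pci_dev` and every `model_*` function are hypothetical names, not kernel APIs, and the IRQ numbering is arbitrary. It shows an IRQ that survives disable/enable cycles and is released only when the driver-unbound notification arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_pci_dev {
    int  irq;          /* IRQ number currently assigned, 0 if none */
    bool irq_managed;  /* true while the IRQ resource is held */
    bool driver_bound; /* true while a driver is bound */
};

static int next_free_irq = 16; /* arbitrary starting point for the model */

/* Enable: allocate an IRQ only if the device does not already hold one. */
static void model_enable_device(struct model_pci_dev *dev)
{
    assert(dev != NULL);
    if (!dev->irq_managed) {
        dev->irq = next_free_irq++;
        dev->irq_managed = true;
    }
}

/* Disable: deliberately leaves the IRQ assignment untouched. */
static void model_disable_device(struct model_pci_dev *dev)
{
    assert(dev != NULL);
    (void)dev;
}

/* Unbind notification: the only place where the IRQ is released. */
static void model_notify_unbound_driver(struct model_pci_dev *dev)
{
    assert(dev != NULL);
    dev->driver_bound = false;
    if (dev->irq_managed) {
        dev->irq = 0;
        dev->irq_managed = false;
    }
}
```

In this model a suspend/resume cycle (disable followed by enable) leaves `irq` unchanged, which is the assumption the commit message says some drivers rely on.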
  2. 02 Feb, 2015 1 commit
  3. 23 Sep, 2014 1 commit
  4. 28 Feb, 2014 1 commit
  5. 04 Feb, 2014 6 commits
  6. 06 Jun, 2013 1 commit
    • x86/PCI: Map PCI setup data with ioremap() so it can be in highmem · 65694c5a
      Matt Fleming authored
      f9a37be0 ("x86: Use PCI setup data") added support for using PCI ROM
      images from setup_data.  This used phys_to_virt(), which is not valid for
      highmem addresses, and can cause a crash when booting a 32-bit kernel via
      the EFI boot stub.
      
      pcibios_add_device() assumes that the physical addresses stored in
      setup_data are accessible via the direct kernel mapping, and that calling
      phys_to_virt() is valid.  This isn't guaranteed to be true on 32-bit x86,
      where the direct mapping range is much smaller than on x86-64.
      
      Calling phys_to_virt() on a highmem address results in the following:
      
       BUG: unable to handle kernel paging request at 39a3c198
       IP: [<c262be0f>] pcibios_add_device+0x2f/0x90
       ...
       Call Trace:
        [<c2370c73>] pci_device_add+0xe3/0x130
        [<c274640b>] pci_scan_single_device+0x8b/0xb0
        [<c2370d08>] pci_scan_slot+0x48/0x100
        [<c2371904>] pci_scan_child_bus+0x24/0xc0
        [<c262a7b0>] pci_acpi_scan_root+0x2c0/0x490
        [<c23b7203>] acpi_pci_root_add+0x312/0x42f
        ...
      
      The solution is to use ioremap() instead of phys_to_virt() to map the
      setup data into the kernel address space.
      
      [bhelgaas: changelog]
      Tested-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: stable@vger.kernel.org	# v3.8+
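Why phys_to_virt() is unsafe for highmem in commit 65694c5a can be illustrated with a user-space sketch of a direct mapping. The constants are assumptions: a 3G/1G split with `PAGE_OFFSET` at 0xC0000000 and about 896 MiB of directly mapped memory is a common 32-bit x86 layout, but real configurations may differ. The `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for a 32-bit kernel with a 3G/1G split; real
 * configurations may differ. */
#define MODEL_PAGE_OFFSET 0xC0000000u
#define MODEL_LOWMEM_SIZE (896u * 1024u * 1024u)

/* Direct-map translation: a plain offset, computed modulo 2^32. */
static uint32_t model_phys_to_virt(uint32_t phys)
{
    return phys + MODEL_PAGE_OFFSET;
}

/* True only if [phys, phys + len) lies inside the direct mapping. */
static bool model_in_direct_map(uint32_t phys, uint32_t len)
{
    assert(len > 0);
    return phys < MODEL_LOWMEM_SIZE && len <= MODEL_LOWMEM_SIZE - phys;
}
```

For a physical address above the direct-mapped range, the offset arithmetic still produces a number, but it wraps around and does not point at the intended memory. That is why a mapping call such as ioremap() is needed for such addresses.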
  7. 13 Apr, 2013 1 commit
  8. 04 Jan, 2013 2 commits
    • X86: drivers: remove __dev* attributes. · a18e3690
      Greg Kroah-Hartman authored
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitconst,
      and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Daniel Drake <dsd@laptop.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/PCI: Remove unused pci_root_bus · b7869ba1
      Bjorn Helgaas authored
      pci_root_bus is unused, so remove all references to it.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  9. 27 Dec, 2012 1 commit
  10. 11 Dec, 2012 1 commit
  11. 06 Dec, 2012 1 commit
  12. 08 Nov, 2012 1 commit
  13. 06 Jul, 2012 1 commit
  14. 01 May, 2012 2 commits
    • PCI: work around Stratus ftServer broken PCIe hierarchy · 284f5f9d
      Bjorn Helgaas authored
      A PCIe downstream port is a P2P bridge.  Its secondary interface is
      a link that should lead only to device 0 (unless ARI is enabled)[1], so
      we don't probe for non-zero device numbers.
      
      Some Stratus ftServer systems have a PCIe downstream port (02:00.0) that
      leads to both an upstream port (03:00.0) and a downstream port (03:01.0),
      and 03:01.0 has important devices below it:
      
        [0000:02]-+-00.0-[03-3c]--+-00.0-[04-09]--...
                                  \-01.0-[0a-0d]--+-[USB]
                                                  +-[NIC]
                                                  +-...
      
      Previously, we didn't enumerate device 03:01.0, so USB and the network
      didn't work.  This patch adds a DMI quirk to scan all device numbers,
      not just 0, below a downstream port.
      
      Based on a patch by Prarit Bhargava.
      
      [1] PCIe spec r3.0, sec 7.3.1
      
      CC: Myron Stowe <mstowe@redhat.com>
      CC: Don Dutile <ddutile@redhat.com>
      CC: James Paradis <james.paradis@stratus.com>
      CC: Matthew Wilcox <matthew.r.wilcox@intel.com>
      CC: Jesse Barnes <jbarnes@virtuousgeek.org>
      CC: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
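The probing decision described in commit 284f5f9d can be sketched as follows. This is a simplified, hypothetical model: the function name and the `quirk_scan_all` parameter are illustrative, and ARI handling is left out. The only hard fact it relies on is that a PCI device number is a 5-bit field, so a bus has at most 32 device numbers.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_DEVICES_PER_BUS 32 /* device number is a 5-bit field */

/* Number of device numbers to probe on a bridge's secondary bus.
 * below_downstream_port: the bus sits under a PCIe downstream port.
 * quirk_scan_all: set when a platform quirk asks for a full scan. */
static int model_devices_to_probe(bool below_downstream_port,
                                  bool quirk_scan_all)
{
    int count;

    if (below_downstream_port && !quirk_scan_all)
        count = 1;                     /* only device 0 */
    else
        count = MODEL_DEVICES_PER_BUS; /* device numbers 0..31 */

    assert(count >= 1 && count <= MODEL_DEVICES_PER_BUS);
    return count;
}
```

With the quirk unset, device 03:01.0 in the topology above would never be probed. With it set, the scan reaches device number 1 and the USB and NIC devices below it.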
    • x86/PCI: merge pcibios_scan_root() and pci_scan_bus_on_node() · c57ca65a
      Yinghai Lu authored
      pcibios_scan_root() and pci_scan_bus_on_node() were almost identical,
      so this patch merges them.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  15. 07 Jan, 2012 2 commits
    • x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus() · 2cd6975a
      Bjorn Helgaas authored
      x86 has two kinds of PCI root bus scanning:
      
      (1) ACPI-based, using _CRS resources.  This used pci_create_bus(), not
          pci_scan_bus(), because ACPI hotplug needed to split the
          pci_bus_add_devices() into a separate host bridge .start() method.
      
          This patch parses the _CRS resources earlier, so we can build a list of
          resources and pass it to pci_create_root_bus().
      
          Note that as before, we parse the _CRS even if we aren't going to use
          it so we can print it for debugging purposes.
      
      (2) All other, which used either default resources (ioport_resource and
          iomem_resource) or information read from the hardware via amd_bus.c or
          similar.  This used pci_scan_bus().
      
          This patch converts x86_pci_root_bus_res_quirks() (previously called
          from pcibios_fixup_bus()) to x86_pci_root_bus_resources(), which builds
          a list of resources before we call pci_scan_root_bus().
      
          We also use x86_pci_root_bus_resources() if we have ACPI but are
          ignoring _CRS.
      
      CC: Yinghai Lu <yinghai.lu@oracle.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented() · 46fbade0
      Bjorn Helgaas authored
      This doesn't change any functionality, but it makes a subsequent patch
      slightly simpler.
      
      pci_scan_bus(NULL, ...) and pci_scan_bus_parented() are identical except
      that pci_scan_bus() also calls pci_bus_add_devices():
      
        pci_scan_bus_parented
          pci_create_bus
          pci_scan_child_bus
      
        pci_scan_bus
          pci_create_bus
          pci_scan_child_bus
          pci_bus_add_devices
      
      All callers of pcibios_scan_root() call pci_bus_add_devices() explicitly,
      and we don't pass a parent device, so we might as well use pci_scan_bus().
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  16. 15 Oct, 2011 1 commit
  17. 22 Jul, 2011 1 commit
  18. 15 Jan, 2011 1 commit
  19. 18 Oct, 2010 1 commit
  20. 31 Jul, 2010 1 commit
    • x86/PCI: Add option to not assign BAR's if not already assigned · 7bd1c365
      Mike Habeck authored
      The Linux kernel assigns BARs that a BIOS did not assign, most likely
      to handle broken BIOSes that didn't enumerate the devices correctly.
      On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
      drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
      We purposely don't assign these I/O BARs because I/O Space is a very
      limited resource.  There is only 64k of I/O Space, and in a PCIe
      topology that space gets divided up into 4k chunks (this is due to
      the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
      Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
      
      SGI needs to scale to >16 devices with I/O BARs.  So by not assigning
      I/O BARs on devices we know don't use them, we can do that (iff the
      kernel doesn't go and assign these BARs that the BIOS purposely didn't
      assign).
      
      This patch will not assign a resource to a device BAR if that BAR was
      not assigned by the BIOS, and the kernel cmdline option 'pci=nobar'
      was specified.   This patch is closely modeled after the 'pci=norom'
      option that currently exists in the tree.
      Signed-off-by: Mike Habeck <habeck@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
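The arithmetic and the policy from commit 7bd1c365 can be written out as a short sketch. The sizes come from the commit message (64k of I/O space, 4k bridge window alignment); the function names are hypothetical and the second function only models the stated rule, not the kernel's resource code.

```c
#include <assert.h>
#include <stdbool.h>

#define IO_SPACE_BYTES  (64u * 1024u) /* total I/O space, per the text */
#define BRIDGE_IO_ALIGN (4u * 1024u)  /* bridge I/O window granularity */

/* Upper bound on bridges that can each get their own I/O window. */
static unsigned int model_max_io_windows(unsigned int io_space,
                                         unsigned int align)
{
    assert(align > 0);
    return io_space / align;
}

/* Does a BAR end up with a resource?  A BAR the BIOS assigned keeps it;
 * a BAR the BIOS left unassigned gets one from the kernel unless the
 * nobar option is set. */
static bool model_bar_has_resource(bool bios_assigned, bool nobar_option)
{
    return bios_assigned || !nobar_option;
}
```

With the values above, `model_max_io_windows(IO_SPACE_BYTES, BRIDGE_IO_ALIGN)` gives the 16-card limit quoted in the message, and only the combination "not assigned by BIOS" plus "nobar set" leaves a BAR without a resource.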
  21. 12 May, 2010 1 commit
  22. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo authored
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  23. 24 Feb, 2010 1 commit
  24. 20 Feb, 2010 1 commit
    • x86: Move pci init function to x86_init · b72d0db9
      Thomas Gleixner authored
      The PCI initialization in pci_subsys_init() is a mess. pci_numaq_init,
      pci_acpi_init, pci_visws_init and pci_legacy_init are called and each
      implementation checks and eventually modifies the global variable
      pcibios_scanned.
      
      x86_init functions allow us to do this more elegantly. The pci.init
      function pointer is preset to pci_legacy_init. numaq, acpi and visws
      can modify the pointer in their early setup functions. The functions
      return 0 when they did the full initialization including bus scan. A
      non zero return value indicates that pci_legacy_init needs to be
      called either because the selected function failed or wants the
      generic bus scan in pci_legacy_init to happen (e.g. visws).
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80CFE@orsmsx508.amr.corp.intel.com>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
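The dispatch scheme described in commit b72d0db9 amounts to a function pointer with a preset default and a return-value convention. The sketch below is a user-space model with hypothetical `model_*` names. It assumes only what the message states: the pointer starts at the legacy init, platform setup may replace it, and a non-zero return requests the legacy init.

```c
#include <assert.h>
#include <stddef.h>

static int legacy_calls; /* counts how often the legacy scan ran */

static int model_pci_legacy_init(void) { legacy_calls++; return 0; }

/* A platform init that did everything itself, including the bus scan. */
static int model_platform_init_full(void) { return 0; }

/* A platform init that wants the generic legacy scan to follow. */
static int model_platform_init_partial(void) { return 1; }

struct model_x86_init_pci {
    int (*init)(void);
};

/* Preset to the legacy path; early setup code may overwrite .init. */
static struct model_x86_init_pci model_pci_ops = {
    .init = model_pci_legacy_init,
};

static void model_pci_subsys_init(void)
{
    assert(model_pci_ops.init != NULL);
    /* Non-zero means: the legacy init still needs to run. */
    if (model_pci_ops.init() != 0)
        model_pci_legacy_init();
}
```

The caller no longer needs a shared "already scanned" flag: each platform expresses its choice by installing a function and by the value it returns.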
  25. 05 Nov, 2009 2 commits
    • x86/PCI: Use generic cacheline sizing instead of per-vendor tests. · 76b1a87b
      Dave Jones authored
      Instead of the PCI code needing to have code to determine the
      cacheline size of each processor, use the data the cpu identification
      code should have already determined during early boot.
      
      (The vendor checks are also incomplete, and don't take into account
       modern CPUs)
      
      I've been carrying a variant of this code in Fedora for a while,
      that prints debug information.  There are a number of cases where we
      are currently setting the PCI cacheline size to 32 bytes, when the CPU
      cacheline size is 64 bytes.  With this patch, we set them both the same.
      Signed-off-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: determine CLS more intelligently · ac1aa47b
      Jesse Barnes authored
      Till now, CLS has been determined either by arch code or as
      L1_CACHE_BYTES.  Only x86 and ia64 set CLS explicitly and x86 doesn't
      always get it right.  On most configurations, the chance is that
      firmware configures the correct value during boot.
      
      This patch makes pci_init() determine CLS by looking at what firmware
      has configured.  It scans all devices and if all non-zero values
      agree, the value is used.  If none is configured or there is a
      disagreement, pci_dfl_cache_line_size is used.  arch can set the dfl
      value (via PCI_CACHE_LINE_BYTES or pci_dfl_cache_line_size) or
      override the actual one.
      
      ia64, x86 and sparc64 updated to set the default cls instead of the
      actual one.
      
      While at it, declare pci_cache_line_size and pci_dfl_cache_line_size
      in pci.h and drop private declarations from arch code.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David Miller <davem@davemloft.net>
      Acked-by: Greg KH <gregkh@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
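The selection rule in commit ac1aa47b (use the firmware value if all non-zero values agree, otherwise the default) can be sketched as a single pass over the per-device values. The function name is hypothetical and the values are treated as opaque register contents. How the kernel actually reads them from config space is outside this model.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value shared by all non-zero entries, or dfl if there are
 * no non-zero entries or they disagree. */
static unsigned char model_pick_cls(const unsigned char *cls, size_t n,
                                    unsigned char dfl)
{
    unsigned char agreed = 0;
    size_t i;

    assert(cls != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (cls[i] == 0)
            continue;            /* firmware did not configure this one */
        if (agreed == 0)
            agreed = cls[i];     /* first configured value seen */
        else if (cls[i] != agreed)
            return dfl;          /* disagreement: use the default */
    }
    return agreed ? agreed : dfl;
}
```

Devices that firmware left at zero do not count as disagreement. They are simply skipped, so one configured device is enough to determine the value.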
  26. 19 Sep, 2009 1 commit
    • x86/PCI: make 32 bit NUMA node array int, not unsigned char · 76baeebf
      Jesse Barnes authored
      We use -1 to indicate no node affinity, so we need a signed type here or
      all sorts of bad things happen, like crashes in dev_attr_show as
      reported by Ingo:
      
      [  158.058140] warning: `dbus-daemon' uses 32-bit capabilities (legacy support in use)
      [  159.370562] BUG: unable to handle kernel NULL pointer dereference at (null)
      [  159.372694] IP: [<ffffffff8143b722>] bitmap_scnprintf+0x72/0xd0
      [  159.372694] PGD 71d3e067 PUD 7052e067 PMD 0
      [  159.372694] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  159.372694] last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/local_cpus
      [  159.372694] CPU 0
      [  159.372694] Pid: 7364, comm: irqbalance Not tainted 2.6.31-tip #8043 System Product Name
      [  159.372694] RIP: 0010:[<ffffffff8143b722>]  [<ffffffff8143b722>] bitmap_scnprintf+0x72/0xd0
      [  159.372694] RSP: 0018:ffff8800712a1e38  EFLAGS: 00010246
      [  159.372694] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [  159.372694] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff880077dc5000
      [  159.372694] RBP: ffff8800712a1e68 R08: 0000000000000001 R09: 0000000000000001
      [  159.372694] R10: ffffffff8215c47c R11: 0000000000000000 R12: 0000000000000000
      [  159.372694] R13: 0000000000000000 R14: 0000000000000ffe R15: ffff880077dc5000
      [  159.372694] FS:  00007f5f578f76f0(0000) GS:ffff880007000000(0000) knlGS:0000000000000000
      [  159.372694] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  159.372694] CR2: 0000000000000000 CR3: 0000000071a77000 CR4: 00000000000006f0
      [  159.372694] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  159.372694] DR3: ffffffff835109dc DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  159.372694] Process irqbalance (pid: 7364, threadinfo ffff8800712a0000, task ffff880070773000)
      [  159.372694] Stack:
      [  159.372694]  2222222222222222 ffff880077dc5000 fffffffffffffffb ffff88007d366b40
      [  159.372694] <0> ffff8800712a1f48 ffff88007d3840a0 ffff8800712a1e88 ffffffff8146332b
      [  159.372694] <0> fffffffffffffff4 ffffffff82450718 ffff8800712a1ea8 ffffffff815a9a1f
      [  159.372694] Call Trace:
      [  159.372694]  [<ffffffff8146332b>] local_cpus_show+0x3b/0x60
      [  159.372694]  [<ffffffff815a9a1f>] dev_attr_show+0x2f/0x60
      [  159.372694]  [<ffffffff8118ee6f>] sysfs_read_file+0xbf/0x1d0
      [  159.372694]  [<ffffffff8112afe9>] vfs_read+0xc9/0x180
      [  159.372694]  [<ffffffff8112c365>] sys_read+0x55/0x90
      [  159.372694]  [<ffffffff810114f2>] system_call_fastpath+0x16/0x1b
      [  159.372694] Code: 41 b9 01 00 00 00 44 8d 46 03 49 63 fc 0f 49 d3 c1 f8 1f 4c 01 ff c1 e8 1a c1 fa 06 41 c1 e8 02 8d 0c 03 48 63 d2 83 e1 3f 29 c1 <49> 8b 44 d5 00 48 c7 c2 8c 37 16 82 48 d3 e8 89 f1 44 89 f6 49
      [  159.372694] RIP  [<ffffffff8143b722>] bitmap_scnprintf+0x72/0xd0
      [  159.372694]  RSP <ffff8800712a1e38>
      [  159.372694] CR2: 0000000000000000
      [  159.600828] ---[ end trace 35550c356e84e60c ]---
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Tested-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
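The type problem fixed by commit 76baeebf is easy to reproduce outside the kernel. The sketch below uses hypothetical names and shows that -1 does not survive a round trip through an `unsigned char` slot, while it does through an `int` slot. It says nothing about how the kernel later consumes the value.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_NO_NODE (-1) /* "no node affinity" marker, per the text */

/* Store the node in an unsigned char slot and read it back. */
static int model_roundtrip_uchar(int node)
{
    unsigned char slot = (unsigned char)node;
    return slot;
}

/* Store the node in an int slot and read it back. */
static int model_roundtrip_int(int node)
{
    int slot = node;
    return slot;
}

static void model_demo(void)
{
    /* -1 becomes UCHAR_MAX (255 with 8-bit chars), a bogus node number. */
    assert(model_roundtrip_uchar(MODEL_NO_NODE) == UCHAR_MAX);
    /* With a signed int the marker is preserved. */
    assert(model_roundtrip_int(MODEL_NO_NODE) == MODEL_NO_NODE);
}
```

Any later check of the form `node == -1` fails for the unsigned slot, so the value is treated as a real node number, which is consistent with the crash path shown in the trace above.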
  27. 10 Sep, 2009 1 commit
    • x86/PCI: initialize PCI bus node numbers early · 2547089c
      Jesse Barnes authored
      The current mp_bus_to_node array is initialized only by AMD specific
      code, since AMD platforms have registers that can be used for
      determining node numbers.  On new Intel platforms it's necessary to
      initialize this array as well though, otherwise all PCI node numbers
      will be 0, when in fact they should be -1 (indicating that I/O isn't
      tied to any particular node).
      
      So move the mp_bus_to_node code into the common PCI code, and
      initialize it early with a default value of -1.  This may be overridden
      later by arch code (e.g. the AMD code).
      
      With this change, PCI consistent memory and other node specific
      allocations (e.g. skbuff allocs) should occur on the "current" node.
      If, for performance reasons, applications want to be bound to specific
      nodes, they should open their devices only after being pinned to the
      CPU where they'll run, for maximum locality.
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      Tested-by: Jesse Brandeburg <jesse.brandeburg@gmail.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  28. 25 Jun, 2009 1 commit
  29. 12 Jun, 2009 1 commit
  30. 23 Apr, 2009 2 commits