1. 15 Feb, 2012 24 commits
    • PCI: Separate pci_bus_read_dev_vendor_id from pci_scan_device · efdc87da
      Yinghai Lu committed
      We can reuse it for pciehp probing.
      
      -v2: according to Kenji, fix crs timeout checking, and export the function
           for later use when pciehp is compiled as a module.
      Suggested-by: Matthew Wilcox <matthew@wil.cx>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: make sriov work with hotplug remove · ac205b7b
      Yinghai Lu committed
      When hot removing a pci express module that has a pcie switch and supports
      SRIOV, we got:
      
      [ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
      [ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
      [ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
      [ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
      [ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
      [ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
      [ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
      [ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
      [ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain:bus:device=0000:b0:00
      [ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
      [ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain:bus:dev = 0000:b0:00
      [ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
      [ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
      [ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
      [ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
      [ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
      [ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
      [ 5926.133377] PGD 0
      [ 5926.135402] Oops: 0000 [#1] SMP
      [ 5926.138659] CPU 2
      [ 5926.140499] Modules linked in:
      ...
      [ 5926.143754]
      [ 5926.275823] Call Trace:
      [ 5926.278267]  [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
      [ 5926.284180]  [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
      [ 5926.290264]  [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
      [ 5926.296866]  [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
      [ 5926.303123]  [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
      [ 5926.309206]  [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
      ...
      
       +-[0000:80]-+-00.0-[81-8f]--
       |           +-01.0-[90-9f]--
       |           +-02.0-[a0-af]--
       |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
       |           |                               |            +-00.1 Device
       |           |                               |            +-00.2 Device
       |           |                               |            \-00.3 Device
       |           |                               \-03.0-[b3]--+-00.0 Device
       |           |                                            +-00.1 Device
       |           |                                            +-00.2 Device
       |           |                                            \-00.3 Device
      
      root complex: 80:02.2
      pci express module: has a pcie switch, listed as b0:00.0, b1:02.0 and b1:03.0.
      end devices are b2:00.0 and b3:00.0.
      VFs are: b2:00.1, ..., b2:00.3 and b3:00.1, ..., b3:00.3
      
      Root cause: when pci_stop_bus_device() is called on the physical function
      (PF), the PF's remove path also stops and removes its virtual functions
      (VFs), so
      	list_for_each_safe(l, n, &bus->devices)
      is left with a saved pointer n that refers to an already freed VF entry.
      
      The solution is to replace list_for_each_safe() with
      list_for_each_prev_safe().  This makes sure n is a valid pointer to the PF
      instead of a freed VF pointer (because newly added devices are inserted at
      the tail of the bus->devices list).
      
      While reviewing the patch, Bjorn said:
      |   The PCI hot-remove path calls pci_stop_bus_devices() via
      |   pci_remove_bus_device().
      |
      |   pci_stop_bus_devices() traverses the bus->devices list (point A below),
      |   stopping each device in turn, which calls the driver remove() method.  When
      |   the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
      |   also uses pci_remove_bus_device() to remove the VF devices from the
      |   bus->devices list (point B).
      |
      |       pci_remove_bus_device
      |         pci_stop_bus_device
      |           pci_stop_bus_devices(subordinate)
      |             list_for_each(bus->devices)             <-- A
      |               pci_stop_bus_device(PF)
      |                 ...
      |                   driver->remove
      |                     pci_disable_sriov
      |                       ...
      |                         pci_remove_bus_device(VF)
      |                             <remove from bus_list>  <-- B
      |
      |   At B, we're changing the same list we're iterating through at A, so when
      |   the driver remove() method returns, the pci_stop_bus_devices() iterator has
      |   a pointer to a list entry that has already been freed.
      
      The discussion threads can be found at:
          https://lkml.org/lkml/2011/10/15/141
          https://lkml.org/lkml/2012/1/23/360
      
      -v5: As suggested by Linus, to make removal more robust, change to
           list_for_each_prev_safe() instead.  That is more reasonable, because
           those devices were added to the tail of the list earlier.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
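The traversal problem described in this commit can be reproduced outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `struct list_head` and its helpers are simplified stand-ins, `stop_dev()`, `build_bus()` and the `removed` flag are hypothetical, and entries are flagged instead of freed so the forward walk can be observed without touching freed memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

struct dev {
    struct list_head node;
    bool is_pf;     /* physical function: removing it takes its VFs along */
    bool removed;   /* set instead of freeing, to keep the demo well defined */
};

#define to_dev(p) ((struct dev *)((char *)(p) - offsetof(struct dev, node)))

void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* devs[0] is the PF; the VFs are appended to the tail after it. */
void build_bus(struct list_head *bus, struct dev *devs, int n)
{
    bus->next = bus->prev = bus;
    for (int i = 0; i < n; i++) {
        devs[i].is_pf = (i == 0);
        devs[i].removed = false;
        list_add_tail(&devs[i].node, bus);
    }
}

/* Stopping the PF unlinks every VF on the bus (the "point B" removal). */
void stop_dev(struct dev *d, struct list_head *bus)
{
    struct list_head *p, *n;

    if (!d->is_pf)
        return;
    for (p = bus->next, n = p->next; p != bus; p = n, n = p->next) {
        struct dev *v = to_dev(p);
        if (!v->is_pf) {
            list_del(p);
            v->removed = true;
        }
    }
}

/* Both walks return how often the iterator landed on a removed entry. */
int stop_bus_forward(struct list_head *bus)
{
    struct list_head *p, *n;
    int stale = 0;

    for (p = bus->next, n = p->next; p != bus; p = n, n = p->next) {
        if (to_dev(p)->removed) {
            stale++;        /* in the kernel this would be a freed pointer */
            continue;
        }
        stop_dev(to_dev(p), bus);
    }
    return stale;
}

int stop_bus_reverse(struct list_head *bus)
{
    struct list_head *p, *n;
    int stale = 0;

    for (p = bus->prev, n = p->prev; p != bus; p = n, n = p->prev) {
        if (to_dev(p)->removed) {
            stale++;
            continue;
        }
        stop_dev(to_dev(p), bus);
    }
    return stale;
}
```

With one PF followed by three VFs, the forward walk steps onto all three removed entries, while the reverse walk visits the VFs before the PF removes them and never sees a stale entry.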
    • PCI: remove add_to_failed_list() · 67cc7e26
      Yinghai Lu committed
      Only one user; just use add_to_list instead.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: add debug print out for add_size · b592443d
      Yinghai Lu committed
      For use in debugging resource reallocation.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: make free_list() into a function · bffc56d4
      Yinghai Lu committed
      After merging struct pci_dev_resource_x and pci_dev_resource,
      we can now use a function instead of a macro.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Rename dev_res_x to add_res or fail_res · b9b0bba9
      Yinghai Lu committed
      Linus says don't use dev_res_x because it doesn't communicate anything
      about usage.  Rename them to add_res or fail_res etc according to
      context.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Merge pci_dev_resource_x and pci_dev_resource · 764242a0
      Yinghai Lu committed
      pci_dev_resource_x is a superset of pci_dev_resource and they're just
      temp structs used during resource reallocation.
      
      pci_dev_resource usage is quite limited.
      
      So just use pci_dev_resource_x, and rename it as new pci_dev_resource.
      
      -v2: According to Linus, separate the free_list change into another patch
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Replace resource_list with generic list · bdc4abec
      Yinghai Lu committed
      So we can use helper functions for generic list.  This makes the
      resource re-allocation code much more readable.
      
      -v2: Use list_add_tail instead of adding list_insert_before, as pointed
           out by Linus.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Move struct resource_list to setup-bus.c · 2934a0de
      Yinghai Lu committed
      No user outside of setup-bus.c now.  Later patches will convert
      resource_list to a regular list.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Move pdev_sort_resources() to setup-bus.c · 78c3b329
      Yinghai Lu committed
      This allows us to move the definition of struct resource_list to
      setup-bus.c and later convert resource_list to a regular list.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: make re-allocation try harder by reassigning ranges higher in the hierarchy · 19aa7ee4
      Yinghai Lu committed
      On a system with devices that support SRIOV connected to a pcie switch
      to pcie root port:
      
       +-[0000:80]-+-00.0-[81-8f]--
       |           +-01.0-[90-9f]--
       |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device 207a
       |           |                               \-03.0-[a3]--+-00.0 Oracle Corporation Device 207a
       |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Oracle Corporation Device 207a
       |           |                               \-03.0-[b3]--+-00.0 Oracle Corporation Device 207a
      
      When the BIOS does not assign resources for SRIOV BARs, kernel pci
      reallocation only goes up one bridge and then gives up, failing to
      get resources for all SRIOV BARs, even though the range is large enough
      in the peer root bus.
      
      Specifically, only the bridge at the a1:02.0 level has its resources
      cleared and reallocated.  The kernel does not go up to clear the bridge
      at the 80:02.0 level.
      
      To make it go to upper levels, during retry, we need to treat "good to have"
      resources as "must have".
      
      Only on the last try will we treat good to have resources as optional.
      At that time, parent bridge resources will already have been released so
      we'll have a chance to get everything assigned with must_have plus
      good_to_have for all child devices.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Make pci_rescan_bus handle add_list · 9b03088f
      Yinghai Lu committed
      This allows us to allocate resources to hotplug bridges during
      remove/rescan.
      
      We need to move the function to setup-bus.c so it can use
      __pci_bus_size_bridges and __pci_bus_assign_resources directly to take
      the add_list resource tracking list.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Make rescan bus increase bridge resource size if needed · 2f320521
      Yinghai Lu committed
      Current rescan will not touch bridge MMIO and IO.
      
      Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
      resources, if child devices need more resources.
      
      Only do that for bridges whose children are all removed already; i.e. don't
      release resources that could already be in use by drivers on child devices.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Use add_list in pcie hotplug path. · 8424d759
      Yinghai Lu committed
      We need the additional size (add_size) in the hotplug path when plugging
      in a hotplug chassis without cards.
      
      -v2: change descriptions. make it applicable after "pci: Check bridge
           resources after resource allocation."
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: try to assign required+option size first · 3e6e0d80
      Yinghai Lu committed
      We found that reassignment can fail to find a range for one resource, even
      if the total available range is large enough.
      
      bridge b1:02.0 will need 2M+3M
      bridge b1:03.0 will need 2M+3M
      
      so bridge b0:00.0 will get assigned: 4M : [f8000000-f83fffff]
         and is later reassigned to 10M : [f8000000-f89fffff]
      
      b1:02.0 is assigned to 2M : [f8000000-f81fffff]
      b1:03.0 is assigned to 2M : [f8200000-f83fffff]
      
      After that, b1:03.0 gets a chance to be reassigned to [f8200000-f86fffff],
      but b1:02.0 has no chance to expand, because b1:03.0 occupies the range
      directly after it.
      
      [  187.911401] pci 0000:b1:02.0: bridge window [mem 0x00100000-0x002fffff] to [bus b2-b2] add_size 300000
      [  187.920764] pci 0000:b1:03.0: bridge window [mem 0x00100000-0x002fffff] to [bus b3-b3] add_size 300000
      [  187.930129] pci 0000:b1:02.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
      [  187.938500] pci 0000:b1:03.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
      [  187.946857] pci 0000:b0:00.0: bridge window [mem 0x00100000-0x004fffff] to [bus b1-b3] add_size 600000
      [  187.956206] pci 0000:b0:00.0: BAR 14: assigned [mem 0xf8000000-0xf83fffff]
      [  187.963102] pci 0000:b0:00.0: BAR 15: assigned [mem 0xf5000000-0xf51fffff pref]
      [  187.970434] pci 0000:b0:00.0: BAR 14: reassigned [mem 0xf8000000-0xf89fffff]
      [  187.977497] pci 0000:b1:02.0: BAR 14: assigned [mem 0xf8000000-0xf81fffff]
      [  187.984383] pci 0000:b1:02.0: BAR 15: assigned [mem 0xf5000000-0xf50fffff pref]
      [  187.991695] pci 0000:b1:03.0: BAR 14: assigned [mem 0xf8200000-0xf83fffff]
      [  187.998576] pci 0000:b1:03.0: BAR 15: assigned [mem 0xf5100000-0xf51fffff pref]
      [  188.005888] pci 0000:b1:03.0: BAR 14: reassigned [mem 0xf8200000-0xf86fffff]
      [  188.012939] pci 0000:b1:02.0: BAR 14: can't assign mem (size 0x200000)
      [  188.019471] pci 0000:b1:02.0: failed to add 300000 to res=[mem 0xf8000000-0xf81fffff]
      [  188.027326] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.034071] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.040795] pci 0000:b2:00.0: BAR 2: assigned [mem 0xf8000000-0xf80fffff 64bit]
      [  188.048119] pci 0000:b2:00.0: BAR 2: set to [mem 0xf8000000-0xf80fffff 64bit] (PCI address [0xf8000000-0xf80fffff])
      [  188.058550] pci 0000:b2:00.0: BAR 6: assigned [mem 0xf5000000-0xf50fffff pref]
      [  188.065802] pci 0000:b2:00.0: BAR 0: assigned [mem 0xf8100000-0xf8103fff 64bit]
      [  188.073125] pci 0000:b2:00.0: BAR 0: set to [mem 0xf8100000-0xf8103fff 64bit] (PCI address [0xf8100000-0xf8103fff])
      [  188.083596] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.090310] pci 0000:b2:00.0: BAR 9: can't assign mem (size 0x300000)
      [  188.096773] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.103479] pci 0000:b2:00.0: BAR 7: assigned [mem 0xf8104000-0xf810ffff 64bit]
      [  188.110801] pci 0000:b2:00.0: BAR 7: set to [mem 0xf8104000-0xf810ffff 64bit] (PCI address [0xf8104000-0xf810ffff])
      [  188.121256] pci 0000:b1:02.0: PCI bridge to [bus b2-b2]
      [  188.126512] pci 0000:b1:02.0:   bridge window [mem 0xf8000000-0xf81fffff]
      [  188.133328] pci 0000:b1:02.0:   bridge window [mem 0xf5000000-0xf50fffff pref]
      [  188.140608] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.147341] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.154076] pci 0000:b3:00.0: BAR 2: assigned [mem 0xf8200000-0xf82fffff 64bit]
      [  188.161417] pci 0000:b3:00.0: BAR 2: set to [mem 0xf8200000-0xf82fffff 64bit] (PCI address [0xf8200000-0xf82fffff])
      [  188.171865] pci 0000:b3:00.0: BAR 6: assigned [mem 0xf5100000-0xf51fffff pref]
      [  188.179090] pci 0000:b3:00.0: BAR 0: assigned [mem 0xf8300000-0xf8303fff 64bit]
      [  188.186431] pci 0000:b3:00.0: BAR 0: set to [mem 0xf8300000-0xf8303fff 64bit] (PCI address [0xf8300000-0xf8303fff])
      [  188.196884] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.203591] pci 0000:b3:00.0: BAR 9: assigned [mem 0xf8400000-0xf86fffff 64bit]
      [  188.210909] pci 0000:b3:00.0: BAR 9: set to [mem 0xf8400000-0xf86fffff 64bit] (PCI address [0xf8400000-0xf86fffff])
      [  188.221379] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.228089] pci 0000:b3:00.0: BAR 7: assigned [mem 0xf8304000-0xf830ffff 64bit]
      [  188.235407] pci 0000:b3:00.0: BAR 7: set to [mem 0xf8304000-0xf830ffff 64bit] (PCI address [0xf8304000-0xf830ffff])
      [  188.245843] pci 0000:b1:03.0: PCI bridge to [bus b3-b3]
      [  188.251107] pci 0000:b1:03.0:   bridge window [mem 0xf8200000-0xf86fffff]
      [  188.257922] pci 0000:b1:03.0:   bridge window [mem 0xf5100000-0xf51fffff pref]
      [  188.265180] pci 0000:b0:00.0: PCI bridge to [bus b1-b3]
      [  188.270443] pci 0000:b0:00.0:   bridge window [mem 0xf8000000-0xf89fffff]
      [  188.277250] pci 0000:b0:00.0:   bridge window [mem 0xf5000000-0xf51fffff pref]
      [  188.284512] pcieport 0000:80:02.2: PCI bridge to [bus b0-bf]
      [  188.290184] pcieport 0000:80:02.2:   bridge window [io  0xa000-0xbfff]
      [  188.296735] pcieport 0000:80:02.2:   bridge window [mem 0xf8000000-0xf8ffffff]
      [  188.303963] pcieport 0000:80:02.2:   bridge window [mem 0xf5000000-0xf5ffffff 64bit pref]
      
      Thus b2:00.0 BAR 9 does not get assigned...
      
      Root cause:
      b1:02.0 cannot be given more range, because b1:03.0 sits directly after it;
      there is no space between the required ranges.
      
      Solution:
      Try to assign required + optional all together at first, and if that
      fails, try again with just the required resources.
      
      -v2: separate the add_to_list() change into another patch according to Jesse.
           separate the get_res_add_size() move into another patch according to Jesse.
           add a !realloc_head->next check to bail early if the list is empty,
           according to Jesse.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
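The two-pass strategy in this commit can be sketched as follows. This is only an illustration under simplifying assumptions: `struct bridge_req`, `place_all()` and `assign_sorted()` are hypothetical names, requests are packed back to back from offset 0, and alignment and sorting, which the real allocator has to handle, are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB (1024u * 1024u)

/* One bridge window request: a must-have part plus a nice-to-have part. */
struct bridge_req {
    uint64_t required;
    uint64_t optional;
    uint64_t start;     /* filled in on success, relative to the parent */
    uint64_t size;
};

/* Pack all requests back to back; fail if they do not fit in the window. */
bool place_all(struct bridge_req *r, int n, uint64_t window, bool with_optional)
{
    uint64_t cursor = 0;

    for (int i = 0; i < n; i++) {
        uint64_t want = r[i].required + (with_optional ? r[i].optional : 0);

        if (want > window - cursor)
            return false;
        r[i].start = cursor;
        r[i].size = want;
        cursor += want;
    }
    return true;
}

/* First pass: required + optional together. Second pass: required only. */
bool assign_sorted(struct bridge_req *r, int n, uint64_t window)
{
    if (place_all(r, n, window, true))
        return true;
    return place_all(r, n, window, false);
}
```

With two requests of 2M+3M and a 10M parent window, the first pass succeeds and each child gets 5M, so neither child ends up boxed in by its neighbour. With an 8M window the sketch falls back to 2M each.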
    • PCI: Move get_res_add_size() function · 1c372353
      Yinghai Lu committed
      We need to call it from __assign_resources_sorted() later and we'd like to
      avoid a forward declaration.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Make add_to_list() return status · ef62dfef
      Yinghai Lu committed
      Will be used for resource_list_x duplication when trying
      requested+optional at first.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Calculate right add_size · a4ac9fea
      Yinghai Lu committed
      While debugging an SRIOV-enabled hotplug device, we found that
      add_size is not passed properly.
      
      The devices sit behind two levels of bridges:
      
       +-[0000:80]-+-00.0-[81-8f]--
       |           +-01.0-[90-9f]--
       |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0  Oracle Corporation Device
       |           |                               \-03.0-[a3]--+-00.0  Oracle Corporation Device
      
      As a result, the parent bridge later does not try to add a big enough range:
      
      [  557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
      [  557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
      [  557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
      [  557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
      [  557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
      [  557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
      [  557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
      [  557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
      [  557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
      [  557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]
      
      It turns out we did not calculate size1 properly.
      
      static resource_size_t calculate_memsize(resource_size_t size,
                      resource_size_t min_size,
                      resource_size_t size1,
                      resource_size_t old_size,
                      resource_size_t align)
      {
              if (size < min_size)
                      size = min_size;
              if (old_size == 1 )
                      old_size = 0;
              if (size < old_size)
                      size = old_size;
              size = ALIGN(size + size1, align);
              return size;
      }
      
      We should not pass add_size as part of min_size to calculate_memsize(),
      since that keeps add_size from contributing to the final add_size.
      
      So just pass add_size as size1 to calculate_memsize().
      
      With this change, we should have a chance to remove the extra add-on in
      pci_reassign_resource.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
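The effect of the two ways of passing add_size can be checked with a small user-space copy of the function quoted above. This is a sketch under assumptions: `ALIGN_UP` is a local replacement for the kernel's ALIGN(), the two wrapper functions are hypothetical, and the final add_size is taken to be the difference between the size with and without the optional part.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t resource_size_t;

/* Local replacement for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((resource_size_t)(a) - 1))

/* Same logic as the function quoted in the commit message. */
resource_size_t calculate_memsize(resource_size_t size,
                                  resource_size_t min_size,
                                  resource_size_t size1,
                                  resource_size_t old_size,
                                  resource_size_t align)
{
    assert(align && (align & (align - 1)) == 0);
    if (size < min_size)
        size = min_size;
    if (old_size == 1)
        old_size = 0;
    if (size < old_size)
        size = old_size;
    return ALIGN_UP(size + size1, align);
}

/* add_size folded into min_size: it only acts as a lower bound. */
resource_size_t add_size_via_min_size(resource_size_t size,
                                      resource_size_t add_size,
                                      resource_size_t align)
{
    resource_size_t size0 = calculate_memsize(size, 0, 0, 0, align);
    resource_size_t with_add = calculate_memsize(size, add_size, 0, 0, align);

    return with_add - size0;
}

/* add_size passed as size1: it is added on top of the required size. */
resource_size_t add_size_via_size1(resource_size_t size,
                                   resource_size_t add_size,
                                   resource_size_t align)
{
    resource_size_t size0 = calculate_memsize(size, 0, 0, 0, align);
    resource_size_t with_add = calculate_memsize(size, 0, add_size, 0, align);

    return with_add - size0;
}
```

For a 2M required size and a 3M add_size with 1M alignment, the first variant yields only 1M of extra space (and none at all once the required size reaches 3M), while the second yields the full 3M.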
    • PCI: Fix typo in setup-res.c · 0dea210b
      Masanari Iida committed
      Correct spelling "resouce" to "resource" in
      drivers/pci/setup-res.c
      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Introduce __pci_reset_function_locked to be used when holding device_lock. · 6fbf9e7a
      Konrad Rzeszutek Wilk committed
      The use case of this is when a driver wants to call FLR when a device
      is attached to it using the SysFS "bind" or "unbind" functionality.
      
      The call chain when a user does "bind" looks like this:
      
       echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind
      
      and ends up calling:
        driver_bind:
          device_lock(dev);  <=== TAKES LOCK
          XXXX_probe:
               .. pci_enable_device()
               ...__pci_reset_function(), which calls
                       pci_dev_reset(dev, 0):
                              if (!0) {
                                      device_lock(dev) <==== DEADLOCK
      
      The __pci_reset_function_locked function allows the driver's
      'probe' function to reset the device while the device lock taken
      by the bind path is still held.
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
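The locked/unlocked split described in this commit is a common pattern and can be sketched in user space. Everything here is hypothetical illustration: `struct demo_dev`, `demo_reset()` and `demo_reset_locked()` are not kernel APIs, and the lock is a plain flag that reports a second acquisition with -1 instead of deadlocking.

```c
#include <stdbool.h>

struct demo_dev {
    bool locked;    /* stand-in for the device lock */
    int resets;     /* counts how often the device was actually reset */
};

/* Non-recursive lock: a second acquisition is reported instead of hanging. */
bool demo_trylock(struct demo_dev *d)
{
    if (d->locked)
        return false;
    d->locked = true;
    return true;
}

void demo_unlock(struct demo_dev *d)
{
    d->locked = false;
}

/* Caller must already hold the lock; does the real work. */
int demo_reset_locked(struct demo_dev *d)
{
    d->resets++;
    return 0;
}

/* Takes the lock itself, so it must not be called with the lock held. */
int demo_reset(struct demo_dev *d)
{
    int rc;

    if (!demo_trylock(d))
        return -1;      /* in the kernel this would be the deadlock */
    rc = demo_reset_locked(d);
    demo_unlock(d);
    return rc;
}
```

A probe-like caller that already holds the lock would use `demo_reset_locked()`; every other caller keeps using `demo_reset()`, which wraps the same work in lock and unlock.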
    • PCI: drivers/pci/hotplug/ibmphp_ebda.c: add missing iounmap · 8f0cdddc
      Julia Lawall committed
      Add missing iounmap in error handling code, in a case where the function
      already performs iounmap on some other execution path.
      
      A simplified version of the semantic match that finds this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression e;
      statement S,S1;
      int ret;
      @@
      e = \(ioremap\|ioremap_nocache\)(...)
      ... when != iounmap(e)
      if (<+...e...+>) S
      ... when any
          when != iounmap(e)
      *if (...)
         { ... when != iounmap(e)
           return ...; }
      ... when any
      iounmap(e);
      // </smpl>
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Can continually add funcs after adding func0 · f382a086
      Amos Kong committed
      Boot up a KVM guest, and hotplug multifunction
      devices (func1, func2, func0, func3) into the guest.
      
      for i in 1 2 0 3;do
      qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
      (qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
      (qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
      done
      
      In the Linux kernel, when func0 of the slot is hot-added, the whole
      slot is marked as 'enabled', and the driver then ignores any other newly
      hot-added funcs.
      But in Win7 & WinXP, we can continually add other funcs after adding
      func0, and all funcs will be added in the guest.
      
      drivers/pci/hotplug/acpiphp_glue.c:
      static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
      {
      ....
              for (slot = bridge->slots; slot; slot = slot->next) {
                      if (slot->flags & SLOT_ENABLED) {
                              acpiphp_disable_slot()
                      else
                              acpiphp_enable_slot()
      ....                              |
      }                                 v
                                  enable_device()
                                        |
                                        v
              //only don't enable slot if func0 is not added
      	list_for_each_entry(func, &slot->funcs, sibling) {
                     ...
              }
             slot->flags |= SLOT_ENABLED; //mark slot to 'enabled'
      
      This patch lets the pci driver continually add funcs after adding
      func 0.  The slot is only marked 'enabled' when all funcs are added.
      
      For pci multifunction hotplug, we can add functions one by one (func 0 is
      necessary), and all functions will be removed at one time.
      Signed-off-by: Amos Kong <akong@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • x86/PCI: Convert maintaining FW-assigned BIOS BAR values to use a list · 6535943f
      Myron Stowe committed
      This patch converts the underlying maintenance aspects of FW-assigned
      BIOS BAR values from a statically allocated array within struct pci_dev
      to a list of temporary, stand alone, entries.
      Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Fix starting basis for resource requests · 351fc6d1
      Myron Stowe committed
      pci_revert_fw_address() is used to reinstate a PCI device's original
      FW-assigned BIOS BAR value(s) if normal resource assignment fails.
      
      When attempting to reinstate an address, the point within the resource
      tree from which to attempt the new resource request should be the parent
      resource corresponding to the device, not the base of the resource tree
      (ioport_resource or iomem_resource).  For PCI devices this would
      typically be the resource corresponding to the upstream PCI host bridge
      or P2P bridge aperture.
      
      This patch sets the point within the resource tree to attempt a new
      resource assignment request to the PCI device's parent resource and only
      if that fails does it fall back to the base ioport_resource or
      iomem_resource.
      Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
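The "parent first, then the base of the tree" order described in this commit can be illustrated with a small sketch. It is an assumption-laden simplification: `struct range`, `range_contains()` and `pick_request_base()` are hypothetical, and only containment is checked, whereas a real resource request also has to detect conflicts with sibling resources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct range {
    uint64_t start;
    uint64_t end;   /* inclusive */
};

bool range_contains(const struct range *outer, const struct range *inner)
{
    return inner->start <= inner->end &&
           inner->start >= outer->start &&
           inner->end <= outer->end;
}

/*
 * Pick the resource under which the firmware-assigned range 'fw' should be
 * requested: the parent aperture if it fits there, otherwise the root of
 * the tree, otherwise nothing.
 */
const struct range *pick_request_base(const struct range *parent,
                                      const struct range *root,
                                      const struct range *fw)
{
    if (parent && range_contains(parent, fw))
        return parent;
    if (range_contains(root, fw))
        return root;
    return NULL;
}
```

Requesting under the parent aperture first keeps the reinstated BAR inside the window that the upstream bridge actually forwards; the root is only a last resort.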
  2. 11 Feb, 2012 3 commits
    • PCI: Fix pci cardbus removal · 3682a394
      Yinghai Lu committed
      While testing busn_res allocation with cardbus, I found that pci card
      removal no longer works; it turns out it was broken by:
      
      |commit 79cc9601
      |Date:   Tue Nov 22 21:06:53 2011 -0800
      |
      |    PCI: Only call pci_stop_bus_device() one time for child devices at remove
      
      The above changed the behavior of pci_remove_behind_bridge that
      yenta_cardbus depended on.  So restore the old behavior of
      pci_remove_behind_bridge (which requires stopping and removing all
      devices) by:
      
      1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
         __pci_remove_bus_device() call it instead.
      2. add pci_stop_behind_bridge that will stop devices behind a bridge
      3. add back pci_remove_behind_bridge that will stop and remove devices
         under bridge.
      
      -v2: update commit description a little bit.
      Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: set pci sriov page size before reading SRIOV BAR · 8161fe91
      Vaidyanathan Srinivasan committed
      For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
      the PCI_SRIOV_BAR registers are queried.  The sys pagesize defaults to 4k,
      so this change is required on a powerpc box with a 64k base page size.
      
      This is a regression caused by moving SRIOV init to sriov_enable().
          
      | commit afd24ece
      | Author: Ram Pai <linuxram@us.ibm.com>
          
      | PCI: delay configuration of SRIOV capability
      | The SRIOV capability, namely page size and total_vfs of a device are
      | configured during enumeration phase of the device.  This can potentially
      | interfere with the PCI operations of the platform, if the IOV capability
      | of the device is not enabled.
      Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Acked-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
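The page-size selection that has to happen before the SRIOV BARs are read can be sketched as a pure function. This assumes the usual SR-IOV encoding, in which bit n of the supported-page-sizes value stands for a page size of 2^(n+12) bytes; the function name `sriov_pick_sys_pgsize()` is hypothetical and no config-space access is modelled.

```c
#include <stdint.h>

/*
 * Choose the value to program as the system page size: the smallest
 * supported page size that is at least as large as the system page size.
 * Returns 0 if the device supports no suitable size.
 */
uint32_t sriov_pick_sys_pgsize(uint32_t sup_pgsize, unsigned int page_shift)
{
    unsigned int i = page_shift > 12 ? page_shift - 12 : 0;

    /* Drop every size smaller than the system page size. */
    sup_pgsize &= ~((UINT32_C(1) << i) - 1);
    if (!sup_pgsize)
        return 0;

    /* Keep only the lowest remaining bit. */
    return sup_pgsize & ~(sup_pgsize - 1);
}
```

With a supported mask of 0x553, a 4k system (page_shift 12) selects 0x1, while a 64k system (page_shift 16) selects 0x10. Leaving the register at its 4k default on the 64k system is the mismatch that this commit addresses.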
    • PCI: workaround hard-wired bus number V2 · 71f6bd4a
      Yinghai Lu committed
      Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
      when using ACPI resources (_CRS).
      Using _CRS is the default; a manual workaround (without this patch)
      would be the pci=nocrs boot param.
      
      V2: Add dev_warn if the workaround is hit. This should reveal
      how common such setups are (via google) and point to possible
      problems if things are still not working as expected.
      -> Suggested by Jan Beulich.
      
      Cc: stable@vger.kernel.org
      Tested-by: garyhade@us.ibm.com
      Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  3. 24 Jan, 2012 1 commit
    • kernel-doc: fix new warnings in pci · 6e9292c5
      Randy Dunlap committed
      Fix new kernel-doc warnings:
      
      Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
      Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
      Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
      Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
      Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
      Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 13 Jan, 2012 1 commit
  5. 07 Jan, 2012 11 commits
    • K
      x86/PCI: Expand the x86_msi_ops to have a restore MSIs. · 76ccc297
      Konrad Rzeszutek Wilk committed
      The MSI restore function will become a function pointer in an
      x86_msi_ops struct. It defaults to the implementation in the
      io_apic.c and msi.c. We piggyback on the indirection mechanism
      introduced by "x86: Introduce x86_msi_ops".
      
      Cc: x86@kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: linux-pci@vger.kernel.org
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      76ccc297
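      The indirection described in 76ccc297 can be pictured with a small
      userspace sketch. Everything here is hypothetical: the struct name,
      the simplified int signature and both implementations are
      illustration only and are not the kernel's x86_msi_ops definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down ops table; the real struct has other hooks. */
struct msi_ops_sketch {
    int (*restore_msi_irqs)(int irq);
};

/* Default implementation, standing in for the native one. */
int native_restore(int irq)
{
    return irq;
}

/* Alternative that a platform might install instead of the default. */
int platform_restore(int irq)
{
    return irq + 1000;
}

/* The table starts out pointing at the default implementation. */
struct msi_ops_sketch msi_ops = { .restore_msi_irqs = native_restore };

/* Generic code calls through the pointer and never names an implementation. */
int arch_restore(int irq)
{
    assert(msi_ops.restore_msi_irqs != NULL);
    return msi_ops.restore_msi_irqs(irq);
}
```

      A platform that needs different behavior would assign its own
      function to msi_ops.restore_msi_irqs during setup; callers of
      arch_restore() stay unchanged.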
    • H
      PCI: Enable ATS at the device state restore · 1900ca13
      Hao, Xudong committed
      During S3 or S4 resume or PCI reset, the ATS registers aren't restored
      correctly.  This patch enables ATS at device state restore if the PCI
      device has the ATS capability.
      Signed-off-by: Xudong Hao <xudong.hao@intel.com>
      Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      1900ca13
    • N
      PCI: msi: fix imbalanced refcount of msi irq sysfs objects · 424eb391
      Neil Horman committed
      This warning was recently reported to me:
      
      ------------[ cut here ]------------
      WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
      Hardware name: VMware Virtual Platform
      kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
      being called.
      Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
      vmw_pvscsi
      Pid: 630, comm: modprobe Tainted: G        W   3.1.6-1.fc16.x86_64 #1
      Call Trace:
       [<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0
       [<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff810da293>] ? free_desc+0x63/0x70
       [<ffffffff812a9aa0>] kobject_put+0x50/0x60
       [<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120
       [<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0
       [<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
       [<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
       [<ffffffff812d141c>] local_pci_probe+0x5c/0xd0
       [<ffffffff812d2cb9>] pci_device_probe+0x109/0x130
       [<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0
       [<ffffffff8138bceb>] __driver_attach+0xab/0xb0
       [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
       [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
       [<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90
       [<ffffffff8138b63e>] driver_attach+0x1e/0x20
       [<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0
       [<ffffffffa0188000>] ? 0xffffffffa0187fff
       [<ffffffff8138c246>] driver_register+0x76/0x140
       [<ffffffff815ca414>] ? printk+0x51/0x53
       [<ffffffffa0188000>] ? 0xffffffffa0187fff
       [<ffffffff812d2996>] __pci_register_driver+0x56/0xd0
       [<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
       [<ffffffff81002042>] do_one_initcall+0x42/0x180
       [<ffffffff810aad71>] sys_init_module+0x91/0x200
       [<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b
      ---[ end trace 44593438a59a9558 ]---
      Using INTx interrupt, #Rx queues: 1.
      
      It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
      be called.  Because populate_msi_sysfs fails, we never registered any of the
      msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
      on each of them, which gets flagged in the above stack trace.
      
      The fix is pretty straightforward.  We can key off the parent pointer in the
      kobject.  It is only set if kobject_init_and_add succeeds in
      populate_msi_sysfs.  If anything fails there, each kobject has its parent reset
      to NULL.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Bjorn Helgaas <bhelgaas@google.com>
      CC: Greg Kroah-Hartman <gregkh@suse.de>
      CC: linux-pci@vger.kernel.org
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      424eb391
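      The idea behind 424eb391, releasing only the objects whose parent
      pointer shows they were actually registered, can be sketched in
      plain userspace C. The types and functions below (fake_kobject,
      fake_msi_desc, populate_sysfs, free_irqs) are hypothetical
      stand-ins and do not reproduce the kernel's kobject API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's kobject and msi_desc. */
struct fake_kobject {
    struct fake_kobject *parent; /* non-NULL only after a successful add */
    int refcount;
};

struct fake_msi_desc {
    struct fake_kobject kobj;
};

struct fake_kobject fake_root = { NULL, 1 };

/*
 * Register n entries. If entry fail_at fails, undo the earlier ones and
 * reset their parent pointers to NULL. Pass fail_at == n for no failure.
 */
int populate_sysfs(struct fake_msi_desc *d, size_t n, size_t fail_at)
{
    assert(d != NULL);
    for (size_t i = 0; i < n; i++) {
        if (i == fail_at) {
            for (size_t j = 0; j < i; j++) {
                d[j].kobj.refcount--;
                d[j].kobj.parent = NULL;
            }
            return -1;
        }
        d[i].kobj.parent = &fake_root;
        d[i].kobj.refcount = 1;
    }
    return 0;
}

/* Release only entries that were registered; return how many were released. */
size_t free_irqs(struct fake_msi_desc *d, size_t n)
{
    size_t released = 0;

    assert(d != NULL);
    for (size_t i = 0; i < n; i++) {
        if (d[i].kobj.parent == NULL)
            continue; /* never registered: nothing to put */
        d[i].kobj.refcount--;
        d[i].kobj.parent = NULL;
        released++;
    }
    return released;
}
```

      With this check, a failed populate_sysfs() followed by free_irqs()
      touches no reference counts, which is the imbalance the warning
      above was reporting.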
    • P
      PCI: kconfig: English typo in pci/pcie/Kconfig · d56641c7
      P. Christeas committed
      Fix an English typo in the Kconfig help text.
      Signed-off-by: P. Christeas <xrg@linux.gr>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      d56641c7
    • V
      PCI/PM/Runtime: make PCI traces quieter · 85b8582d
      Vincent Palatin committed
      When runtime PM is active on PCI, a device that switches state
      frequently (e.g. an EHCI controller with autosuspending USB devices
      connected) can make the PCI configuration traces very verbose in the
      kernel log.  Let's guard those traces with the DEBUG condition.
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      85b8582d
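      Guarding a trace with DEBUG, as 85b8582d describes, can be sketched
      as a macro that compiles to nothing unless DEBUG is defined. The
      names log_always, log_debug and toggle_power_state are hypothetical
      and are not the kernel's logging helpers.

```c
#include <assert.h>
#include <stdio.h>

int lines_logged; /* counts messages that actually reach the log */

/* Hypothetical helper standing in for an always-on log call. */
void log_always(const char *msg)
{
    lines_logged++;
    fprintf(stderr, "%s\n", msg);
}

/* The trace exists only in builds that define DEBUG. */
#ifdef DEBUG
#define log_debug(msg) log_always(msg)
#else
#define log_debug(msg) ((void)0)
#endif

/* Simulate a device that changes power state many times. */
int toggle_power_state(int times)
{
    assert(times >= 0);
    for (int i = 0; i < times; i++)
        log_debug("power state changed");
    return lines_logged;
}
```

      Built without -DDEBUG, a thousand state changes add nothing to the
      log; built with it, every change is traced.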
    • B
      PCI: remove pci_create_bus() · 118faafa
      Bjorn Helgaas committed
      All users of pci_create_bus() have been converted to pci_create_root_bus(),
      so remove it.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      118faafa
    • B
      PCI: deprecate pci_scan_bus_parented() · 7e00fe2e
      Bjorn Helgaas committed
      Users of pci_scan_bus_parented() should be converted to use either
          pci_scan_root_bus() (preferred, but also calls pci_bus_add_devices)
      or
          pci_create_root_bus()
          pci_scan_child_bus()
      
      Since pci_scan_bus_parented() still has callers, I'm marking it deprecated
      now and will actually remove it later.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      7e00fe2e
    • B
      PCI: convert pci_scan_bus_parented() to use pci_create_root_bus() · 1e39ae9f
      Bjorn Helgaas committed
      This converts pci_scan_bus_parented() to use pci_create_root_bus()
      instead of pci_create_bus().  The new bus still has the default (incorrect)
      resources, so this patch doesn't help fix that problem, but it does remove
      one more use of pci_create_bus().
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      1e39ae9f
    • B
      PCI: convert pci_scan_bus() to use pci_create_root_bus() · de4b2f76
      Bjorn Helgaas committed
      I plan to deprecate pci_scan_bus_parented(), so use pci_create_root_bus()
      directly instead.  pci_scan_bus() itself will be removed as soon as all
      callers are gone, so this is just an interim step.
      
      v2: export pci_scan_bus
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      de4b2f76
    • B
      PCI: add pci_scan_root_bus() that accepts resource list · a2ebb827
      Bjorn Helgaas committed
      "Early" and "header" quirks often use incorrect bus resources because they
      see the default resources assigned by pci_create_bus(), before the
      architecture fixes them up (typically in pcibios_fixup_bus()).  Regions
      reserved by these quirks end up with the wrong parents.
      
      Here's the standard path for scanning a PCI root bus:
      
        pci_scan_bus or pci_scan_bus_parented
          pci_create_bus                     <-- A create with default resources
          pci_scan_child_bus
            pci_scan_slot
              pci_scan_single_device
                pci_scan_device
                  pci_setup_device
                    pci_fixup_device(early)  <-- B
                pci_device_add
                  pci_fixup_device(header)   <-- C
            pcibios_fixup_bus                <-- D fill in correct resources
      
      Early and header quirks at B and C use the default (incorrect) root bus
      resources rather than those filled in at D.
      
      This patch adds a new pci_scan_root_bus() function that sets the bus
      resources correctly from a supplied list of resources.
      
      I intend to remove pci_scan_bus() and pci_scan_bus_parented() after
      fixing all callers.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      a2ebb827
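      The ordering problem in a2ebb827 (quirks at B and C run before the
      fixup at D) can be shown with a toy model. The types and functions
      here are hypothetical simplifications: one memory window per bus
      and no real PCI structures or kernel calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified types: one memory window per bus. */
struct window {
    uint64_t start, end;
};

struct bus_sketch {
    struct window mem;
};

/* Old order (step A): the bus starts with the whole address space. */
struct bus_sketch create_bus_default(void)
{
    struct bus_sketch bus;

    bus.mem.start = 0;
    bus.mem.end = UINT64_MAX;
    return bus;
}

/* New order: the caller supplies the real window at creation time. */
struct bus_sketch create_root_bus_with(struct window mem)
{
    struct bus_sketch bus;

    assert(mem.start <= mem.end);
    bus.mem = mem;
    return bus;
}

/* A quirk running during the scan (steps B and C) sees this window. */
struct window quirk_parent_window(const struct bus_sketch *bus)
{
    return bus->mem;
}

/* Late architecture fixup (step D); too late for quirks that already ran. */
void arch_fixup_bus(struct bus_sketch *bus, struct window mem)
{
    bus->mem = mem;
}
```

      In the old order a quirk records the full address space even though
      arch_fixup_bus() later narrows it; when the window is passed in at
      creation, the quirk sees the correct range from the start.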
    • B
      PCI: add pci_create_root_bus() that accepts resource list · 166c6370
      Bjorn Helgaas committed
      pci_create_bus() assigns ioport_resource and iomem_resource as the default
      bus resources, i.e., the entire address space.  Architectures fix these
      later, typically in pcibios_fixup_bus() or after pci_scan_bus_parented()
      returns, but code that runs in the interim sees incorrect resource
      information.
      
      This patch adds a new pci_create_root_bus() that sets the bus resources
      correctly from a supplied list of resources.
      
      I intend to remove pci_create_bus() after changing all callers.
      
      Based on original patch by Deng-Cheng Zhu.
      
      Reference: http://www.spinics.net/lists/mips/msg41654.html
      Reference: https://lkml.org/lkml/2011/8/26/88
      Signed-off-by: Deng-Cheng Zhu <dczhu@mips.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      166c6370