Commits · a47e0054253f69f2036d5dbce81f6c1cff9f0eb4 · openanolis / cloud-kernel

17 May, 2019 1 commit

PCI: hv: Fix a memory leak in hv_eject_device_work() · a47e0054

Committed by Dexuan Cui on Mar 04, 2019

commit 05f151a73ec2b23ffbff706e5203e729a995cdc2 upstream.

When a device is created in new_pcichild_device(), hpdev->refs is set
to 2 (i.e. the initial value of 1 plus the get_pcichild()).

When we hot remove the device from the host, in a Linux VM we first call
hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and
then schedules a work of hv_eject_device_work(), so hpdev->refs becomes
3 (let's ignore the paired get/put_pcichild() in other places). But in
hv_eject_device_work(), currently we only call put_pcichild() twice,
meaning the 'hpdev' struct can't be freed in put_pcichild().

Add one put_pcichild() to fix the memory leak.

The device can also be removed when we run "rmmod pci-hyperv". On this
path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()),
hpdev->refs is 2, and we do correctly call put_pcichild() twice in
pci_devices_present_work().

Fixes: 4daace0d ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: commit log rework]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a47e0054
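
The reference counting described in a47e0054 can be replayed with a toy counter. This is only a user-space sketch: `toy_hpdev`, `toy_get()`, `toy_put()` and `toy_eject()` are hypothetical stand-ins, not the driver's code, and only the counts quoted in the commit message are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-device struct; only the
 * reference counter and a "freed" marker are modeled. */
struct toy_hpdev {
    int refs;
    bool freed;
};

static void toy_get(struct toy_hpdev *d)
{
    assert(!d->freed);
    d->refs++;
}

/* Drops one reference and marks the object freed when the last one goes. */
static void toy_put(struct toy_hpdev *d)
{
    assert(d->refs > 0);
    if (--d->refs == 0)
        d->freed = true;
}

/* Replays the hot-remove path with a given number of puts in the work item
 * and reports whether the object ended up freed. */
static bool toy_eject(int puts_in_work)
{
    struct toy_hpdev d = { .refs = 1, .freed = false }; /* initial value */

    toy_get(&d);                 /* creation: refs == 2 */
    toy_get(&d);                 /* eject request: refs == 3 */
    for (int i = 0; i < puts_in_work; i++)
        toy_put(&d);             /* puts issued by the eject work item */
    return d.freed;
}
```

With two puts `toy_eject(2)` leaves the object allocated, which corresponds to the leak; with three puts `toy_eject(3)` releases it.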

22 Sep, 2018 1 commit

PCI: hv: Fix return value check in hv_pci_assign_slots() · 54be5b8c

Committed by Wei Yongjun on Sep 21, 2018

In case of error, the function pci_create_slot() returns ERR_PTR() and
never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR().

Fixes: a15f2c08 ("PCI: hv: support reporting serial number as slot information")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

54be5b8c
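
The difference between a NULL test and an IS_ERR() test in 54be5b8c can be illustrated in user space. The helpers below are simplified re-creations of the kernel's error-pointer idiom, written for this sketch; `toy_create_slot()` is a hypothetical function that only follows the contract stated above (error pointer on failure, never NULL).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095   /* assumed upper bound on errno values */

/* Encode a negative errno value in a pointer. */
static inline void *toy_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer falls in the top TOY_MAX_ERRNO addresses. */
static inline bool toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static inline long toy_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static int toy_slot_storage;

/* Hypothetical creator: returns an error pointer on failure, never NULL. */
static void *toy_create_slot(bool fail)
{
    return fail ? toy_err_ptr(-ENOMEM) : (void *)&toy_slot_storage;
}
```

A failed `toy_create_slot(true)` is non-NULL, so a `!ptr` check would treat it as success, while `toy_is_err()` catches it.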

17 Sep, 2018 1 commit

PCI: hv: support reporting serial number as slot information · a15f2c08

Committed by Stephen Hemminger on Sep 14, 2018

The Hyper-V host API for PCI provides a unique "serial number" which
can be used as basis for sysfs PCI slot table. This can be useful
for cases where userspace wants to find the PCI device based on
serial number.

When an SR-IOV NIC is added, the host sends an attach message
with serial number. The kernel doesn't use the serial number, but
it is useful when doing the same thing in a userspace driver such
as the DPDK. By having /sys/bus/pci/slots/N it provides a direct
way to find the matching PCI device.

There may be some cases where the serial number is not unique, such
as when using GPUs. But the PCI slot infrastructure will handle
that.

This has a side effect which may also be useful. The common udev
network device naming policy uses the slot information (rather
than PCI address).

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a15f2c08

05 Aug, 2018 1 commit

x86: Don't include linux/irq.h from asm/hardirq.h · 447ae316

Committed by Nicolai Stange on Jul 29, 2018

The next patch in this series will have to make the definition of
irq_cpustat_t available to entering_irq().

Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
dependencies like

  asm/smp.h
    asm/apic.h
      asm/hardirq.h
        linux/irq.h
          linux/topology.h
            linux/smp.h
              asm/smp.h

or

  linux/gfp.h
    linux/mmzone.h
      asm/mmzone.h
        asm/mmzone_64.h
          asm/smp.h
            asm/apic.h
              asm/hardirq.h
                linux/irq.h
                  linux/irqdesc.h
                    linux/kobject.h
                      linux/sysfs.h
                        linux/kernfs.h
                          linux/idr.h
                            linux/gfp.h

and others.

This causes compilation errors because of the header guards becoming
effective in the second inclusion: symbols/macros that had been defined
before wouldn't be available to intermediate headers in the #include chain
anymore.

A possible workaround would be to move the definition of irq_cpustat_t
into its own header and include that from both, asm/hardirq.h and
asm/apic.h.

However, this wouldn't solve the real problem, namely asm/hardirq.h
unnecessarily pulling in all the linux/irq.h cruft: nothing in
asm/hardirq.h itself requires it. Also, note that there are some other
archs, e.g. arm64, which don't have that #include in their
asm/hardirq.h.

Remove the linux/irq.h #include from x86's asm/hardirq.h.

Fix resulting compilation errors by adding appropriate #includes to *.c
files as needed.

Note that some of these *.c files could be cleaned up a bit with respect to
their set of #includes, but that should better be done in separate patches,
if at all.

Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

447ae316

10 Jul, 2018 1 commit

PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg() · 35a88a18

Committed by Dexuan Cui on Jul 09, 2018

Commit de0aa7b2 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
uses local_bh_disable()/enable(), because hv_pci_onchannelcallback() can
also run in tasklet context as the channel event callback, so bottom halves
should be disabled to prevent a race condition.

With CONFIG_PROVE_LOCKING=y in the recent mainline, or old kernels that
don't have commit f71b74bc ("irq/softirqs: Use lockdep to assert IRQs
are disabled/enabled"), when the upper layer IRQ code calls
hv_compose_msi_msg() with local IRQs disabled, we'll see a warning at the
beginning of __local_bh_enable_ip():

  IRQs not enabled as expected
    WARNING: CPU: 0 PID: 408 at kernel/softirq.c:162 __local_bh_enable_ip

The warning exposes an issue in de0aa7b2: local_bh_enable() can
potentially call do_softirq(), which is not supposed to run when local IRQs
are disabled. Let's fix this by using local_irq_save()/restore() instead.

Note: hv_pci_onchannelcallback() is not a hot path because it's only called
when the PCI device is hot added and removed, which is infrequent.

Fixes: de0aa7b2 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>

35a88a18

29 Jun, 2018 1 commit

PCI: hv: Replace GFP_ATOMIC with GFP_KERNEL in new_pcichild_device() · 7403bd14

Committed by Jia-Ju Bai on Mar 18, 2018

new_pcichild_device() is not called in atomic context.

The call chain ending up at new_pcichild_device() is:
[1] new_pcichild_device() <- pci_devices_present_work()
pci_devices_present_work() is only set in INIT_WORK().

Despite never getting called from atomic context,
new_pcichild_device() calls kzalloc with GFP_ATOMIC,
which waits busily for allocation.

GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL
to avoid busy waiting.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
[lorenzo.pieralisi@arm.com: reworked commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>

7403bd14

08 Jun, 2018 1 commit

PCI: Collect all native drivers under drivers/pci/controller/ · 6e0832fa

Committed by Shawn Lin on May 31, 2018

Native PCI drivers for root complex devices were originally all in
drivers/pci/host/.  Some of these devices can also be operated in endpoint
mode.  Drivers for endpoint mode didn't seem to fit in the "host"
directory, so we put both the root complex and endpoint drivers in
per-device directories, e.g., drivers/pci/dwc/, drivers/pci/cadence/, etc.

These per-device directories contain trivial Kconfig and Makefiles and
clutter drivers/pci/.  Make a new drivers/pci/controller/ directory and
collect all the device-specific drivers there.

No functional change intended.

Link: https://lkml.kernel.org/r/1520304202-232891-1-git-send-email-shawn.lin@rock-chips.com
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

6e0832fa

25 May, 2018 1 commit

PCI: hv: Do not wait forever on a device that has disappeared · c3635da2

Committed by Dexuan Cui on May 23, 2018

Before the guest finishes the device initialization, the device can be
removed anytime by the host, and after that the host won't respond to
the guest's request, so the guest should be prepared to handle this
case.

Add a polling mechanism to detect device presence.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: edited commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>

c3635da2
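
The idea in c3635da2 of checking for device presence while waiting can be sketched as a bounded-step loop. Everything here is hypothetical and single-threaded: `toy_request` simulates a host that replies, or a device that disappears, after a given number of polls; the real driver waits on kernel primitives instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a request that is pending at the host. */
struct toy_request {
    int polls_until_reply;  /* host answers after this many polls; < 0: never */
    int polls_until_gone;   /* device vanishes after this many polls; < 0: never */
    int polls;              /* polls performed so far */
};

static bool toy_device_gone(const struct toy_request *r)
{
    return r->polls_until_gone >= 0 && r->polls >= r->polls_until_gone;
}

static bool toy_reply_arrived(const struct toy_request *r)
{
    return r->polls_until_reply >= 0 && r->polls >= r->polls_until_reply;
}

/* Returns 0 once the host has replied, or -ENODEV if the device
 * disappeared first, instead of waiting forever for a reply. */
static int toy_wait_for_response(struct toy_request *r)
{
    /* At least one of the two events must eventually happen in this model. */
    assert(r->polls_until_reply >= 0 || r->polls_until_gone >= 0);

    for (;;) {
        if (toy_device_gone(r))
            return -ENODEV;
        if (toy_reply_arrived(r))
            return 0;
        r->polls++;   /* stands in for one short timed wait */
    }
}
```

The presence check comes first in each iteration, so a device that has already vanished is reported even if no reply will ever arrive.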

24 May, 2018 3 commits

PCI: hv: Use list_for_each_entry() · 5b8db8f6

Committed by Stephen Hemminger on May 23, 2018

There are several places where list_for_each_entry() could be
used to simplify the code.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

5b8db8f6
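
For readers unfamiliar with the iteration helper named in 5b8db8f6, here is a simplified user-space re-creation. The `toy_` macros and types are written for this sketch (they are not the kernel headers), `toy_child` and its fields are hypothetical, and the macro relies on the GNU `__typeof__` extension.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Visit every entry embedded in the list, typed as the enclosing struct. */
#define toy_list_for_each_entry(pos, head, member)                          \
    for (pos = toy_container_of((head)->next, __typeof__(*pos), member);    \
         &pos->member != (head);                                            \
         pos = toy_container_of(pos->member.next, __typeof__(*pos), member))

static void toy_list_init(struct toy_list_head *h)
{
    h->next = h->prev = h;
}

static void toy_list_add_tail(struct toy_list_head *n, struct toy_list_head *h)
{
    assert(h->prev != NULL);   /* head must have been initialized */
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Hypothetical child device record with an embedded list node. */
struct toy_child {
    int wslot;
    struct toy_list_head list_entry;
};

/* Walks the list without any manual container_of() bookkeeping. */
static int toy_sum_wslots(struct toy_list_head *children)
{
    struct toy_child *c;
    int sum = 0;

    toy_list_for_each_entry(c, children, list_entry)
        sum += c->wslot;
    return sum;
}
```

The loop body works directly with `struct toy_child *`, which is the simplification the commit is after compared with iterating over raw list nodes.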

PCI: hv: Convert remove_lock to refcount · 6708be93

Committed by Stephen Hemminger on May 23, 2018

Use refcount instead of atomic for the reference counting
on the bus. Refcount is safer because it handles overflow correctly.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
[lorenzo.pieralisi@arm.com: updated commit subject]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

6708be93
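
Why a saturating counter is safer than a wrapping one, as argued in 6708be93, can be shown with a small model. This is a sketch under stated assumptions: `toy_refcount` and the `TOY_SATURATED` sentinel are hypothetical and do not reproduce the kernel's refcount implementation or its actual sentinel value.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_SATURATED UINT_MAX   /* hypothetical sentinel for this sketch */

struct toy_refcount {
    unsigned int v;
};

/* Increment that sticks at the sentinel instead of wrapping to zero. */
static void toy_refcount_inc(struct toy_refcount *r)
{
    if (r->v == TOY_SATURATED)
        return;
    r->v++;
}

/* Returns true only when the caller dropped the last reference. */
static bool toy_refcount_dec_and_test(struct toy_refcount *r)
{
    if (r->v == TOY_SATURATED)
        return false;            /* a saturated counter is never released */
    assert(r->v > 0);
    return --r->v == 0;
}

/* Plain wrapping increment, for contrast. */
static unsigned int toy_plain_inc(unsigned int v)
{
    return v + 1;
}
```

A wrapped plain counter reads as zero references while users still exist, which is the route to a use-after-free; the saturating variant leaks the object instead.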

PCI: hv: Remove unused reason for refcount handler · 8c99e120

Committed by Stephen Hemminger on May 23, 2018

The get/put functions were taking a reason code. This appears to be
a debug infrastructure that is no longer used.

Move the functions to the start of the file to eliminate the need for
forward declarations. Forward declarations are discouraged in
Linux.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
[lorenzo.pieralisi@arm.com: updated commit subject]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

8c99e120

02 May, 2018 1 commit

PCI: hv: Make sure the bus domain is really unique · 29927dfb

Committed by Sridhar Pitchai on May 01, 2018

When Linux runs as a guest VM in Hyper-V and Hyper-V adds the virtual PCI
bus to the guest, Hyper-V always provides a unique PCI domain.

Commit 4a9b0933 ("PCI: hv: Use device serial number as PCI domain")
overrode the unique domain with the serial number of the first device added
to the virtual PCI bus.

The reason for that patch was to have a consistent and short name for the
device, but Hyper-V doesn't provide unique serial numbers. Using non-unique
serial numbers as domain IDs leads to duplicate device addresses, which
causes PCI bus registration to fail.

Commit 0c195567 ("netvsc: transparent VF management") avoids the need
for commit 4a9b0933 ("PCI: hv: Use device serial number as PCI
domain").  When scripts were used to configure VF devices, the name of
the VF needed to be consistent and short, but with commit 0c195567
("netvsc: transparent VF management") all the setup is done in the kernel,
and we do not need to maintain a consistent name.

Revert commit 4a9b0933 ("PCI: hv: Use device serial number as PCI
domain") so we can reliably support multiple devices being assigned to
a guest.

Tag the patch for stable kernels containing commit 0c195567
("netvsc: transparent VF management").

Fixes: 4a9b0933 ("PCI: hv: Use device serial number as PCI domain")
Signed-off-by: Sridhar Pitchai <sridhar.pitchai@microsoft.com>
[lorenzo.pieralisi@arm.com: trimmed commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org # v4.14+
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>

29927dfb

17 Mar, 2018 5 commits

PCI: hv: Only queue new work items in hv_pci_devices_present() if necessary · 948373b3

Committed by Dexuan Cui on Mar 15, 2018

If there is pending work in hv_pci_devices_present() we just need to add
the new dr entry into the dr_list. Add a check to detect pending work
items and update the code to skip queuing work if pending work items
are detected.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>

948373b3

PCI: hv: Remove the bogus test in hv_eject_device_work() · fca288c0

Committed by Dexuan Cui on Mar 15, 2018

When the kernel is executing hv_eject_device_work(), the hpdev->state value
must be hv_pcichild_ejecting; any other value would constitute a bug.
Therefore, replace the bogus check with an explicit WARN_ON() that fires
when the condition fails.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>

fca288c0

PCI: hv: Fix a comment typo in _hv_pcifront_read_config() · df3f2159

Committed by Dexuan Cui on Mar 15, 2018

The comment in _hv_pcifront_read_config() contains a typo; fix it.

No functional change.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: changed commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>

df3f2159

PCI: hv: Fix 2 hang issues in hv_compose_msi_msg() · de0aa7b2

Committed by Dexuan Cui on Mar 15, 2018

1. With the patch "x86/vector/msi: Switch to global reservation mode",
the recent v4.15 and newer kernels always hang for 1-vCPU Hyper-V VM
with SR-IOV. This is because when we reach hv_compose_msi_msg() by
request_irq() -> request_threaded_irq() ->__setup_irq()->irq_startup()
-> __irq_startup() -> irq_domain_activate_irq() -> ... ->
msi_domain_activate() -> ... -> hv_compose_msi_msg(), local irq is
disabled in __setup_irq().

Note: when we reach hv_compose_msi_msg() by another code path:
pci_enable_msix_range() -> ... -> irq_domain_activate_irq() -> ... ->
hv_compose_msi_msg(), local irq is not disabled.

hv_compose_msi_msg() depends on an interrupt from the host.
With interrupts disabled, a UP VM always hangs in the busy loop in
the function, because the interrupt callback hv_pci_onchannelcallback()
can not be called.

We can do nothing but work it around by polling the channel. This
is ugly, but we don't have any other choice.

2. If the host is ejecting the VF device before we reach
hv_compose_msi_msg(), in a UP VM, we can hang in hv_compose_msi_msg()
forever, because at this time the host doesn't respond to the
CREATE_INTERRUPT request. This issue has existed since the first day the
pci-hyperv driver appeared in the kernel.

Luckily, this can also be worked around by polling the channel
for the PCI_EJECT message and hpdev->state, and by checking the
PCI vendor ID.

Note: actually the above 2 issues also happen to an SMP VM, if
"hbus->hdev->channel->target_cpu == smp_processor_id()" is true.

Fixes: 4900be83 ("x86/vector/msi: Switch to global reservation mode")
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Tested-by: Chris Valean <v-chvale@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>

de0aa7b2

PCI: hv: Serialize the present and eject work items · 021ad274

Committed by Dexuan Cui on Mar 15, 2018

When we hot-remove the device, we first receive a PCI_EJECT message and
then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.

The first message is offloaded to hv_eject_device_work(), and the second
is offloaded to pci_devices_present_work(). Both the paths can be running
list_del(&hpdev->list_entry), causing general protection fault, because
system_wq can run them concurrently.

The patch eliminates the race condition.

Since access to present/eject work items is serialized, we do not need the
hbus->enum_sem anymore, so remove it.

Fixes: 4daace0d ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Link: https://lkml.kernel.org/r/KL1P15301MB00064DA6B4D221123B5241CFBFD70@KL1P15301MB0006.APCP153.PROD.OUTLOOK.COM
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Tested-by: Chris Valean <v-chvale@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: squashed semaphore removal patch]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>

021ad274

29 Jan, 2018 1 commit

PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate · 8cfab3cf

Committed by Bjorn Helgaas on Jan 26, 2018

Add SPDX GPL-2.0 to all PCI files that specified the GPL version 2 license.

Remove the boilerplate GPL version 2 language, relying on the assertion in
b2441318 ("License cleanup: add SPDX GPL-2.0 license identifier to
files with no license") that the SPDX identifier may be used instead of the
full boilerplate text.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8cfab3cf

29 Dec, 2017 1 commit

x86/apic: Switch all APICs to Fixed delivery mode · a31e58e1

Committed by Thomas Gleixner on Dec 28, 2017

Some of the APIC incarnations are operating in lowest priority delivery
mode. This worked as long as the vector management code allocated the same
vector on all possible CPUs for each interrupt.

Lowest priority delivery mode does not necessarily respect the affinity
setting and may redirect to some other online CPU. This was documented
somewhere in the old code, and the conversion to single target delivery
failed to update the delivery mode of the affected APIC drivers, which
results in spurious interrupts on some of the affected CPU/Chipset
combinations.

Switch the APIC drivers over to Fixed delivery mode and remove all
leftovers of lowest priority delivery mode.

Switching to Fixed delivery mode is not a problem on these CPUs because the
kernel already uses Fixed delivery mode for IPIs. The reason for this is
that the SDM explicitly forbids lowest prio mode for IPIs. The reason is
obvious: If the irq routing does not honor destination targets in lowest
prio mode then an IPI targeted at CPU1 might end up on CPU0, which would be
a fatal problem in many cases.

As a consequence of this change, the apic::irq_delivery_mode field is now
pointless, but this needs to be cleaned up in a separate patch.

Fixes: fdba46ff ("x86/apic: Get rid of multi CPU affinity")
Reported-by: vcaputo@pengaru.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: vcaputo@pengaru.com
Cc: Pavel Machek <pavel@ucw.cz>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712281140440.1688@nanos

a31e58e1

08 Nov, 2017 1 commit

PCI: hv: Use effective affinity mask · 79aa801e

Committed by Dexuan Cui on Nov 01, 2017

The effective_affinity_mask is always set when an interrupt is assigned in
__assign_irq_vector() -> apic->cpu_mask_to_apicid(), e.g. for struct apic
apic_physflat: -> default_cpu_mask_to_apicid() ->
irq_data_update_effective_affinity(), but it looks like d->common->affinity
remains all-1's until user space or the kernel changes it later.

In the early allocation/initialization phase of an IRQ, we should use the
effective_affinity_mask, otherwise Hyper-V may not deliver the interrupt to
the expected CPU.  Without the patch, if we assign 7 Mellanox ConnectX-3
VFs to a 32-vCPU VM, one of the VFs may fail to receive interrupts.

Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jake Oshins <jakeo@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Jork Loeser <jloeser@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>

79aa801e

10 Aug, 2017 1 commit

hyper-v: Globalize vp_index · 7415aea6

Committed by Vitaly Kuznetsov on Aug 02, 2017

To support implementing remote TLB flushing on Hyper-V with a hypercall,
we need to make vp_index available outside of the vmbus module. Rename and
globalize it.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: devel@linuxdriverproject.org
Link: http://lkml.kernel.org/r/20170802160921.21791-7-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

7415aea6

04 Aug, 2017 1 commit

PCI: hv: Do not sleep in compose_msi_msg() · 80bfeeb9

Committed by Stephen Hemminger on Jul 31, 2017

The setup of MSI with Hyper-V host was sleeping with locks held.  This
error is reported when doing SR-IOV hotplug with kernel built with lockdep:

    BUG: sleeping function called from invalid context at kernel/sched/completion.c:93
    in_atomic(): 1, irqs_disabled(): 1, pid: 1405, name: ip
    3 locks held by ip/1405:
   #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff976b10bb>] rtnetlink_rcv+0x1b/0x40
   #1:  (&desc->request_mutex){+.+...}, at: [<ffffffff970ddd33>] __setup_irq+0xb3/0x720
   #2:  (&irq_desc_lock_class){-.-...}, at: [<ffffffff970ddd65>] __setup_irq+0xe5/0x720
   irq event stamp: 3476
   hardirqs last  enabled at (3475): [<ffffffff971b3005>] get_page_from_freelist+0x225/0xc90
   hardirqs last disabled at (3476): [<ffffffff978024e7>] _raw_spin_lock_irqsave+0x27/0x90
   softirqs last  enabled at (2446): [<ffffffffc05ef0b0>] ixgbevf_configure+0x380/0x7c0 [ixgbevf]
   softirqs last disabled at (2444): [<ffffffffc05ef08d>] ixgbevf_configure+0x35d/0x7c0 [ixgbevf]

The workaround is to poll for host response instead of blocking on
completion.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

80bfeeb9

03 Jul, 2017 5 commits

PCI: hv: Use vPCI protocol version 1.2 · 7dcf90e9

Committed by Jork Loeser on May 24, 2017

Update the Hyper-V vPCI driver to use the Server-2016 version of the vPCI
protocol, fixing MSI creation and retargeting issues.

Signed-off-by: Jork Loeser <jloeser@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

7dcf90e9

PCI: hv: Add vPCI version protocol negotiation · b1db7e7e

Committed by Jork Loeser on May 24, 2017

Hyper-V vPCI offers different protocol versions.  Add the infra for
negotiating the one to use.

Signed-off-by: Jork Loeser <jloeser@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

b1db7e7e

PCI: hv: Temporary own CPU-number-to-vCPU-number infra · 02c3764c

Committed by Jork Loeser on May 24, 2017

To ease the parallel effort to centralize CPU-number-to-vCPU-number
conversion, temporarily stand up our own file-local version,
hv_tmp_cpu_nr_to_vp_nr(). Once the changes have merged, this workaround can
be removed, and the calls replaced with hv_cpu_number_to_vp_number().

Signed-off-by: Jork Loeser <jloeser@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

02c3764c

PCI: hv: Use page allocation for hbus structure · be66b673

Committed by Jork Loeser on May 24, 2017

The hv_pcibus_device structure contains an in-memory hypercall argument
that must not cross a page boundary.  Allocate the structure as a page to
ensure that.

Signed-off-by: Jork Loeser <jloeser@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

be66b673
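
The constraint behind be66b673 is plain address arithmetic: a buffer crosses a page boundary when its first and last bytes fall in different pages. The helper below is a hypothetical illustration with an assumed 4096-byte page size; it is not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* True if the byte range [addr, addr + len) spans more than one page. */
static bool toy_crosses_page(uintptr_t addr, size_t len)
{
    assert(len > 0);
    return (addr / TOY_PAGE_SIZE) != ((addr + len - 1) / TOY_PAGE_SIZE);
}
```

An object that starts on a page boundary and is no larger than a page can never satisfy `toy_crosses_page()`, so none of its members can either, which is what allocating the structure as a whole page provides.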

PCI: hv: Fix comment formatting and use proper integer fields · 691ac1dc

Committed by Jork Loeser on May 24, 2017

Fix comment formatting and use proper integer fields.

Signed-off-by: Jork Loeser <jloeser@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

691ac1dc

18 Apr, 2017 1 commit

PCI: hv: Convert hv_pci_dev.refs from atomic_t to refcount_t · 24196f0c

Committed by Elena Reshetova on Apr 18, 2017

The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.  This helps avoid
accidental refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>

24196f0c

05 Apr, 2017 2 commits

PCI: hv: Allocate interrupt descriptors with GFP_ATOMIC · 59c58cee

Committed by K. Y. Srinivasan on Mar 24, 2017

The memory allocation here needs to be non-blocking.  Fix the issue.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Long Li <longli@microsoft.com>
Cc: <stable@vger.kernel.org>

59c58cee

PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUs · 433fcf6b

Committed by K. Y. Srinivasan on Mar 24, 2017

When we have 32 or more CPUs in the affinity mask, we should use a special
constant to specify that to the host. Fix this issue.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Long Li <longli@microsoft.com>
Cc: <stable@vger.kernel.org>

433fcf6b
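
The rule stated in 433fcf6b (use a special constant once the affinity mask holds 32 or more CPUs) can be sketched as follows. `TOY_CPU_AFFINITY_ALL` and its all-ones value are assumptions for illustration, as is the simplification that CPU numbers map directly to bits; the translation from Linux CPU numbers to Hyper-V virtual processor numbers is left out.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CPU_AFFINITY_ALL UINT64_MAX   /* hypothetical "all CPUs" marker */

/* Count the set bits in a 64-bit mask (one bit cleared per iteration). */
static int toy_popcount64(uint64_t x)
{
    int n = 0;

    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Returns the mask to report to the host for a given affinity set. */
static uint64_t toy_host_cpu_mask(uint64_t affinity)
{
    if (toy_popcount64(affinity) >= 32)
        return TOY_CPU_AFFINITY_ALL;
    return affinity;
}
```

A mask with 31 CPUs is passed through unchanged, while one with 32 CPUs is replaced by the marker.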

24 Mar, 2017 2 commits

PCI: hv: Lock PCI bus on device eject · 414428c5

Committed by Long Li on Mar 23, 2017

A PCI_EJECT message can arrive at the same time we are calling
pci_scan_child_bus() in the workqueue for the previous PCI_BUS_RELATIONS
message or in create_root_hv_pci_bus().  In this case we could potentially
modify the bus from multiple places.

Properly lock the bus access.

Thanks Dexuan Cui <decui@microsoft.com> for pointing out the race condition
in create_root_hv_pci_bus().

Reported-by: Xiaofeng Wang <xiaofwan@redhat.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

414428c5

PCI: hv: Properly handle PCI bus remove · d3a78d8b

Committed by Long Li on Mar 23, 2017

hv_pci_devices_present() is called in hv_pci_remove() when we remove a PCI
device from the host, e.g., by disabling SR-IOV on a device.  In
hv_pci_remove(), the bus is already removed before the call, so we don't
need to rescan the bus in the workqueue scheduled from
hv_pci_devices_present().

By introducing bus state hv_pcibus_removed, we can avoid this situation.

Reported-by: Xiaofeng Wang <xiaofwan@redhat.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

d3a78d8b

18 Feb, 2017 1 commit

PCI: hv: Use device serial number as PCI domain · 4a9b0933

Committed by Haiyang Zhang on Feb 13, 2017

Use the device serial number as the PCI domain. The serial numbers start
with 1 and are unique within a VM. So names, such as VF NIC names, that
include the domain number as part of the name can be shorter than the
previous names based on part of the bus UUID. The new names will also stay
the same for VMs created with a copied VHD and the same number of devices.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>

4a9b0933

11 Feb, 2017 1 commit

PCI: hv: Fix wslot_to_devfn() to fix warnings on device removal · 60e2e2fb

Committed by Dexuan Cui on Feb 10, 2017

The devfn of 00:02.0 is 0x10.  devfn_to_wslot(0x10) == 0x2, and
wslot_to_devfn(0x2) should be 0x10, while it's 0x2 in the current code.

Due to this, hv_eject_device_work() -> pci_get_domain_bus_and_slot()
returns NULL and pci_stop_and_remove_bus_device() is not called.

Later when the real device driver's .remove() is invoked by
hv_pci_remove() -> pci_stop_root_bus(), some warnings can be noticed
because the VM has lost the access to the underlying device at that
time.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
CC: stable@vger.kernel.org
CC: K. Y. Srinivasan <kys@microsoft.com>
CC: Stephen Hemminger <sthemmin@microsoft.com>

60e2e2fb
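
The numbers quoted in 60e2e2fb (devfn 0x10 maps to wslot 0x2 and must map back to 0x10) can be checked with a round-trip sketch. The wslot layout used here, device number in bits 0-4 and function in bits 5-7, is an assumption inferred from those numbers; the `toy_` helpers are illustrative and use plain shifts rather than the driver's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the usual devfn split: 5-bit slot, 3-bit function. */
#define TOY_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define TOY_PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define TOY_PCI_FUNC(devfn)       ((devfn) & 0x07)

/* Assumed wslot layout: device number in bits 0-4, function in bits 5-7. */
static uint32_t toy_devfn_to_wslot(int devfn)
{
    return (uint32_t)(TOY_PCI_SLOT(devfn) | (TOY_PCI_FUNC(devfn) << 5));
}

/* Inverse conversion; both the device and the function must be restored. */
static int toy_wslot_to_devfn(uint32_t wslot)
{
    return (int)TOY_PCI_DEVFN(wslot & 0x1f, (wslot >> 5) & 0x07);
}
```

For 00:02.0 the device number is 2 and the function is 0, so the devfn is `2 << 3`, i.e. 0x10, and a conversion that ignores the device number cannot return it.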

30 Nov, 2016 1 commit

PCI: hv: Allocate physically contiguous hypercall params buffer · 0de8ce3e

Committed by Long Li on Nov 08, 2016

hv_do_hypercall() assumes that we pass a segment from a physically
contiguous buffer.  A buffer allocated on the stack may not work if
CONFIG_VMAP_STACK=y is set.

Use kmalloc() to allocate this buffer.

Reported-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

0de8ce3e

17 Nov, 2016 3 commits

PCI: hv: Delete the device earlier from hbus->children for hot-remove · e74d2ebd

Committed by Dexuan Cui on Nov 10, 2016

After we send a PCI_EJECTION_COMPLETE message to the host, the host will
immediately send us a PCI_BUS_RELATIONS message with
relations->device_count == 0, so pci_devices_present_work(), running on
another thread, can find the being-ejected device, mark the
hpdev->reported_missing to true, and run list_move_tail()/list_del() for
the device -- this races hv_eject_device_work() -> list_del().

Move the list_del() in hv_eject_device_work() to an earlier place, i.e.,
before we send PCI_EJECTION_COMPLETE, so later the
pci_devices_present_work() can't see the device.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jake Oshins <jakeo@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>

e74d2ebd

PCI: hv: Fix hv_pci_remove() for hot-remove · 17978524

Committed by Dexuan Cui on Nov 10, 2016

1. We don't really need such a big on-stack buffer when sending the
teardown_packet: vmbus_sendpacket() here only uses sizeof(struct
pci_message).

2. In the hot-remove case (PCI_EJECT), after we send PCI_EJECTION_COMPLETE
to the host, the host will send a RESCIND_CHANNEL message to us and the
host won't access the per-channel ringbuffer any longer, so we needn't send
PCI_RESOURCES_RELEASED/PCI_BUS_D0EXIT to the host, and we shouldn't expect
the host's completion message of PCI_BUS_D0EXIT, which will never come.

3. We should send PCI_BUS_D0EXIT after hv_send_resources_released().

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jake Oshins <jakeo@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>

17978524

PCI: hv: Use the correct buffer size in new_pcichild_device() · 8286e96d

Committed by Dexuan Cui on Nov 10, 2016

We don't really need such a big on-stack buffer.  vmbus_sendpacket() here
only uses sizeof(struct pci_child_message).

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jake Oshins <jakeo@microsoft.com>

8286e96d

01 Nov, 2016 1 commit

PCI: hv: Make unnecessarily global IRQ masking functions static · 542ccf45

Committed by Tobias Klauser on Oct 31, 2016

Make hv_irq_mask() and hv_irq_unmask() static as they are only used in
pci-hyperv.c.

This fixes a sparse warning.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>

542ccf45

07 Sep, 2016 1 commit

PCI: hv: Handle hv_pci_generic_compl() error case · a5b45b7b

Committed by Dexuan Cui on Aug 23, 2016

'completion_status' is used in some places, e.g.,
hv_pci_protocol_negotiation(), so we should make sure it's initialized in
error case too, though the error is unlikely here.

[bhelgaas: fix changelog typo and nearby whitespace]
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: KY Srinivasan <kys@microsoft.com>
CC: Jake Oshins <jakeo@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>

a5b45b7b
