Commits · 4d043d3d2be1d4bd86d7efb5659d5b45b2572d84 · openanolis / cloud-kernel

08 May, 2019 1 commit

vfio/pci: use correct format characters · 4d043d3d

Committed by Louis Taylor on Apr 03, 2019

[ Upstream commit 426b046b748d1f47e096e05bdcc6fb4172791307 ]

When compiling with -Wformat, clang emits the following warnings:

drivers/vfio/pci/vfio_pci.c:1601:5: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                ^~~~~~

drivers/vfio/pci/vfio_pci.c:1601:13: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                        ^~~~~~

drivers/vfio/pci/vfio_pci.c:1601:21: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1601:32: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                           ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1605:5: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                ^~~~~~

drivers/vfio/pci/vfio_pci.c:1605:13: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                        ^~~~~~

drivers/vfio/pci/vfio_pci.c:1605:21: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1605:32: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                           ^~~~~~~~~

The types of these arguments are unconditionally defined, so this patch
updates the format characters to the correct ones for unsigned ints.

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Louis Taylor <louis@kragniz.eu>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
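As a rough userspace illustration of the warning fixed by this commit (not the kernel code itself), the sketch below formats four unsigned int IDs with %04x, which matches the argument type, where %04hx would declare unsigned short. The function name format_ids and the output layout are assumptions made for this example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "vvvv:dddd ssss:dddd" into buf.
 * All four IDs are unsigned int, so the matching conversion is %x.
 * Using %hx would declare the arguments as unsigned short, which is
 * the mismatch that -Wformat reports.
 */
int format_ids(char *buf, size_t len,
               unsigned int vendor, unsigned int device,
               unsigned int subvendor, unsigned int subdevice)
{
    return snprintf(buf, len, "%04x:%04x %04x:%04x",
                    vendor, device, subvendor, subdevice);
}
```

Calling format_ids(buf, sizeof(buf), 0x8086, 0x1572, 0x8086, 0x0001) fills buf with a fixed-width hexadecimal string and builds cleanly with -Wformat.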

07 Aug, 2018 2 commits

vfio-pci: Disable binding to PFs with SR-IOV enabled · 0dd0e297

Committed by Alex Williamson on Jul 12, 2018

We expect to receive PFs with SR-IOV disabled, however some host
drivers leave SR-IOV enabled at unbind. This puts us in a state where
we can potentially assign both the PF and the VF, leading to both
functionality as well as security concerns due to lack of managing the
SR-IOV state as well as vendor dependent isolation from the PF to VF.
If we were to attempt to actively disable SR-IOV on driver probe, we
risk VF bound drivers blocking, potentially risking live lock
scenarios. Therefore simply refuse to bind to PFs with SR-IOV enabled
with a warning message indicating the issue. Users can resolve this
by re-binding to the host driver and disabling SR-IOV before
attempting to use the device with vfio-pci.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio: Mark expected switch fall-throughs · 544c05a6

Committed by Gustavo A. R. Silva on Jul 09, 2018

In preparation for enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
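A minimal sketch of what marking an intentional fall-through can look like, assuming the comment-based annotation style. The function flags_for_level and its flag values are hypothetical and are not taken from the vfio sources.

```c
/*
 * Hypothetical example: each level implies the flags of the levels
 * below it, so the cases fall through on purpose. The comments tell
 * both the reader and the compiler that no break was forgotten.
 */
unsigned int flags_for_level(int level)
{
    unsigned int flags = 0;

    switch (level) {
    case 2:
        flags |= 0x4;
        /* fall through */
    case 1:
        flags |= 0x2;
        /* fall through */
    case 0:
        flags |= 0x1;
        break;
    default:
        break;
    }
    return flags;
}
```

Whether a comment is accepted as an annotation depends on the compiler and warning level, so treat the exact comment text as an assumption to verify against the toolchain in use.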

20 Jul, 2018 2 commits

PCI: Rename pci_try_reset_bus() to pci_reset_bus() · c6a44ba9

Committed by Sinan Kaya on Jul 19, 2018

Now that the old implementation of pci_reset_bus() is gone, replace
pci_try_reset_bus() with pci_reset_bus().

Compared to the old implementation, the new code will fail immediately with
-EAGAIN if the object lock cannot be obtained.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Unify try slot and bus reset API · 811c5cb3

Committed by Sinan Kaya on Jul 19, 2018

Drivers are expected to call pci_try_reset_slot() or pci_try_reset_bus() by
querying if a system supports hotplug or not. A survey showed that most
drivers don't do this and we are leaking hotplug capability to the user.

Hide pci_try_slot_reset() from drivers and embed into pci_try_bus_reset().
Change pci_try_reset_bus() parameter from struct pci_bus to struct pci_dev.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

19 Jul, 2018 1 commit

vfio/pci: Fix potential Spectre v1 · 0e714d27

Committed by Gustavo A. R. Silva on Jul 17, 2018

info.index can be indirectly controlled by user-space, hence leading
to a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/vfio/pci/vfio_pci.c:734 vfio_pci_ioctl()
warn: potential spectre issue 'vdev->region'

Fix this by sanitizing info.index before indirectly using it to index
vdev->region

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
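The idea of sanitizing an index before using it can be sketched in userspace as a branch-free clamp. This only illustrates the arithmetic and is modeled loosely on the kernel's generic masking approach. Kernel code should use the array_index_nospec() helper from <linux/nospec.h>. The names index_mask_nospec and index_nospec below are assumptions for this example.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Returns ~0UL when index < size and 0 otherwise, without a
 * conditional branch. Assumes size <= LONG_MAX and that converting to
 * long and shifting a negative long right behave as on GCC and Clang
 * (two's complement, arithmetic shift). Both are implementation-defined
 * in ISO C.
 */
unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an untrusted index to 0 when it is out of bounds. */
unsigned long index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask_nospec(index, size);
}
```

A caller would still perform the normal bounds check first, then use the clamped value for the array access, so that a mispredicted check cannot steer a speculative load outside the array.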

19 Jun, 2018 1 commit

vfio/pci: Make IGD support a configurable option · 08ca1b52

Committed by Alex Williamson on Jun 18, 2018

Allow the code which provides extensions to support direct assignment
of Intel IGD (GVT-d) to be compiled out of the kernel if desired.  The
config option for this was previously automatically enabled on X86,
therefore the default remains Y.  This simply provides the option to
disable it even for X86.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

27 Mar, 2018 3 commits

vfio/pci: Add ioeventfd support · 30656177

Committed by Alex Williamson on Mar 21, 2018

The ioeventfd here is actually irqfd handling of an ioeventfd such as
supported in KVM. A user is able to pre-program a device write to
occur when the eventfd triggers. This is yet another instance of
eventfd-irqfd triggering between KVM and vfio. The impetus for this
is high frequency writes to pages which are virtualized in QEMU.
Enabling this near-direct write path for selected registers within
the virtualized page can improve performance and reduce overhead.
Specifically this is initially targeted at NVIDIA graphics cards where
the driver issues a write to an MMIO register within a virtualized
region in order to allow the MSI interrupt to re-trigger.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Use endian neutral helpers · 07fd7ef3

Committed by Alex Williamson on Mar 21, 2018

The iowriteXX/ioreadXX functions assume little endian hardware and
convert to little endian on a write and from little endian on a read.
We currently do our own explicit conversion to negate this.  Instead,
add some endian dependent defines to avoid all byte swaps.  There
should be no functional change other than big endian systems aren't
penalized with wasted swaps.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Pull BAR mapping setup from read-write path · 0d77ed35

Committed by Alex Williamson on Mar 21, 2018

This creates a common helper that we'll use for ioeventfd setup.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

22 Mar, 2018 1 commit

Revert: "vfio-pci: Mask INTx if a device is not capabable of enabling it" · 834814e8

Committed by Alex Williamson on Mar 21, 2018

This reverts commit 2170dd04

The intent of commit 2170dd04 ("vfio-pci: Mask INTx if a device is
not capabable of enabling it") was to disallow the user from seeing
that the device supports INTx if the platform is incapable of enabling
it. The detection of this case however incorrectly includes devices
which natively do not support INTx, such as SR-IOV VFs, and further
discussions reveal gaps even for the target use case.

Reported-by: Arjun Vynipadath <arjun@chelsio.com>
Fixes: 2170dd04 ("vfio-pci: Mask INTx if a device is not capabable of enabling it")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

21 Dec, 2017 3 commits

vfio-pci: Allow mapping MSIX BAR · a32295c6

Committed by Alexey Kardashevskiy on Dec 13, 2017

By default VFIO disables mapping of MSIX BAR to the userspace as
the userspace may program it in a way allowing spurious interrupts;
instead the userspace uses the VFIO_DEVICE_SET_IRQS ioctl.
In order to eliminate guessing from the userspace about what is
mmapable, VFIO also advertises a sparse list of regions allowed to mmap.

This works fine as long as the system page size equals to the MSIX
alignment requirement which is 4KB. However with a bigger page size
the existing code prohibits mapping non-MSIX parts of a page with MSIX
structures so these parts have to be emulated via slow reads/writes on
a VFIO device fd. If these emulated bits are accessed often, this has
serious impact on performance.

This allows mmap of the entire BAR containing MSIX vector table.

This removes the sparse capability for PCI devices as it becomes useless.

As the userspace needs to know for sure whether mmapping of the MSIX
vector containing data can succeed, this adds a new capability -
VFIO_REGION_INFO_CAP_MSIX_MAPPABLE - which explicitly tells the userspace
that the entire BAR can be mmapped.

This does not touch the MSIX mangling in the BAR read/write handlers as
we are doing this just to enable direct access to non MSIX registers.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw - fixup whitespace, trim function name]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio: Simplify capability helper · dda01f78

Committed by Alex Williamson on Dec 12, 2017

The vfio_info_add_capability() helper requires the caller to pass a
capability ID, which it then uses to fill in header fields, assuming
hard coded versions.  This makes for an awkward and rigid interface.
The only thing we want this helper to do is allocate sufficient
space in the caps buffer and chain this capability into the list.
Reduce it to that simple task.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio-pci: Mask INTx if a device is not capabable of enabling it · 2170dd04

Committed by Alexey Kardashevskiy on Dec 07, 2017

At the moment VFIO rightfully assumes that INTx is supported if
the interrupt pin is not set to zero in the device config space.
However if that is not the case (the pin is not zero but pdev->irq is),
vfio_intx_enable() fails.

In order to prevent the userspace from trying to enable INTx when we know
that it cannot work, let's mask the PCI_INTERRUPT_PIN register.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

03 Oct, 2017 2 commits

vfio/pci: Virtualize Maximum Read Request Size · cf0d53ba

Committed by Alex Williamson on Oct 02, 2017

MRRS defines the maximum read request size a device is allowed to
make.  Drivers will often increase this to allow more data transfer
with a single request.  Completions to this request are bound by the
MPS setting for the bus.  Aside from device quirks (none known), it
doesn't seem to make sense to set an MRRS value less than MPS, yet
this is a likely scenario given that user drivers do not have a
system-wide view of the PCI topology.  Virtualize MRRS such that the
user can set MRRS >= MPS, but use MPS as the floor value that we'll
write to hardware.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
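The "MPS as the floor" rule from this commit message can be sketched as a one-line clamp. This is an illustration only: the function name mrrs_for_hardware is hypothetical, and sizes are given in bytes for readability, whereas real code would work on the encoded fields of the PCIe Device Control register.

```c
/*
 * Hypothetical helper, sizes in bytes. The user's requested MRRS can
 * be kept as-is in the virtualized register, but the value written to
 * hardware is never allowed to drop below MPS.
 */
unsigned int mrrs_for_hardware(unsigned int user_mrrs, unsigned int mps)
{
    return user_mrrs < mps ? mps : user_mrrs;
}
```

With an MPS of 256 bytes, a user request of 128 bytes is raised to 256 for the hardware write, while a request of 512 bytes passes through unchanged.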

vfio/pci: Virtualize Maximum Payload Size · 52318497

Committed by Alex Williamson on Oct 02, 2017

With virtual PCI-Express chipsets, we now see userspace/guest drivers
trying to match the physical MPS setting to a virtual downstream port.
Of course a lone physical device surrounded by virtual interconnects
cannot make a correct decision for a proper MPS setting. Instead,
let's virtualize the MPS control register so that writes through to
hardware are disallowed. Userspace drivers like QEMU assume they can
write anything to the device and we'll filter out anything dangerous.
Since mismatched MPS can lead to AER and other faults, let's add it
to the kernel side rather than relying on userspace virtualization to
handle it.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

28 Jul, 2017 1 commit

vfio/pci: Fix handling of RC integrated endpoint PCIe capability size · 796b7550

Committed by Alex Williamson on Jul 27, 2017

Root complex integrated endpoints do not have a link and therefore may
use a smaller PCIe capability in config space than we expect when
building our config map.  Add a case for these to avoid reporting an
erroneous overlap.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

27 Jul, 2017 1 commit

vfio/pci: Use pci_try_reset_function() on initial open · 9f478035

Committed by Alex Williamson on Jul 26, 2017

Device lock bites again; if a device .remove() callback races a user
calling ioctl(VFIO_GROUP_GET_DEVICE_FD), the unbind request will hold
the device lock, but the user ioctl may have already taken a vfio_device
reference. In the case of a PCI device, the initial open will attempt
to reset the device, which again attempts to get the device lock,
resulting in deadlock. Use the trylock PCI reset interface and return
error on the open path if reset fails due to lock contention.

Link: https://lkml.org/lkml/2017/7/25/381
Reported-by: Wen Congyang <wencongyang2@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

13 Jun, 2017 1 commit

vfio/pci: Add Intel XXV710 to hidden INTx devices · 7d57e5e9

Committed by Alex Williamson on Jun 13, 2017

XXV710 has the same broken INTx behavior as the rest of the X/XL710
series, the interrupt status register is not wired to report pending
INTx interrupts, thus we never associate the interrupt to the device.
Extend the device IDs to include these so that we hide that the
device supports INTx at all to the user.

Reported-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

04 Jan, 2017 1 commit

vfio-pci: Handle error from pci_iomap · e19f32da

Committed by Arvind Yadav on Jan 03, 2017

pci_iomap() can fail here. Handle this case by releasing the selected
PCI regions and returning -ENOMEM.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

30 Dec, 2016 1 commit

vfio-pci: use 32-bit comparisons for register address for gcc-4.5 · 45e86971

Committed by Arnd Bergmann on Dec 30, 2016

Using ancient compilers (gcc-4.5 or older) on ARM, we get a link
failure with the vfio-pci driver:

ERROR: "__aeabi_lcmp" [drivers/vfio/pci/vfio-pci.ko] undefined!

The reason is that the compiler tries to do a comparison of
a 64-bit range. This changes it to convert to a 32-bit number
explicitly first, as newer compilers do for themselves.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

13 Dec, 2016 1 commit

PCI: Move config space size macros to pci_regs.h · cc10385b

Committed by Wang Sheng-Hui on Sep 22, 2016

Move PCI configuration space size macros (PCI_CFG_SPACE_SIZE and
PCI_CFG_SPACE_EXP_SIZE) from drivers/pci/pci.h to
include/uapi/linux/pci_regs.h so they can be used by more drivers and
eliminate duplicate definitions.

[bhelgaas: Expand comment to include PCI-X details]
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

19 Nov, 2016 1 commit

vfio/pci: Drop unnecessary pcibios_err_to_errno() · f4cb4100

Committed by Cao jin on Nov 18, 2016

As of commit d97ffe23 ("PCI: Fix return value from
pci_user_{read,write}_config_*()") it's unnecessary to call
pcibios_err_to_errno() to fixup the return value from these functions.

pcibios_err_to_errno() already does simple passthrough of -errno values,
therefore no functional change is expected.

[aw: changelog]
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

17 Nov, 2016 2 commits

vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare() · ef198aaa

Committed by Kirti Wankhede on Nov 17, 2016

Update the vfio_pci.c file to use vfio_set_irqs_validate_and_prepare().

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Neo Jia <cjia@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio_pci: Update vfio_pci to use vfio_info_add_capability() · c535d345

Committed by Kirti Wankhede on Nov 17, 2016

Update msix_sparse_mmap_cap() to use vfio_info_add_capability().
Update region type capability to use vfio_info_add_capability().

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Neo Jia <cjia@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

27 Oct, 2016 1 commit

vfio/pci: Fix integer overflows, bitmask check · 05692d70

Committed by Vlad Tsyrklevich on Oct 12, 2016

The VFIO_DEVICE_SET_IRQS ioctl did not sufficiently sanitize
user-supplied integers, potentially allowing memory corruption. This
patch adds appropriate integer overflow checks, checks the range bounds
for VFIO_IRQ_SET_DATA_NONE, and also verifies that only single element
in the VFIO_IRQ_SET_DATA_TYPE_MASK bitmask is set.
VFIO_IRQ_SET_ACTION_TYPE_MASK is already correctly checked later in
vfio_pci_set_irqs_ioctl().

Furthermore, a kzalloc is changed to a kcalloc because the use of a
kzalloc with an integer multiplication allowed an integer overflow
condition to be reached without this patch. kcalloc checks for overflow
and should prevent a similar occurrence.

Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
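The two checks described in this commit message, that exactly one bit is set within a mask and that an element count cannot overflow the allocation size, can be sketched in userspace as follows. The helper names exactly_one_bit and alloc_array_checked are hypothetical, and calloc() stands in for the kernel's kcalloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns nonzero when exactly one bit of flags is set within mask. */
int exactly_one_bit(uint32_t flags, uint32_t mask)
{
    uint32_t v = flags & mask;

    /* v & (v - 1) clears the lowest set bit; zero means only one was set. */
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Allocate count zeroed elements of elem_size bytes, failing cleanly
 * instead of wrapping around when count * elem_size does not fit in
 * size_t. The explicit test makes the overflow check visible.
 */
void *alloc_array_checked(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;
    return calloc(count, elem_size);
}
```

A multiplication such as count * elem_size passed to a single-size allocator can wrap to a small value, so the buffer ends up shorter than the code that fills it expects. Checking before multiplying avoids that.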

30 Sep, 2016 1 commit

vfio_pci: use pci_alloc_irq_vectors · 61771468

Committed by Christoph Hellwig on Sep 11, 2016

Simplify the interrupt setup by using the new PCI layer helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

27 Sep, 2016 2 commits

vfio-pci: Disable INTx after MSI/X teardown · c93a97ee

Committed by Alex Williamson on Sep 26, 2016

The MSI/X shutdown path can gratuitously enable INTx, which is not
something we want to happen if we're dealing with a broken INTx device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio-pci: Virtualize PCIe & AF FLR · ddf9dc0e

Committed by Alex Williamson on Sep 26, 2016

We use a BAR restore trick to try to detect when a user has performed
a device reset, possibly through FLR or other backdoors, to put things
back into a working state. This is important for backdoor resets, but
we can actually just virtualize the "front door" resets provided via
PCIe and AF FLR. Set these bits as virtualized + writable, allowing
the default write to set them in vconfig, then we can simply check the
bit, perform an FLR of our own, and clear the bit. We don't actually
have the granularity in PCI to specify the type of reset we want to
do, but generally devices don't implement both PCIe and AF FLR and
we'll favor these over other types of reset, so we should generally
line up. We do test whether the device provides the requested FLR type
to stay consistent with hardware capabilities though.

This seems to fix several instances of devices getting into bad states
with userspace drivers, like dpdk, running inside a VM.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Greg Rose <grose@lightfleet.com>

30 Aug, 2016 1 commit

vfio/pci: Fix typos in comments · 8138dabb

Committed by Wei Jiangang on Aug 17, 2016

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

09 Aug, 2016 1 commit

vfio/pci: Fix NULL pointer oops in error interrupt setup handling · c8952a70

Committed by Alex Williamson on Aug 08, 2016

There are multiple cases in vfio_pci_set_ctx_trigger_single() where
we assume we can safely read from our data pointer without actually
checking whether the user has passed any data via the count field.
VFIO_IRQ_SET_DATA_NONE in particular is entirely broken since we
attempt to pull an int32_t file descriptor out before even checking
the data type.  The other data types assume the data pointer contains
one element of their type as well.

In part this is good news because we were previously restricted from
doing much sanitization of parameters because it was missed in the
past and we didn't want to break existing users.  Clearly DATA_NONE
is completely broken, so it must not have any users and we can fix
it up completely.  For DATA_BOOL and DATA_EVENTFD, we'll just
protect ourselves, returning error when count is zero since we
previously would have oopsed.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Chris Thompson <the_cartographer@hotmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Eric Auger <eric.auger@redhat.com>

09 Jul, 2016 1 commit

vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive · 05f0c03f

Committed by Yongji Xie on Jun 30, 2016

The current vfio-pci implementation disallows mmap of sub-page
(size < PAGE_SIZE) MMIO BARs because the MMIO page of such a BAR
may be shared with other BARs. This causes performance issues
when we pass through a PCI device with this kind of BAR: the
guest cannot handle MMIO accesses to these BARs directly, which
leads to MMIO emulation in the host.

However, not all sub-page BARs share a page with other BARs.
We should allow mmap of the sub-page MMIO BARs that we can
ensure do not share a page with other BARs.

This patch adds support for this case. We also try to add a
dummy resource to reserve the remainder of the page, into which
a hot-added device's BAR might otherwise be assigned. It is not
necessary to handle the case where the BAR is not page aligned,
because we cannot expect the BAR to be assigned to the same
location within a page in the guest when we pass through the BAR,
and it is hard to access such a BAR in userspace because we have
no way to get the BAR's location within a page.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

01 Jun, 2016 1 commit

vfio/pci: Allow VPD short read · ce7585f3

Committed by Alex Williamson on May 31, 2016

The size of the VPD area is not necessarily 4-byte aligned, so a
pci_vpd_read() might return less than 4 bytes. Zero our buffer and
accept anything other than an error. Intel X710 NICs exercise this.

Fixes: 4e1a6355 ("vfio/pci: Use kernel VPD access functions")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
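The "zero the buffer and accept anything other than an error" approach can be sketched in userspace as below. The struct vpd_area type and both functions are hypothetical stand-ins: vpd_area_read() mimics a reader that may return fewer bytes than requested, and it is not the kernel's pci_read_vpd() interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory VPD area whose size need not be 4-byte aligned. */
struct vpd_area {
    const uint8_t *data;
    size_t size;
};

/* Returns the number of bytes copied (possibly fewer than len) or -1. */
long vpd_area_read(const struct vpd_area *area, size_t pos, size_t len,
                   void *buf)
{
    size_t avail;

    if (pos > area->size)
        return -1;
    avail = area->size - pos;
    if (len > avail)
        len = avail;
    memcpy(buf, area->data + pos, len);
    return (long)len;
}

/*
 * Read one 32-bit word at pos. The word is zeroed first, so a short
 * read near the end of the area leaves the missing bytes as zero.
 * Returns 0 on success or the negative error from the reader.
 */
int read_vpd_word(const struct vpd_area *area, size_t pos, uint8_t out[4])
{
    long ret;

    memset(out, 0, 4);
    ret = vpd_area_read(area, pos, 4, out);
    return ret < 0 ? (int)ret : 0;
}
```

For a 6-byte area, reading the word at offset 4 succeeds and yields the last two bytes followed by two zero bytes, where a strict "must return 4" check would have failed.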

30 May, 2016 1 commit

vfio/pci: Fix ordering of eventfd vs virqfd shutdown · 956b56a9

Committed by Alex Williamson on May 30, 2016

Both the INTx and MSI/X disable paths do an eventfd_ctx_put() for the
trigger eventfd before calling vfio_virqfd_disable() on any potential
mask and unmask eventfds.  This opens a use-after-free race where an
inopportune irqfd can reference the freed signalling eventfd.  Reorder
to avoid this possibility.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

20 May, 2016 1 commit

vfio_pci: Test for extended capabilities if config space > 256 bytes · f7055280

Committed by Alexey Kardashevskiy on Apr 29, 2016

PCI-Express spec says that reading 4 bytes at offset 100h should return
zero if there is no extended capability so VFIO reads this dword to
know if there are extended capabilities.

However it is not always possible to access the extended space so
generic PCI code in pci_cfg_space_size_ext() checks if
pci_read_config_dword() can read beyond 100h and if the check fails,
it sets the config space size to 100h.

VFIO does its own extended capabilities check by reading at offset 100h
which may produce 0xffffffff which VFIO treats as the extended config
space presence and calls vfio_ecap_init() which fails to parse
capabilities (which is expected) but right before the exit, it writes
zero at offset 100h which is beyond the buffer allocated for
vdev->vconfig (which is 256 bytes) which leads to random memory
corruption.

This makes VFIO only check for the extended capabilities if
the discovered config size is more than 256 bytes.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
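The guard described in this commit message can be sketched as a small predicate. The macro and function names are assumptions for this example. In real code the dword at offset 100h would only be read after the size check, whereas here it is passed in as a parameter to keep the sketch self-contained.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_SPACE_SIZE 256 /* size of conventional PCI config space */

/*
 * Hypothetical check: trust the dword at offset 0x100 only when the
 * discovered config space is larger than 256 bytes. With a 256-byte
 * config space, a read at 0x100 may return 0xffffffff, which must not
 * be mistaken for an extended capability header.
 */
int has_ext_caps(size_t cfg_size, uint32_t dword_at_100h)
{
    if (cfg_size <= CFG_SPACE_SIZE)
        return 0;
    return dword_at_100h != 0;
}
```

The size check comes first so that nothing beyond a 256-byte shadow buffer is ever parsed or written.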

29 Apr, 2016 2 commits

vfio/pci: Add test for BAR restore · dc928109

Committed by Alex Williamson on Mar 24, 2016

If a device is reset without the memory or i/o bits enabled in the
command register we may not detect it, potentially leaving the device
without valid BAR programming. Add an additional test to check the
BARs on each write to the command register.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Hide broken INTx support from user · 45074405

Committed by Alex Williamson on Mar 24, 2016

INTx masking has two components, the first is that we need the ability
to prevent the device from continuing to assert INTx. This is
provided via the DisINTx bit in the command register and is the only
thing we can really probe for when testing if INTx masking is
supported. The second component is that the device needs to indicate
if INTx is asserted via the interrupt status bit in the device status
register. With these two features we can generically determine if one
of the devices we own is asserting INTx, signal the user, and mask the
interrupt while the user services the device.

Generally if one or both of these components is broken we resort to
APIC level interrupt masking, which requires an exclusive interrupt
since we have no way to determine the source of the interrupt in a
shared configuration. This often makes it difficult or impossible to
configure the system for userspace use of the device, for an interrupt
mode that the user may not need.

One possible configuration of broken INTx masking is that the DisINTx
support is fully functional, but the interrupt status bit never
signals interrupt assertion. In this case we do have the ability to
prevent the device from asserting INTx, but lack the ability to
identify the interrupt source. For this case we can simply pretend
that the device lacks INTx support entirely, keeping DisINTx set on
the physical device, virtualizing this bit for the user, and
virtualizing the interrupt pin register to indicate no INTx support.
We already support virtualization of the DisINTx bit and already
virtualize the interrupt pin for platforms without INTx support. By
tying these components together, setting DisINTx on open and reset,
and identifying devices broken in this particular way, we can provide
support for them w/o the handicap of APIC level INTx masking.

Intel i40e (XL710/X710) 10/20/40GbE NICs have been identified as being
broken in this specific way. We leave the vfio-pci.nointxmask option
as a mechanism to bypass this support, enabling INTx on the device
with all the requirements of APIC level masking.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>

28 Feb, 2016 1 commit

vfio: fix ioctl error handling · 8160c4e4

Committed by Michael S. Tsirkin on Feb 28, 2016

Calling return copy_to_user(...) in an ioctl will not
do the right thing if there's a pagefault:
copy_to_user returns the number of bytes not copied
in this case.

Fix up vfio to do
	return copy_to_user(...) ?
		-EFAULT : 0;

everywhere.

Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
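The return-value pattern from this commit message can be tried out in userspace with a stand-in for copy_to_user(). Everything below is a sketch under stated assumptions: mock_copy_to_user() and report_value() are hypothetical names, and a NULL destination is used to simulate a fault.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for copy_to_user(): returns the number of bytes
 * that could NOT be copied, so 0 means full success. A NULL
 * destination simulates a page fault here.
 */
unsigned long mock_copy_to_user(void *to, const void *from, unsigned long n)
{
    if (to == NULL)
        return n;
    memcpy(to, from, n);
    return 0;
}

/* Map any nonzero "bytes not copied" result to -EFAULT. */
int report_value(void *user_buf, int value)
{
    return mock_copy_to_user(user_buf, &value, sizeof(value)) ? -EFAULT : 0;
}
```

Returning the raw result would hand a positive byte count back to the caller on failure, which an ioctl caller would not interpret as an error.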

26 Feb, 2016 1 commit

vfio/pci: return -EFAULT if copy_to_user fails · c4aec310

Committed by Dan Carpenter on Feb 25, 2016

The copy_to_user() function returns the number of bytes that were not
copied but we want to return -EFAULT on error here.

Fixes: 188ad9d6 ('vfio/pci: Include sparse mmap capability for MSI-X table regions')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

23 Feb, 2016 1 commit

vfio/pci: Expose shadow ROM as PCI option ROM · a13b6459

Committed by Alex Williamson on Feb 22, 2016

Integrated graphics may have their ROM shadowed at 0xc0000 rather than
implement a PCI option ROM.  Make this ROM appear to the user using
the ROM BAR.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

openanolis / cloud-kernel synced successfully almost 2 years ago
