Commits · 8a6b1bc70dbb538cb8a39e8c5be9c3dfd7b1f40e · openanolis / cloud-kernel

June 20, 2013 · 40 commits

powerpc/eeh: EEH core to handle special event · 8a6b1bc7

Committed by Gavin Shan on June 20, 2013

On the PowerNV platform, an EEH event caused by an interrupt has no
bound PE. The patch enables the EEH core to handle this special event.
To leave the existing logic untouched, eeh_handle_event() is renamed
to eeh_handle_normal_event() and eeh_handle_special_event() is
introduced. The function eeh_handle_event() dispatches to those two
functions according to its input parameter. In addition, a new backend
"next_error" is added to eeh_ops; it is expected to return one of the
following values:

        4 - Dead IOC           3 - Dead PHB
        2 - Fenced PHB         1 - Frozen PE
        0 - No error found
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
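
The dispatch described in this commit can be sketched in plain C as follows. This is only an illustration, not the kernel code: the struct and enumerator names are hypothetical, and it assumes that the input parameter is the PE pointer and that a missing PE selects the special path. Only the numeric return values of next_error come from the changelog.

```c
#include <stddef.h>

/* Values listed in the changelog for the next_error backend.
 * The enumerator names are hypothetical. */
enum next_error_code {
    NEXT_ERR_NONE       = 0,
    NEXT_ERR_FROZEN_PE  = 1,
    NEXT_ERR_FENCED_PHB = 2,
    NEXT_ERR_DEAD_PHB   = 3,
    NEXT_ERR_DEAD_IOC   = 4
};

struct pe { int id; };             /* stand-in for struct eeh_pe */
struct event { struct pe *pe; };   /* stand-in for an EEH event */

enum handler { HANDLER_NORMAL, HANDLER_SPECIAL };

/* Assumption: an event with a bound PE takes the normal path, and an
 * event without one (interrupt-driven on PowerNV) takes the special path. */
enum handler dispatch_event(const struct event *ev)
{
    return (ev != NULL && ev->pe != NULL) ? HANDLER_NORMAL : HANDLER_SPECIAL;
}

/* Map a next_error return value to the description in the changelog. */
const char *next_error_name(int code)
{
    switch (code) {
    case NEXT_ERR_DEAD_IOC:   return "Dead IOC";
    case NEXT_ERR_DEAD_PHB:   return "Dead PHB";
    case NEXT_ERR_FENCED_PHB: return "Fenced PHB";
    case NEXT_ERR_FROZEN_PE:  return "Frozen PE";
    case NEXT_ERR_NONE:       return "No error found";
    default:                  return "Unknown";
    }
}
```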

powerpc/eeh: Export confirm_error_lock · 4907581d

Committed by Gavin Shan on June 20, 2013

An EEH event is created and queued to the event queue for each
incoming EEH error. When there are multiple EEH errors, we need to
serialize their processing to keep the PE state (flags) consistent.
The spinlock "confirm_error_lock" was introduced for that purpose.
On the PowerNV platform, EEH events will be injected from the
error-reporting interrupts, so the spinlock is exported to let that
path keep the PE state consistent as well.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Allow to purge EEH events · 99866595

Committed by Gavin Shan on June 20, 2013

On the PowerNV platform, we might run into the situation where
subsequent events duplicate an earlier one that is still being
processed. For that case, the patch implements a function to purge
the duplicate EEH events.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
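
Purging queued events for one PE might look roughly like the sketch below. It is a user-space illustration under assumptions: the queue is modeled as a singly linked list, all names are hypothetical, and the locking that the kernel would need around the queue is left out.

```c
#include <stddef.h>

struct pe { int id; };          /* stand-in for struct eeh_pe */

struct event {                  /* stand-in for a queued EEH event */
    struct pe *pe;
    struct event *next;
};

/* Unlink every queued event bound to `pe` and return how many were
 * removed. The caller keeps ownership of the unlinked events. */
size_t purge_events(struct event **head, const struct pe *pe)
{
    size_t removed = 0;
    struct event **link = head;

    while (*link != NULL) {
        if ((*link)->pe == pe) {
            struct event *victim = *link;
            *link = victim->next;   /* bypass the duplicate */
            victim->next = NULL;
            removed++;
        } else {
            link = &(*link)->next;
        }
    }
    return removed;
}
```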

powerpc/eeh: Trace time on first error for PE · 5a71978e

Committed by Gavin Shan on June 20, 2013

A given PE is not expected to be frozen more than 5 times within
the last hour. If it is, the PE is removed from the system when a
new EEH error arrives. The patch introduces a timestamp that records
the first error on a PE within the last hour, and a function to
update it. The timestamp is also restored in the PE hotplug path,
as is already done for the freeze count.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
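
The "no more than 5 freezes in the last hour" rule can be sketched as a small windowed counter. This is a simplified model, not the kernel function: the struct, the function name and the way the current time is passed in are assumptions made for illustration.

```c
#include <stdbool.h>
#include <time.h>

#define MAX_FREEZES     5      /* limit stated in the changelog */
#define WINDOW_SECONDS  3600   /* "last hour" */

/* Hypothetical per-PE bookkeeping: freeze count plus the timestamp of
 * the first error in the current window. */
struct pe_stats {
    int freeze_count;
    time_t first_error;
};

/* Record one freeze at time `now`. Returns true while the PE is still
 * within the limit, false once it has exceeded it. */
bool pe_record_freeze(struct pe_stats *pe, time_t now)
{
    /* Window expired: forget the old errors. */
    if (pe->freeze_count > 0 && now - pe->first_error > WINDOW_SECONDS)
        pe->freeze_count = 0;

    /* First error of a new window: remember when it happened. */
    if (pe->freeze_count == 0)
        pe->first_error = now;

    pe->freeze_count++;
    return pe->freeze_count <= MAX_FREEZES;
}
```

Passing `now` as a parameter keeps the sketch deterministic; real code would read a clock instead.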

powerpc/eeh: Single kthread to handle events · c8608558

Committed by Gavin Shan on June 20, 2013

We may have multiple kthreads running for multiple EEH errors
(events), with one spinlock serializing the handling of those
events. That is unnecessary, so the patch creates a single kthread,
started during EEH core initialization in eeh_init(). A new
semaphore counts the EEH events in the queue, and the kthread
waits on that semaphore.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Delay EEH probe during hotplug · 26a74850

Committed by Gavin Shan on June 20, 2013

During EEH recovery, the PCI devices of the problematic PE are
removed from the system and then added again. During this
so-called hotplug event, the PCI devices of the problematic PE
are probed in an early and a late phase. On the PowerNV platform
the EEH probe is delayed to the late phase because the PCI device
isn't available in the early phase.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Refactor eeh_reset_pe_once() · 326a98ea

Committed by Gavin Shan on June 20, 2013

We shouldn't check that the returned PE status is exactly equal to
(EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE) but instead only check
that they are both set.

[benh: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
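
The difference between the two checks can be shown with a short sketch. The bit positions below are illustrative assumptions and the function names are hypothetical; only the two flag names come from the changelog.

```c
#include <stdbool.h>

/* Illustrative bit values; the real definitions live in the kernel's
 * EEH header and may differ. */
#define EEH_STATE_MMIO_ACTIVE   (1 << 3)
#define EEH_STATE_DMA_ACTIVE    (1 << 4)
#define EEH_STATE_MMIO_ENABLED  (1 << 5)   /* an unrelated extra flag */

/* Old-style check: fails as soon as any other flag is also set. */
bool pe_active_exact(int state)
{
    return state == (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
}

/* Check described by the changelog: both bits set, others ignored. */
bool pe_active_both_set(int state)
{
    int mask = EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE;
    return (state & mask) == mask;
}
```

With a state of MMIO_ACTIVE | DMA_ACTIVE | MMIO_ENABLED, the exact comparison rejects the PE while the mask comparison accepts it.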

powerpc/eeh: EEH post initialization operation · 21fd21f5

Committed by Gavin Shan on June 20, 2013

The patch adds a new EEH operation, post_init. It is used to notify
the platform that the EEH core has completed the EEH probe. From
that point on, the PowerNV platform starts to use the services
supplied by the EEH functionality.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Make eeh_init() public · 51fb5f56

Committed by Gavin Shan on June 20, 2013

For EEH on PowerNV platform, we will do EEH probe based on the
real PCI devices. The PCI devices are available after PCI probe.
So we have to call eeh_init() explicitly on PowerNV platform
after PCI probe. The patch also does EEH probe for PowerNV platform
in eeh_init().
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Trace PCI bus from PE · 8cdb2833

Committed by Gavin Shan on June 20, 2013

Several types of PE are supported for now: PHB, bus and device
dependent PEs. For a PCI bus dependent PE, tracing the
corresponding PCI bus from the PE (struct eeh_pe) makes the code
more efficient. The patch also enables retrieval of the PCI bus
from a PCI bus dependent PE.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Make eeh_pe_get() public · 01566808

Committed by Gavin Shan on June 20, 2013

While processing an EEH event interrupt from P7IOC, we need a
function to retrieve the PE for the indicated EEH device. The patch
makes eeh_pe_get() public so that other source files can call
it for that purpose. The patch also fixes a reference to the wrong
BDF (Bus/Device/Function) address while searching for the PE in
__eeh_pe_get().
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Make eeh_phb_pe_get() public · 9ff67433

Committed by Gavin Shan on June 20, 2013

One of the possible cases indicated by a P7IOC interrupt is a fenced
PHB. In that case, we need to fetch the PE corresponding to the PHB,
disable the PHB and all subordinate PCI buses/devices, recover
from the fenced state and eventually re-enable the whole PHB. That
requires a function to fetch the PHB PE from outside eeh_pe.c, so
the patch makes eeh_phb_pe_get() public.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Move common part to kernel directory · 317f06de

Committed by Gavin Shan on June 20, 2013

The patch moves the common part of the EEH core into the
arch/powerpc/kernel directory so that PPC_PSERIES is no longer
needed when compiling for the POWERNV platform:

        * Move the EEH common part into arch/powerpc/kernel
        * Move the functions for PCI hotplug from pSeries platform to
          arch/powerpc/kernel/pci-hotplug.c
        * Move CONFIG_EEH from arch/powerpc/platforms/pseries/Kconfig to
          arch/powerpc/platforms/Kconfig
        * Adjust makefile accordingly
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Cleanup for EEH core · a84f273c

Committed by Gavin Shan on June 20, 2013

Clean up the EEH core by removing unnecessary whitespace.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/tm: Fix return of active 64bit signals · 87b4e539

Committed by Michael Neuling on June 9, 2013

Currently we only restore signals which are transactionally suspended but it's
possible that the transaction can be restored even when it's active. Most
likely this will result in a transactional rollback by the hardware as the
transaction will have been doomed by an earlier treclaim.

The current code is a legacy of earlier kernel implementations which did
software rollback of active transactions in the kernel. That code has now gone
but we didn't correctly fix up this part of the signals code which still makes
assumptions based on having software rollback.

This changes the signal return code to always restore both contexts on 64 bit
signal return. It also ensures that the MSR TM bits are properly restored from
the signal context which they are not currently.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/tm: Fix return of 32bit rt signals to active transactions · 55e43418

Committed by Michael Neuling on June 9, 2013

This changes the signal return code to always restore both contexts on 32 bit
rt signal return.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/tm: Fix restoration of MSR on 32bit signal return · 2c27a18f

Committed by Michael Neuling on June 9, 2013

Currently we clear out the MSR TM bits on signal return assuming that the
signal should never return to an active transaction.

This is bogus as the user may do this. It's most likely the transaction will
be doomed due to a treclaim but that's a problem for the HW not the kernel.

This pulls out both MSR TM bits from the user supplied context rather than just
setting TM suspend. We pull out only the bits needed to ensure the user can't
do anything dangerous to the MSR.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
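
Taking only the TM state bits from the user-supplied context can be sketched with a mask. The bit positions (33 and 34 for the two transaction-state bits) and the function name are assumptions made for illustration, and this is not the kernel's signal code.

```c
#include <stdint.h>

/* Assumed positions of the two MSR transaction-state bits. */
#define MSR_TS_S     (1ULL << 33)   /* transaction suspended */
#define MSR_TS_T     (1ULL << 34)   /* transaction active */
#define MSR_TS_MASK  (MSR_TS_S | MSR_TS_T)

/* Merge the TM state bits from the user-supplied upper MSR word into
 * the kernel's MSR value. Every other bit the user supplied is
 * discarded, so the user cannot change anything else in the MSR. */
uint64_t restore_tm_bits(uint64_t kernel_msr, uint32_t user_msr_hi)
{
    uint64_t user = (uint64_t)user_msr_hi << 32;

    return (kernel_msr & ~MSR_TS_MASK) | (user & MSR_TS_MASK);
}
```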

powerpc/tm: Fix 32 bit non-rt signals · fee55450

Committed by Michael Neuling on June 9, 2013

Currently sys_sigreturn() is TM unaware.  Therefore, if we take a 32 bit signal
without SIGINFO (non RT) inside a transaction, on signal return we don't
restore the signal frame correctly.

This checks if the signal frame being restored is an active transaction, and
if so, it copies the additional state to ptregs so it can be restored.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/tm: Fix writing top half of MSR on 32 bit signals · 1d25f11f

Committed by Michael Neuling on June 9, 2013

The MSR TM controls are in the top 32 bits of the MSR hence on 32 bit signals,
we stick the top half of the MSR in the checkpointed signal context so that the
user can access it.

Unfortunately, we don't currently write anything to the checkpointed signal
context when coming in from a non-transactional process and hence the top MSR
bits can contain junk.

This updates the 32 bit signal handling code to always write something to the
top MSR bits so that users know if the process is transactional or not and the
kernel can use it on signal return.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
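
The idea of always writing the upper MSR word can be sketched as below. The bit positions and helper names are assumptions for illustration only. The point is that a non-transactional process yields a well-defined zero in the TM bits instead of junk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed positions of the MSR transaction-state bits (upper word). */
#define MSR_TS_S  (1ULL << 33)
#define MSR_TS_T  (1ULL << 34)

/* Upper 32 bits of the 64-bit MSR, which a 32-bit signal frame cannot
 * hold in its regular MSR slot. */
uint32_t msr_top_half(uint64_t msr)
{
    return (uint32_t)(msr >> 32);
}

/* True if the stored upper word says the process was in a transaction
 * (suspended or active). */
bool top_half_is_transactional(uint32_t hi)
{
    uint32_t ts_mask = (uint32_t)((MSR_TS_S | MSR_TS_T) >> 32);

    return (hi & ts_mask) != 0;
}
```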

powerpc/8xx: Remove 8xx specific "minimal FPU emulation" · 968219fa

Committed by Benjamin Herrenschmidt on June 9, 2013

This is duplicated code from math-emu and implements such a small
subset of the FPU (load/stores/fmr) that it's essentially pointless
nowadays.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/math-emu: Allow math-emu to be used for HW FPU · 4e63f8ed

Committed by Benjamin Herrenschmidt on June 9, 2013

(Including 64-bit ones)

This allows SW emulation by the kernel of optional instructions
such as fsqrt which aren't implemented on some processors, and
thus fixes some Fedora 19 issues, such as with Anaconda, since the
compiler is set to generate those by default on 64-bit.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/math-emu: Fix decoding of some instructions · 04ae9001

Committed by Benjamin Herrenschmidt on June 9, 2013

The decoding of some instructions such as fsqrt{s} was incorrect,
using the wrong registers, and thus could not work.

This fixes it and also adds a couple of placeholders for missing
instructions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
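
To make the register-decoding issue concrete, here is a sketch of extracting the register fields of an A-form floating-point instruction. It is not the math-emu code. The struct and function names are hypothetical, and the field layout (FRT, FRA, FRB, FRC as 5-bit fields, with fsqrt reading its source from FRB) reflects the Power ISA as understood here.

```c
#include <stdint.h>

/* Decoded fields of an A-form instruction word. */
struct a_form {
    unsigned opcode;  /* primary opcode, top 6 bits */
    unsigned frt;     /* target register */
    unsigned fra;     /* first source (unused by fsqrt) */
    unsigned frb;     /* second source (the fsqrt operand) */
    unsigned frc;     /* third source (unused by fsqrt) */
    unsigned xo;      /* 5-bit extended opcode */
};

struct a_form decode_a_form(uint32_t insn)
{
    struct a_form f;

    f.opcode = insn >> 26;
    f.frt = (insn >> 21) & 0x1f;
    f.fra = (insn >> 16) & 0x1f;
    f.frb = (insn >> 11) & 0x1f;
    f.frc = (insn >> 6) & 0x1f;
    f.xo  = (insn >> 1) & 0x1f;
    return f;
}
```

Decoding a word built for "fsqrt f1,f2" should yield frt = 1 and frb = 2. An emulator that read the operand from fra or frc instead would compute on the wrong register.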

powerpc/pseries: Read common partition via pstore · a5e4797b

Committed by Aruna Balakrishnaiah on June 6, 2013

This patch exploits pstore subsystem to read details of common partition
in NVRAM to a separate file in /dev/pstore. For instance, common partition
details will be stored in a file named [common-nvram-6].
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pseries: Read of-config partition via pstore · f33f748c

Committed by Aruna Balakrishnaiah on June 6, 2013

This patch set exploits the pstore subsystem to read details of
of-config partition in NVRAM to a separate file in /dev/pstore.
For instance, of-config partition details will be stored in a
file named [of-nvram-5].
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pseries: Distinguish between a os-partition and non-os partition · edf38465

Committed by Aruna Balakrishnaiah on June 6, 2013

Introduce os_partition member in nvram_os_partition structure to identify
if the partition is an os partition or not. This will be useful to handle
non-os partitions of-config and common.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pseries: Read rtas partition via pstore · 69020eea

Committed by Aruna Balakrishnaiah on June 6, 2013

This patch set exploits the pstore subsystem to read details of rtas partition
in NVRAM to a separate file in /dev/pstore. For instance, rtas details will be
stored in a file named [rtas-nvram-4].
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pseries: Read/Write oops nvram partition via pstore · d7563c94

Committed by Aruna Balakrishnaiah on June 6, 2013

IBM's p series machines provide persistent storage for LPARs through NVRAM.
NVRAM's lnx,oops-log partition is used to log oops messages.
Currently the kernel provides the contents of p-series NVRAM only as a
simple stream of bytes via /dev/nvram, which must be interpreted in user
space by the nvram command in the powerpc-utils package.

This patch set exploits the pstore subsystem to expose oops partition in
NVRAM as a separate file in /dev/pstore. For instance, Oops messages will be
stored in a file named [dmesg-nvram-2]. In case pstore registration fails it
will fall back to kmsg_dump mechanism.

This patch will read/write the oops messages from/to this partition via pstore.
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pseries: Introduce generic read function to read nvram-partitions · 12674610

Committed by Aruna Balakrishnaiah on June 6, 2013

Introduce a generic read function to read nvram partitions other than rtas.
nvram_read_error_log, which rtasd uses to read the rtas partition, is
retained. nvram_read_partition is the generic function for reading from
any nvram partition.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pseries: Add version and timestamp to oops header · b1f70e1f

Committed by Aruna Balakrishnaiah on June 6, 2013

Introduce version and timestamp information in the oops header.
oops_log_info (oops header) holds version (to distinguish between old
and new format oops header), length of the oops text
(compressed or uncompressed) and timestamp.

The version field will sit in the same place as the length in old
headers. version is assigned 5000 (greater than oops partition size)
so that existing tools will refuse to dump new style partitions as
the length is too large. The updated tools will work with both
old and new format headers.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
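
The versioning trick can be sketched as follows. The sketch assumes that the shared field is a 16-bit big-endian value at the start of the header and uses a made-up partition size of 4000 bytes. The helper names are hypothetical, and only the value 5000 comes from the changelog.

```c
#include <stdbool.h>
#include <stdint.h>

#define OOPS_HDR_VERSION 5000   /* value given in the changelog */

/* Assumption: the first header field is 16 bits wide, big-endian. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Updated tools: a first field equal to the version marker means the
 * new header format. Anything else is treated as an old-style length. */
bool oops_header_is_new_format(const uint8_t *hdr)
{
    return read_be16(hdr) == OOPS_HDR_VERSION;
}

/* Old tools read the same field as a length and refuse anything that
 * cannot fit in the partition. */
bool old_tool_accepts(const uint8_t *hdr, uint16_t partition_size)
{
    return read_be16(hdr) <= partition_size;
}
```

Because 5000 exceeds the partition size, an old tool sees an impossible length and refuses to dump, while an updated tool recognizes the marker.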

powerpc/pseries: Remove syslog prefix in uncompressed oops text · 1bf247f8

Committed by Aruna Balakrishnaiah on June 6, 2013

Removal of syslog prefix in the uncompressed oops text will
help in capturing more oops data.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Enhance converting EEH dev · 2d5c1216

Committed by Gavin Shan on June 5, 2013

Under some special circumstances, an EEH device doesn't have an
associated device tree node or PCI device. The patch enhances the
functions that convert an EEH device to its device tree node or PCI
device so that those cases no longer crash the system.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
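
A NULL-tolerant conversion helper of the kind this commit describes might look like the sketch below. The struct layouts are simplified stand-ins and the function names are hypothetical, so this shows only the defensive pattern, not the kernel's definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct device_node { const char *name; };
struct pci_dev { int devfn; };

struct eeh_dev {
    struct device_node *dn;   /* may be NULL in special cases */
    struct pci_dev *pdev;     /* may be NULL in special cases */
};

/* Return the device tree node, or NULL if there is no EEH device or
 * no node attached to it. */
struct device_node *edev_to_of_node(struct eeh_dev *edev)
{
    return edev ? edev->dn : NULL;
}

/* Return the PCI device, or NULL if there is no EEH device or no PCI
 * device attached to it. */
struct pci_dev *edev_to_pci_dev(struct eeh_dev *edev)
{
    return edev ? edev->pdev : NULL;
}
```

Callers then test the returned pointer instead of dereferencing a missing EEH device.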

powerpc/eeh: Fix fetching bus for single-dev-PE · 5fb62169

Committed by Gavin Shan on June 5, 2013

While running Linux as a guest on top of phyp, we may have a
PE that includes a single PCI device. However, its PCI bus was not
returned correctly, which led to failure to recover from EEH
errors for a single-dev-PE. The patch fixes the issue.

Cc: <stable@vger.kernel.org> # v3.7+
Cc: Steve Best <sbest@us.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Align thread->fpr to 16 bytes · 475e68cf

Committed by Anton Blanchard on June 5, 2013

On newer CPUs we use VSX loads and stores to the thread->fpr array.
For best performance we need to ensure 16 byte alignment.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
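
Forcing 16-byte alignment on an array member can be sketched in standard C11 as below. The struct is a hypothetical stand-in for the thread structure, and the sketch uses alignas rather than whatever attribute the kernel patch itself uses.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the thread structure. The leading int would
 * otherwise leave the array at an offset that is not a multiple of 16. */
struct thread_sketch {
    int flags;
    alignas(16) double fpr[32][2];   /* 16 bytes per register slot */
};

/* Helper to check an address at run time. */
int is_16_byte_aligned(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```

The alignment requirement on the member also raises the alignment of the whole struct, so instances of it are placed on 16-byte boundaries.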

powerpc/pseries: Use 'true' instead of '1' for orderly_poweroff · 1b7e0cbe

Committed by liguang on May 30, 2013

orderly_poweroff expects a bool parameter, so
use 'true' instead of '1'.
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/smp: Use '==' instead of '<' for system_state · a5b45ded

Committed by liguang on May 30, 2013

'system_state < SYSTEM_RUNNING' has the same effect
as 'system_state == SYSTEM_BOOTING', but the latter
is clearer.
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
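
The equivalence holds only because SYSTEM_BOOTING is the sole state ordered before SYSTEM_RUNNING. The sketch below uses an enum that mirrors the kernel's state list of that era as an assumption; the two helper names are hypothetical.

```c
#include <stdbool.h>

/* Assumed ordering of the system states, with BOOTING first. */
enum system_states {
    SYSTEM_BOOTING,
    SYSTEM_RUNNING,
    SYSTEM_HALT,
    SYSTEM_POWER_OFF,
    SYSTEM_RESTART
};

/* Old form: relies on the reader knowing the enum ordering. */
bool is_booting_by_order(enum system_states s)
{
    return s < SYSTEM_RUNNING;
}

/* New form: states the intent directly. */
bool is_booting_by_name(enum system_states s)
{
    return s == SYSTEM_BOOTING;
}
```

If a new state were ever inserted before SYSTEM_RUNNING, the two forms would stop agreeing, which is one more reason to prefer the explicit comparison.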

powerpc: Restore dbcr0 on user space exit · 13d543cd

Committed by Bharat Bhushan on May 22, 2013

On BookE, (Branch Taken + Single Step) is the same as Branch Taken
on BookS, and in Linux we simulate the BookS behavior on BookE as well.
When doing so, in Branch Taken handling we want to set DBCR0_IC, but
we update current->thread.dbcr0 and not DBCR0.

On 64bit, current->thread.dbcr0 (and the other debug registers)
is synchronized ONLY in the context switch flow. So if, after handling
Branch Taken in the debug exception, we return to user space
without a context switch, the single-stepping change (DBCR0_ICMP)
is not written to the h/w DBCR0 and the Instruction Complete exception
does not happen.

This makes ptrace work reliably on BookE PowerPC.

lmbench latency test (lat_syscall) results (they vary a little
on each run):

1) ./lat_syscall <action> /dev/shm/uImage

action:  Open    read    write   stat    fstat   null
Before:  3.8618  0.2017  0.2851  1.6789  0.2256  0.0856
After:   3.8580  0.2017  0.2851  1.6955  0.2255  0.0856

2) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage

action:  Open    read    write   stat    fstat   null
Before:  4.1388  0.2238  0.3066  1.7106  0.2256  0.0856
After:   4.1413  0.2236  0.3062  1.7107  0.2256  0.0856

[ Slightly modified to avoid extra branch in the fast path
  on Book3S and fix build on all non-BookE 64-bit -- BenH
]
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Debug control and status registers are 32bit · d8899bb2

Committed by Bharat Bhushan on May 22, 2013

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/vfio: Enable on pSeries platform · 5b25199e

Committed by Alexey Kardashevskiy on May 21, 2013

This enables VFIO on the pSeries platform, allowing user space
programs to access PCI devices directly.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/vfio: Implement IOMMU driver for VFIO · 5ffd229c

Committed by Alexey Kardashevskiy on May 21, 2013

VFIO implements platform independent stuff such as
a PCI driver, BAR access (via read/write on a file descriptor
or direct mapping when possible) and IRQ signaling.

The platform dependent part includes IOMMU initialization
and handling.  This implements an IOMMU driver for VFIO
which does mapping/unmapping pages for the guest IO and
provides information about DMA window (required by a POWER
guest).

Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/vfio: Enable on PowerNV platform · 4e13c1ac

Committed by Alexey Kardashevskiy on May 21, 2013

This initializes IOMMU groups based on the IOMMU configuration
discovered during the PCI scan on POWERNV (POWER non virtualized)
platform.  The IOMMU groups are to be used later by the VFIO driver,
which is used for PCI pass through.

It also implements an API for mapping/unmapping pages for
guest PCI drivers and providing DMA window properties.
This API is going to be used later by QEMU-VFIO to handle
h_put_tce hypercalls from the KVM guest.

The iommu_put_tce_user_mode() does only a single page mapping
as an API for adding many mappings at once is going to be
added later.

Although this driver has been tested only on the POWERNV
platform, it should work on any platform which supports
TCE tables.  As the h_put_tce hypercall is received by the host
kernel and processed by QEMU (which involves calling
the host kernel again), performance is not the best:
circa 220MB/s on a 10Gb ethernet network.

To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
option and configure VFIO as required.

Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

openanolis / cloud-kernel · synced successfully more than 1 year ago
