提交 · f689b742f217b2ffe7925f8a6521b208ee995309 · openanolis / cloud-kernel

05 1月, 2016 1 次提交

cxl: Fix DSI misses when the context owning task exits · 7b8ad495

由 Vaibhav Jain 提交于 11月 24, 2015

Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we
store the pid of the current task_struct and use it to get pointer to
the mm_struct of the process, while processing page or segment faults
from the capi card. However this causes issues when the thread that had
originally issued the start-work ioctl exits in which case the stored
pid is no more valid and the cxl driver is unable to handle faults as
the mm_struct corresponding to process is no more accessible.

This patch fixes this issue by using the mm_struct of the next alive
task in the thread group. This is done by iterating over all the tasks
in the thread group starting from thread group leader and calling
get_task_mm on each one of them. When a valid mm_struct is obtained the
pid of the associated task is stored in the context replacing the
exiting one for handling future faults.

The patch introduces a new function named get_mem_context that checks if
the current task pointed to by ctx->pid is dead? If yes it performs the
steps described above. Also a new variable cxl_context.glpid is
introduced which stores the pid of the thread group leader associated
with the context owning task.
Reported-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reported-by: NFrank Haverkamp <HAVERKAM@de.ibm.com>
Suggested-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NVaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7b8ad495

24 11月, 2015 1 次提交

cxl: Fix possible idr warning when contexts are released · 1b5df59e

由 Vaibhav Jain 提交于 11月 16, 2015

An idr warning is reported when a context is release after the capi card
is unbound from the cxl driver via sysfs. Below are the steps to
reproduce:

1. Create multiple afu contexts in an user-space application using libcxl.
2. Unbind capi card from cxl using command of form
   echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind
3. Exit/kill the application owning afu contexts.

After above steps a warning message is usually seen in the kernel logs
of the form "idr_remove called for id=<context-id> which is not
allocated."

This is caused by the function cxl_release_afu which destroys the
contexts_idr table. So when a context is release no entry for context pe
is found in the contexts_idr table and idr code prints this warning.

This patch fixes this issue by increasing & decreasing the ref-count on
the afu device when a context is initialized or when its freed
respectively. This prevents the afu from being released until all the
afu contexts have been released. The patch introduces two new functions
namely cxl_afu_get/put that manage the ref-count on the afu device.

Also the patch removes code inside cxl_dev_context_init that increases ref
on the afu device as its guaranteed to be alive during this function.
Reported-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NVaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

1b5df59e

01 10月, 2015 1 次提交

cxl: fix leak of IRQ names in cxl_free_afu_irqs() · 8dde152e

由 Andrew Donnellan 提交于 9月 30, 2015

cxl_free_afu_irqs() doesn't free IRQ names when it releases an AFU's IRQ
ranges. The userspace API equivalent in afu_release_irqs() calls
afu_irq_name_free() to release the IRQ names.

Call afu_irq_name_free() in cxl_free_afu_irqs() to release the IRQ names.
Make afu_irq_name_free() non-static to allow this.
Reported-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
Fixes: 6f7f0b3d ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
Reviewed-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8dde152e

30 8月, 2015 2 次提交

cxl: Fix force unmapping mmaps of contexts allocated through the kernel api · 55e07668

由 Ian Munsie 提交于 8月 27, 2015

The cxl user api uses the address_space associated with the file when we
need to force unmap all cxl mmap regions (e.g. on eeh, driver detach,
etc). Currently, contexts allocated through the kernel api do not do
this and instead skip the mmap invalidation, potentially allowing them
to poke at the hardware after such an event, which may cause all sorts
of trouble.

This patch allocates an address_space for cxl contexts allocated through
the kernel api so that the same invalidate path will for these contexts
as well. We don't use the anonymous inode's address_space, as doing so
could invalidate any mmaps of completely unrelated drivers using
anonymous file descriptors.

This patch also introduces a kernelapi flag, so we know when freeing the
context if the address_space was allocated by us and needs to be freed.
Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

55e07668

cxl: Fix + cleanup error paths in cxl_dev_context_init · af2a50bb

由 Ian Munsie 提交于 8月 27, 2015

If the cxl_context_alloc() call fails, we return immediately without
releasing the reference on the AFU device, allowing it to leak.

This patch switches to using goto style error handling so that the
device is released in common code for both error paths, and will also
simplify things if we add additional initialisation in this function in
the future.
Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

af2a50bb

20 8月, 2015 1 次提交

cxl: Allow release of contexts which have been OPENED but not STARTED · 7c26b9cf

由 Andrew Donnellan 提交于 8月 19, 2015

If we open a context but do not start it (either because we do not attempt
to start it, or because it fails to start for some reason), we are left
with a context in state OPENED. Previously, cxl_release_context() only
allowed releasing contexts in state CLOSED, so attempting to release an
OPENED context would fail.

In particular, this bug causes available contexts to run out after some EEH
failures, where drivers attempt to release contexts that have failed to
start.

Allow releasing contexts in any state with a value lower than STARTED, i.e.
OPENED or CLOSED (we can't release a STARTED context as it's currently
using the hardware, and we assume that contexts in any new states which may
be added in future with a value higher than STARTED are also unsafe to
release).

Cc: stable@vger.kernel.org
Fixes: 6f7f0b3d ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: NDaniel Axtens <dja@axtens.net>
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7c26b9cf

14 8月, 2015 1 次提交

cxl: Allow the kernel to trust that an image won't change on PERST. · 13e68d8b

由 Daniel Axtens 提交于 8月 14, 2015

Provide a kernel API and a sysfs entry which allow a user to specify
that when a card is PERSTed, it's image will stay the same, allowing
it to participate in EEH.

cxl_reset is used to reflash the card. In that case, we cannot safely
assert that the image will not change. Therefore, disallow cxl_reset
if the flag is set.
Signed-off-by: NDaniel Axtens <dja@axtens.net>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

13e68d8b

07 7月, 2015 1 次提交

cxl: Fix refcounting in kernel API · 3f8dc44d

由 Michael Neuling 提交于 7月 07, 2015

Currently the kernel API AFU dev refcounting is done on context start and stop.
This patch moves this refcounting to context init and release, bringing it
inline with how the userspace API does it.

Without this we've seen the refcounting on the AFU get out of whack between the
user and kernel API usage.  This causes the AFU structures to be freed when
they are actually still in use.

This fixes some kref warnings we've been seeing and spurious ErrIVTE IRQs.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3f8dc44d

03 6月, 2015 1 次提交

cxl: Add AFU virtual PHB and kernel API · 6f7f0b3d

由 Michael Neuling 提交于 5月 27, 2015

This patch does two things.

Firstly it presents the Accelerator Function Unit (AFUs) behind the POWER
Service Layer (PSL) as PCI devices on a virtual PCI Host Bridge (vPHB). This
in in addition to the PSL being a PCI device itself.

As part of the Coherent Accelerator Interface Architecture (CAIA) AFUs can
provide an AFU configuration. This AFU configuration recored is architected to
be the same as a PCI config space.

This patch sets discovers the AFU configuration records, provides AFU config
space read/write functions to these configuration records. It then enumerates
the PCI bus. It also hooks in PCI ops where appropriate. It also destroys the
vPHB when the physical card is removed.

Secondly, it add an in kernel API for AFU to use CXL. AFUs must present a
driver that firstly binds as a PCI device. This PCI device can then be using
to do CXL specific operations (that can't sit in the PCI ops) using this API.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6f7f0b3d

openanolis / cloud-kernel 大约 2 年 前同步成功

openanolis / cloud-kernel
大约 2 年前同步成功