Commits · 828c5711cc7911d4651ebfe0cef369d465fc8f12 · openeuler / raspberrypi-kernel

27 Dec, 2019 1 commit

cxl: Wrap iterations over afu slices inside 'afu_list_lock' · 828c5711

Authored by Vaibhav Jain on Jan 29, 2019

commit edeb304f659792fb5bab90d7d6f3408b4c7301fb upstream.

Within the cxl module, iteration over the array 'adapter->afu' may be racy
at a few points, as the array might be read during an EEH while its
contents are being set to NULL because the driver is being unloaded or
unbound from the adapter. This might result in a NULL pointer to
'struct afu' being dereferenced during an EEH, thereby causing a kernel oops.

This patch fixes this by making sure that all access to the array
'adapter->afu' is wrapped within the context of spin-lock
'adapter->afu_list_lock'.
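
The locking pattern described above can be sketched in userspace as
follows. This is only an illustration, not the driver code: a pthread
mutex stands in for the kernel spin-lock, and the struct layouts,
MAX_SLICES and both function names are assumptions made for the example.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_SLICES 4    /* assumed limit, for illustration only */

/* Hypothetical stand-ins for the driver's structures. */
struct afu { int id; };

struct adapter {
    pthread_mutex_t afu_list_lock;  /* plays the role of the spin-lock */
    struct afu *afu[MAX_SLICES];
    int slices;
};

/* Reader side (e.g. error handling): visit every slice under the lock. */
int count_live_slices(struct adapter *a)
{
    int live = 0;

    pthread_mutex_lock(&a->afu_list_lock);
    for (int i = 0; i < a->slices; i++) {
        if (a->afu[i] != NULL)      /* a slot may already be torn down */
            live++;
    }
    pthread_mutex_unlock(&a->afu_list_lock);
    return live;
}

/* Writer side (e.g. unbind): clear a slot under the same lock. */
void remove_slice(struct adapter *a, int i)
{
    pthread_mutex_lock(&a->afu_list_lock);
    a->afu[i] = NULL;
    pthread_mutex_unlock(&a->afu_list_lock);
}
```

Because both sides take the same lock, a reader either sees a slot before
it is cleared or sees NULL, and never a pointer that goes away while it
is being used inside the locked region.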

Fixes: 9e8df8a2 ("cxl: EEH support")
Cc: stable@vger.kernel.org # v4.3+
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

16 Jul, 2018 1 commit

powerpc/64s: Remove POWER9 DD1 support · 2bf1071a

Authored by Nicholas Piggin on Jul 05, 2018

POWER9 DD1 was never a product. It is no longer supported by upstream
firmware, and it is not effectively supported in Linux due to lack of
testing.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
[mpe: Remove arch_make_huge_pte() entirely]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

02 Jul, 2018 4 commits

cxl: Remove abandonned capi support for the Mellanox CX4, final cleanup · f3988ca4

Authored by Frederic Barrat on Jun 28, 2018

Remove a few XSL/CX4 oddities which are no longer needed. A simple
revert of the initial commits was not possible (or not worth it) due
to the history of the code.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Revert "cxl: Add cxl_slot_is_supported API" · 322dc4af

Authored by Frederic Barrat on Jun 28, 2018

Remove abandoned capi support for the Mellanox CX4.

This reverts commit 4e56f858.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Revert "cxl: Add support for using the kernel API with a real PHB" · c8d43cf0

Authored by Alastair D'Silva on Jun 28, 2018

Remove abandoned capi support for the Mellanox CX4.

This reverts commit 317f5ef1.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Revert "cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cards" · 29fea8aa

Authored by Alastair D'Silva on Jun 28, 2018

Remove abandoned capi support for the Mellanox CX4.

This reverts commit b0b5e591.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

01 Jun, 2018 1 commit

cxl: Configure PSL to not use APC virtual machines · 9a6d2022

Authored by Vaibhav Jain on Apr 17, 2018

APC virtual machines aren't used on POWER-9 chips and are already
disabled in on-chip CAPP. They also need to be disabled on the PSL via
'PSL Data Send Control Register' by setting bit(47). This forces the
PSL to send commands to CAPP with queue.id == 0.
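
As a rough illustration of what "setting bit(47)" means numerically, the
sketch below assumes IBM-style MSB-0 bit numbering on a 64-bit register
(bit 0 is the most significant bit). The commit message does not state
the numbering convention, so treat that as an assumption; msb0_bit() and
disable_apc_vms() are hypothetical names, not driver functions.

```c
#include <stdint.h>

/*
 * Assumed IBM (MSB-0) numbering on a 64-bit register: bit 0 is the
 * most significant bit, bit 63 the least significant one.
 */
uint64_t msb0_bit(unsigned int bit)
{
    return UINT64_C(1) << (63 - bit);
}

/* Hypothetical helper: return the register value with bit 47 set. */
uint64_t disable_apc_vms(uint64_t dsnctl)
{
    return dsnctl | msb0_bit(47);
}
```

Under that assumption, bit 47 corresponds to the mask 0x10000, and OR-ing
it in leaves all other bits of the register value unchanged.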

Fixes: 56328743 ("cxl: Add support for POWER9 DD2")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

15 May, 2018 2 commits

cxl: Report the tunneled operations status · 497a0790

Authored by Philippe Bergheaud on May 14, 2018

Failure to synchronize the tunneled operations does not prevent
the initialization of the cxl card. This patch reports the tunneled
operations status via /sys.

Signed-off-by: Philippe Bergheaud <felix@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Set the PBCQ Tunnel BAR register when enabling capi mode · 401dca8c

Authored by Philippe Bergheaud on May 14, 2018

Skiboot used to set the default Tunnel BAR register value when capi
mode was enabled. This approach was ok for the cxl driver, but
prevented other drivers from choosing different values.

Skiboot versions > 5.11 will not set the default value any longer.
This patch modifies the cxl driver to set/reset the Tunnel BAR
register when entering/exiting the cxl mode, with
pnv_pci_set_tunnel_bar().

That should work with old skiboot (since we are re-writing the value
already set) and new skiboot.

mpe: The tunnel support was only merged into Linux recently, in commit
d6a90bb8 ("powerpc/powernv: Enable tunneled operations")
(v4.17-rc1), so with new skiboot, kernels between that commit and this
one will not work correctly.

Fixes: d6a90bb8 ("powerpc/powernv: Enable tunneled operations")
Signed-off-by: Philippe Bergheaud <felix@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

14 Mar, 2018 1 commit

cxl: Fix timebase synchronization status on P9 · c2be663d

Authored by Christophe Lombard on Feb 20, 2018

The PSL Timebase register is updated by the PSL to maintain the
timebase.

On P9, the Timebase value is only provided by the CAPP as received the
last time a timebase request was performed.

The timebase requests are initiated through the adapter configuration
or application registers.

The specific sysfs entry "/sys/class/cxl/cardxx/psl_timebase_synced"
is now dynamically updated according to the content of the PSL Timebase
register.

Fixes: f24be42a ("cxl: Add psl9 specific code")
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

13 Mar, 2018 4 commits

cxl: read PHB indications from the device tree · 9dbcbfa1

Authored by Philippe Bergheaud on Mar 02, 2018

Configure the P9 XSL_DSNCTL register with PHB indications found
in the device tree, or else use legacy hard-coded values.

Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Check if PSL data-cache is available before issue flush request · 94322ed8

Authored by Vaibhav Jain on Feb 15, 2018

PSL9D doesn't have a data-cache that needs to be flushed before
resetting the card. However when cxl tries to flush data-cache on such
a card, it times-out as PSL_Control register never indicates flush
operation complete due to missing data-cache. This is usually
indicated in the kernel logs with this message:

"WARNING: cache flush timed out"

To fix this the patch checks PSL_Debug register CDC-Field(BIT:27)
which indicates the absence of a data-cache and sets a flag
'no_data_cache' in 'struct cxl_native' to indicate this. When
cxl_data_cache_flush() is called it checks the flag and if set bails
out early without requesting a data-cache flush operation to the PSL.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Remove function write_timebase_ctrl_psl9() for PSL9 · 02b63b42

Authored by Vaibhav Jain on Feb 15, 2018

For PSL9 the contents of the PSL_TB_CTLSTAT register have changed and
the whole register is now read-only. Hence we don't need an sl_ops
implementation of 'write_timebase_ctrl' to populate this register
for PSL9.

Hence this patch removes function write_timebase_ctrl_psl9() and its
references from the code.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Enable NORST bit in PSL_DEBUG register for PSL9 · 03ebb419

Authored by Vaibhav Jain on Feb 09, 2018

We enable the NORST bit by default for debug afu images to prevent
reset of AFU trace-data on a PCI link drop. For production AFU images
this bit is always ignored and PSL gets reconfigured anyways thereby
resetting the trace data. So setting this bit for non-debug images
doesn't have any impact.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

24 Jan, 2018 1 commit

cxl: Remove support for "Processing accelerators" class · 741ddae6

Authored by Frederic Barrat on Jan 23, 2018

The cxl driver currently declares in its table of supported PCI
devices the class "Processing accelerators". Therefore it may be
called to probe for opencapi devices, which generates errors, as the
config space of a cxl device is not compatible with opencapi.

So remove support for the generic class, as we now have (at least) two
drivers for devices of the same class. Most cxl devices are FPGAs with
a PSL which will show a known device ID of 0x477. Other devices are
really supported by the cxlflash driver and are already listed in the
table. So removing the class is expected to go unnoticed.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

23 Nov, 2017 1 commit

cxl: Check if vphb exists before iterating over AFU devices · 12841f87

Authored by Vaibhav Jain on Nov 23, 2017

During an eeh a kernel-oops is reported if no vPHB is allocated to the
AFU. This happens as during AFU init, an error in creation of vPHB is
a non-fatal error. Hence afu->phb should always be checked for NULL
before iterating over it for the virtual AFU pci devices.

This patch fixes the kernel-oops by adding a NULL pointer check for
afu->phb before it is dereferenced.
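
A minimal userspace sketch of the guard described above, under the
assumption that the virtual devices hang off the vPHB as a simple linked
list. The struct layouts and count_vphb_devices() are hypothetical and
only illustrate the "check before iterating" idea, not the driver's data
structures.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the vPHB and its devices. */
struct vdev { struct vdev *next; };
struct vphb { struct vdev *devices; };
struct afu  { struct vphb *phb; };  /* NULL if vPHB creation failed */

/* Count the virtual devices of an AFU, tolerating a missing vPHB. */
int count_vphb_devices(const struct afu *afu)
{
    int n = 0;

    if (afu->phb == NULL)           /* nothing to iterate over */
        return 0;

    for (const struct vdev *d = afu->phb->devices; d != NULL; d = d->next)
        n++;
    return n;
}
```

Without the early return, the loop initialiser would dereference a NULL
afu->phb, which is the kind of access the commit guards against.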

Fixes: 9e8df8a2 ("cxl: EEH support")
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

06 Nov, 2017 1 commit

cxl: Rework the implementation of cxl_stop_trace_psl9() · cbb55eeb

Authored by Vaibhav Jain on Oct 11, 2017

Presently the PSL9 specific cxl_stop_trace_psl9() only stops the RX0
traces on the CXL adapter when a PSL error irq is triggered. The patch
updates the function to stop all the trace arrays and move them to
the FIN state. The implementation issues the mmio to the TRACECFG
register to stop a trace array only if it is not already in the FIN
state. This prevents trace data from being reset when multiple stop
mmios are issued for a single trace array.

Also the patch does some refactoring of existing cxl_stop_trace_psl9()
and cxl_stop_trace_psl8() functions by moving them to 'pci.c' from
'debugfs.c' file and marking them as static.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

13 Oct, 2017 1 commit

cxl: Dump PSL_FIR register on PSL9 error irq · 990f19ae

Authored by Vaibhav Jain on Oct 11, 2017

For PSL9 currently we aren't dumping the PSL FIR register when a
PSL error interrupt is triggered. Contents of this register are useful
in debugging AFU issues.

This patch fixes the issue by adding a new service_layer_ops callback
cxl_native_err_irq_dump_regs_psl9() to dump the PSL_FIR registers on a
PSL error interrupt thereby bringing the behavior in line with PSL on
POWER-8. Also the existing service_layer_ops callback
for PSL8 has been renamed to cxl_native_err_irq_dump_regs_psl8().

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

06 Oct, 2017 1 commit

cxl: Add support for POWER9 DD2 · 56328743

Authored by Christophe Lombard on Sep 08, 2017

The PSL initialization sequence has been updated to DD2.
This patch adapts to the changes, retaining compatibility with DD1.
The patch includes some changes to DD1 fix-ups as well.

Tests performed on some of the old/new hardware.

The function is_page_fault(), for POWER9, lists the Translation Checkout
Responses where the page fault will be handled by copro_handle_mm_fault().
This list is too restrictive and not necessary.

This patch removes this restriction and all page faults, whatever the
reason, will be handled. In this case, the interruption is always
acknowledged.

The following features will be added soon:
- phb reset when switching to capi mode.
- cxllib update to support new functions.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

14 Sep, 2017 1 commit

mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4

Authored by Michal Hocko on Sep 13, 2017

GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  Its
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation.  As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing callers provide a way to reclaim the allocated memory.  So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification.  I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring.  This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL.  Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect short-term
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This was brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
seems to be a heuristic without any measured advantage for most (if not
all) its current users.  The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers.  So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
  Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

03 Jul, 2017 1 commit

cxl: Export library to support IBM XSL · 3ced8d73

Authored by Christophe Lombard on Jun 22, 2017

This patch exports an in-kernel 'library' API which can be called by
other drivers to help interact with an IBM XSL on a POWER9 system.

The XSL (Translation Service Layer) is a stripped down version of the
PSL (Power Service Layer) used in some cards such as the Mellanox CX5.
Like the PSL, it implements the CAIA architecture, but has a number
of differences, mostly in its implementation-dependent registers.

The XSL also uses a special DMA cxl mode, which uses a slightly
different init sequence for the CAPP and PHB.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

23 Jun, 2017 1 commit

cxl: Fixes for Coherent Accelerator Interface Architecture 2.0 · 797625de

Authored by Christophe Lombard on Jun 13, 2017

A previous set of patches, "cxl: Add support for Coherent Accelerator
Interface Architecture 2.0", introduced new support for the CAPI
cards. These patches were tested in a simulation environment, and
quite a few of them were also tested on real hardware.

This patch brings new fixes after a series of tests carried out on new
equipment:
  - Add POWER9 definition.
  - Re-enable any masked interrupts when the AFU is not activated
    after resetting the AFU.
  - Remove the api cxl_is_psl8/9 which is no longer useful.
  - Do not dump CAPI1 registers.
  - Rewrite cxl_is_page_fault() function.
  - Do not register slb callback on P9.

Fixes: f24be42a ("cxl: Add psl9 specific code")
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

02 May, 2017 2 commits

cxl: Route eeh events to all drivers in cxl_pci_error_detected() · 4f58f0bf

Authored by Vaibhav Jain on Apr 27, 2017

Fix a boundary condition where, in some cases, an eeh event that results
in a card reset isn't passed on to a driver attached to the virtual PCI
device associated with a slice. This happens when a slice-attached
device driver returns a value other than
PCI_ERS_RESULT_NEED_RESET from the eeh error_detected() callback. This
results in an early return from cxl_pci_error_detected(), and
other drivers attached to other AFUs on the card won't be notified.

The patch fixes this by making sure that all slice attached
device-drivers are notified and the return values from
error_detected() callback are aggregated in a scheme where request for
'disconnect' trumps all and 'none' trumps 'need_reset'.
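
The aggregation rule described above can be sketched as a small pure
function. The enum below is a hypothetical three-value stand-in (the
kernel's pci_ers_result_t has more values), and starting the fold from
'need_reset' is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical result codes; the kernel's pci_ers_result_t has more. */
enum ers_result { ERS_NEED_RESET, ERS_NONE, ERS_DISCONNECT };

/* Merge one more driver answer into the running result. */
enum ers_result ers_merge(enum ers_result acc, enum ers_result r)
{
    if (acc == ERS_DISCONNECT || r == ERS_DISCONNECT)
        return ERS_DISCONNECT;      /* 'disconnect' trumps all */
    if (acc == ERS_NONE || r == ERS_NONE)
        return ERS_NONE;            /* 'none' trumps 'need_reset' */
    return ERS_NEED_RESET;
}

/* Visit every answer; never return early. */
enum ers_result ers_aggregate(const enum ers_result *answers, size_t n)
{
    enum ers_result acc = ERS_NEED_RESET;   /* assumed starting value */

    for (size_t i = 0; i < n; i++)
        acc = ers_merge(acc, answers[i]);
    return acc;
}
```

The point of the fold is that every slice-attached driver is consulted,
and the final value depends only on the set of answers, not on the order
in which the slices are visited.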

Fixes: 9e8df8a2 ("cxl: EEH support")
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Force context lock during EEH flow · ea9a26d1

Authored by Vaibhav Jain on Apr 27, 2017

During an eeh event, when the cxl card is fenced and the card sysfs attr
perst_reloads_same_image is set, the following warning message is seen in
the kernel logs:

  Adapter context unlocked with 0 active contexts
  ------------[ cut here ]------------
  WARNING: CPU: 12 PID: 627 at
  ../drivers/misc/cxl/main.c:325 cxl_adapter_context_unlock+0x60/0x80 [cxl]

Even though this warning is harmless, it clutters the kernel log
during an eeh event. This warning is triggered as the EEH callback
cxl_pci_error_detected doesn't obtain a context-lock before forcibly
detaching all active context and when context-lock is released during
call to cxl_configure_adapter from cxl_pci_slot_reset, a warning in
cxl_adapter_context_unlock is triggered.

To fix this warning, we acquire the adapter context-lock via
cxl_adapter_context_lock() in the eeh callback
cxl_pci_error_detected() once all the virtual AFU PHBs are notified
and their contexts detached. The context-lock is released in
cxl_pci_slot_reset() after the adapter is successfully reconfigured
and before we call the slot_reset callback on slice attached
device-drivers.

Fixes: 70b565bb ("cxl: Prevent adapter reset if an active context exists")
Cc: stable@vger.kernel.org # v4.9+
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Tested-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

19 Apr, 2017 1 commit

cxl: Enable PCI device IDs for future IBM CXL adapters · 41e20d95

Authored by Matthew R. Ochs on Mar 24, 2017

Add support for future IBM Coherent Accelerator (CXL) devices
with IDs of 0x0623 and 0x0628.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

13 Apr, 2017 5 commits

cxl: Add psl9 specific code · f24be42a

Authored by Christophe Lombard on Apr 12, 2017

The new Coherent Accelerator Interface Architecture, level 2, for the
IBM POWER9 brings new content and features:
- POWER9 Service Layer
- Registers
- Radix mode
- Process element entry
- Dedicated-Shared Process Programming Model
- Translation Fault Handling
- CAPP
- Memory Context ID
    If a valid mm_struct is found the memory context id is used for each
    transaction associated with the process handle. The PSL uses the
    context ID to find the corresponding process element.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
[mpe: Fixup comment formatting, unsplit long strings]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Isolate few psl8 specific calls · abd1d99b

Authored by Christophe Lombard on Apr 07, 2017

Point out the specific Coherent Accelerator Interface Architecture,
level 1, registers.
Code and functions specific to PSL8 (CAIA1) must be framed.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
[mpe: Don't split long strings, it makes them hard to grep for]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Rename some psl8 specific functions · 64663f37

Authored by Christophe Lombard on Apr 07, 2017

Rename a few functions, changing the '_psl' suffix to '_psl8', to make
clear that the implementation is psl8 specific.
Those functions will have an equivalent implementation for the psl9 in
a later patch.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Update implementation service layer · bdd2e715

Authored by Christophe Lombard on Apr 07, 2017

The service layer API (in cxl.h) lists some low-level functions whose
implementation is different on PSL8, PSL9 and XSL:
- Init implementation for the adapter and the afu.
- Invalidate TLB/SLB.
- Attach process for dedicated/directed models.
- Handle psl interrupts.
- Debug registers for the adapter and the afu.
- Traces.
Each environment implements its own functions, and the common code uses
them through function pointers, defined in cxl_service_layer_ops.
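
The function-pointer arrangement described above can be sketched as
follows. This is a much-reduced illustration: struct service_layer_ops,
its two members, and all function names here are hypothetical and do not
reproduce the real cxl_service_layer_ops layout.

```c
/* Hypothetical, much-reduced version of the ops-table idea. */
struct adapter;

struct service_layer_ops {
    int  (*adapter_init)(struct adapter *a);
    void (*invalidate_all)(struct adapter *a);
};

struct adapter {
    const struct service_layer_ops *sl_ops; /* chosen once at probe time */
    int initialised;                        /* records which init ran */
    int invalidations;
};

/* Environment-specific implementations. */
static int psl8_init(struct adapter *a) { a->initialised = 8; return 0; }
static int psl9_init(struct adapter *a) { a->initialised = 9; return 0; }
static void common_invalidate(struct adapter *a) { a->invalidations++; }

const struct service_layer_ops psl8_ops = { psl8_init, common_invalidate };
const struct service_layer_ops psl9_ops = { psl9_init, common_invalidate };

/* Common code: no knowledge of which service layer is underneath. */
int bring_up(struct adapter *a)
{
    int rc = a->sl_ops->adapter_init(a);

    if (rc == 0)
        a->sl_ops->invalidate_all(a);
    return rc;
}
```

The common path only ever goes through a->sl_ops, so supporting another
environment means providing another table rather than adding branches to
the shared code.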

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Read vsec perst load image · aba81433

Authored by Christophe Lombard on Apr 07, 2017

This bit is used to cause a flash image load for a programmable
CAIA-compliant implementation. If this bit is set to '0', a power
cycle of the adapter is required to load a programmable
CAIA-compliant implementation from flash.
This field will be used by the following patches.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

20 Mar, 2017 1 commit

cxl: Route eeh events to all slices for pci_channel_io_perm_failure state · 07f5ab60

Authored by Vaibhav Jain on Feb 23, 2017

Fix a boundary condition where, in some cases, an eeh event with state ==
pci_channel_io_perm_failure won't be passed on to a driver attached to
the virtual PCI device associated with a slice. This will happen in case
the slice just before (n-1) doesn't have any vPHB bus associated with
it, that results in an early return from cxl_pci_error_detected()
callback.

With state == pci_channel_io_perm_failure, the adapter will be removed
irrespective of the return value of cxl_vphb_error_detected(). So we now
always return PCI_ERS_RESULT_DISCONNECTED for this case i.e even if
the AFU isn't using a vPHB (currently returns PCI_ERS_RESULT_NONE).

Fixes: e4f5fc00 ("cxl: Do not create vPHB if there are no AFU configuration records")
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

21 Feb, 2017 1 commit

cxl: fix nested locking hang during EEH hotplug · 171ed0fc

Authored by Andrew Donnellan on Feb 06, 2017

Commit 14a3ae34 ("cxl: Prevent read/write to AFU config space while AFU
not configured") introduced a rwsem to fix an invalid memory access that
occurred when someone attempts to access the config space of an AFU on a
vPHB whilst the AFU is deconfigured, such as during EEH recovery.

It turns out that it's possible to run into a nested locking issue when EEH
recovery fails and a full device hotplug is required.
cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on
configured_rwsem. When EEH recovery fails, the EEH code calls
pci_hp_remove_devices() to remove the device, which in turn calls
cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries
to grab the writer lock that's already held.

Standard rwsem semantics don't express what we really want to do here and
don't allow for nested locking. Fix this by replacing the rwsem with an
atomic_t which we can control more finely. Allow the AFU to be locked
multiple times so long as there are no readers.
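
One possible way to express "lockable multiple times so long as there
are no readers" with a single atomic counter is sketched below in C11.
The encoding (positive means readers, zero means idle, negative means
lock depth), the try-lock style, and all names are assumptions chosen
for illustration; this is not the code the commit adds.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical encoding: >0 readers active, 0 idle, <0 lock depth. */
struct afu_state { atomic_int v; };

/* Reader: succeed only while the state is not locked. */
bool afu_reader_get(struct afu_state *s)
{
    int cur = atomic_load(&s->v);

    while (cur >= 0) {
        if (atomic_compare_exchange_weak(&s->v, &cur, cur + 1))
            return true;
        /* cur was reloaded by the failed exchange; re-check and retry */
    }
    return false;
}

void afu_reader_put(struct afu_state *s)
{
    atomic_fetch_sub(&s->v, 1);
}

/* Locker: may nest, but fails while any reader is active. */
bool afu_try_lock(struct afu_state *s)
{
    int cur = atomic_load(&s->v);

    while (cur <= 0) {
        if (atomic_compare_exchange_weak(&s->v, &cur, cur - 1))
            return true;
    }
    return false;
}

/* Undo one level of locking. */
void afu_unlock(struct afu_state *s)
{
    atomic_fetch_add(&s->v, 1);
}
```

A try-lock variant keeps the sketch short; a caller that must not fail
would have to wait for the readers to drain before retrying. Unlike a
plain rwsem, a second lock attempt from the same path simply deepens the
lock instead of blocking on itself.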

Fixes: 14a3ae34 ("cxl: Prevent read/write to AFU config space while AFU not configured")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

25 Jan, 2017 2 commits

cxl: Prevent read/write to AFU config space while AFU not configured · 14a3ae34

Authored by Andrew Donnellan on Dec 09, 2016

During EEH recovery, we deconfigure all AFUs whilst leaving the
corresponding vPHB and virtual PCI device in place.

If something attempts to interact with the AFU's PCI config space (e.g.
running lspci) after the AFU has been deconfigured and before it's
reconfigured, cxl_pcie_{read,write}_config() will read invalid values from
the deconfigured struct cxl_afu and proceed to Oops when they try to
dereference pointers that have been set to NULL during deconfiguration.

Add a rwsem to struct cxl_afu so we can prevent interaction with config
space while the AFU is deconfigured.

Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Suggested-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cxl: Force psl data-cache flush during device shutdown · d7b1946c

Authored by Vaibhav Jain on Jan 04, 2017

This change adds a forced PSL data-cache flush to the device shutdown
callback. This should reduce the possibility of the PSL holding a dirty
cache line while the CAPP is being reinitialized, which may result in
a UE [load/store] machine check error.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

18 Nov, 2016 1 commit

cxl: Fix error handling in _cxl_pci_associate_default_context() · bb81733d

Authored by Christophe Jaillet on Oct 30, 2016

'cxl_dev_context_init()' returns an error pointer in case of error, not
NULL. So test it with IS_ERR.
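
The difference between a NULL test and an IS_ERR test can be shown with
a userspace imitation of the kernel's error-pointer idiom. ERR_PTR,
PTR_ERR and IS_ERR below are simplified re-implementations written for
this sketch, and context_init() and associate_default_context() are
hypothetical stand-ins, not the driver functions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers of the same names. */
void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
bool IS_ERR(const void *ptr)
{
    /* Error codes live in the last MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical initialiser that reports failure as an error pointer. */
struct context { int id; };

struct context *context_init(struct context *storage, bool fail)
{
    if (fail)
        return ERR_PTR(-ENOMEM);
    storage->id = 1;
    return storage;
}

/* Returns 0 on success or a negative errno value. */
int associate_default_context(struct context *storage, bool fail)
{
    struct context *ctx = context_init(storage, fail);

    if (IS_ERR(ctx))            /* a "!ctx" test would miss this failure */
        return (int)PTR_ERR(ctx);
    return 0;
}
```

An error pointer is never NULL, so a NULL check lets the failure through
and the caller goes on to use a bogus pointer; IS_ERR catches it and the
errno value can be recovered with PTR_ERR.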

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

19 Oct, 2016 1 commit

cxl: Prevent adapter reset if an active context exists · 70b565bb

Authored by Vaibhav Jain on Oct 14, 2016

This patch prevents resetting the cxl adapter via sysfs in presence of
one or more active cxl_context on it. This protects against an
unrecoverable error caused by PSL owning a dirty cache line even after
reset and host tries to touch the same cache line. In case a force reset
of the card is required irrespective of any active contexts, the int
value -1 can be stored in the 'reset' sysfs attribute of the card.

The patch introduces a new atomic_t member named contexts_num inside
struct cxl that holds the number of active contexts attached to the
card, which is checked against '0' before proceeding with the reset. To
protect against a race condition where a context is activated just after
the reset check is performed, contexts_num is atomically set to '-1'
after the reset check to indicate that no more contexts can be activated
on the card.

Before activating a context we atomically test if contexts_num is
non-negative and if so, increment its value by one. In case the value of
contexts_num is negative then it indicates that the card is about to be
reset and context activation is error-ed out at that point.
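
The counter protocol described above can be sketched in userspace with
C11 atomics. struct card and the four function names are hypothetical;
the sketch only covers the ordinary path (activate, deactivate, reset
check) and leaves out the forced reset requested by writing -1 to sysfs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counter described above. */
struct card { atomic_int contexts_num; };

/* Activate a context: increment unless the counter is negative. */
bool context_activate(struct card *c)
{
    int cur = atomic_load(&c->contexts_num);

    while (cur >= 0) {
        if (atomic_compare_exchange_weak(&c->contexts_num, &cur, cur + 1))
            return true;
        /* cur now holds the fresh value; loop re-checks the sign */
    }
    return false;               /* reset pending: activation refused */
}

void context_deactivate(struct card *c)
{
    atomic_fetch_sub(&c->contexts_num, 1);
}

/* Reset check: succeed only when idle, and block new activations. */
bool reset_begin(struct card *c)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&c->contexts_num, &expected, -1);
}

/* Hypothetical counterpart that re-allows activations after a reset. */
void reset_end(struct card *c)
{
    atomic_store(&c->contexts_num, 0);
}
```

Doing the "is it zero?" check and the switch to -1 in one compare-and-
exchange is what closes the window in which a context could be activated
between the check and the reset.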

Fixes: 62fa19d4 ("cxl: Add ability to reset the card")
Cc: stable@vger.kernel.org # v4.0+
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

04 Oct, 2016 1 commit

cxl: Flush PSL cache before resetting the adapter · aaa2245e

Authored by Frederic Barrat on Oct 03, 2016

If the capi link is going down while the PSL owns a dirty cache line,
any access from the host for that data could lead to an Uncorrectable
Error.

So when resetting the capi adapter through sysfs, make sure the PSL
cache is flushed. It won't help if there are any active Process
Elements on the card, as the cache would likely get new dirty cache
lines immediately, but if resetting an idle adapter, it should avoid
any bad surprises from data left over from terminated Process Elements.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

13 Sep, 2016 1 commit

cxl: Fix informational message · b135077b

Authored by Frederic Barrat on Sep 12, 2016

When set_sl_ops() is called, the adapter data structure is not fully
initialized yet. Therefore the device name is not showing up in the
trace. Fix is simply to get the device name from the pci_dev
structure.

Fixes: 6d382616 ("cxl: Abstract the differences between the PSL and XSL")
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

10 Aug, 2016 1 commit

cxl: Set psl_fir_cntl to production environment value · c6d2ee09

Authored by Frederic Barrat on Aug 08, 2016

Switch the setting of psl_fir_cntl from debug to production
environment recommended value. It mostly affects the PSL behavior when
an error is raised in psl_fir1/2.

Tested with cxlflash.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

09 Aug, 2016 1 commit

cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests · 16479337

Authored by Andrew Donnellan on Jul 28, 2016

Commit f67a6722 ("cxl: Workaround PE=0 hardware limitation in
Mellanox CX4") added a "min_pe" field to struct cxl_service_layer_ops,
to allow us to work around a Mellanox CX-4 hardware limitation.

When allocating the PE number in cxl_context_init(), we read from
ctx->afu->adapter->native->sl_ops->min_pe to get the minimum PE number.
Unsurprisingly, in a PowerVM guest ctx->afu->adapter->native is NULL,
and guests don't have a cxl_service_layer_ops struct anywhere.

Move min_pe from struct cxl_service_layer_ops to struct cxl so it's
accessible in both native and PowerVM environments. For the Mellanox
CX-4, set the min_pe value in set_sl_ops().

Fixes: f67a6722 ("cxl: Workaround PE=0 hardware limitation in Mellanox CX4")
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>