提交 · 82715a0f332843d3a1830d7ebc9ac7c99a00c880 · openeuler / Kernel

20 8月, 2020 1 次提交

powerpc/powernv/pci: Fix possible crash when releasing DMA resources · e17a7c0e

由 Frederic Barrat 提交于 8月 19, 2020

Fix a typo introduced during recent code cleanup, which could lead to
silently not freeing resources or an oops message (on PCI hotplug or
CAPI reset).

Only impacts ioda2, the code path for ioda1 is correct.

Fixes: 01e12629 ("powerpc/powernv/pci: Add explicit tracking of the DMA setup state")
Signed-off-by: NFrederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200819130741.16769-1-fbarrat@linux.ibm.com

e17a7c0e

03 8月, 2020 1 次提交

powerpc/powernv/sriov: Fix use of uninitialised variable · 2075ec98

由 Oliver O'Halloran 提交于 8月 03, 2020

Initialising the value before using it is generally regarded as a good
idea so do that.

Fixes: 4c51f3e1 ("powerpc/powernv/sriov: Make single PE mode a per-BAR setting")
Reported-by: NNathan Chancellor <natechancellor@gmail.com>
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200803075408.132601-1-oohall@gmail.com

2075ec98

29 7月, 2020 2 次提交

powerpc/powernv/sriov: Remove unused but set variable 'phb' · cf1ae052

由 Wei Yongjun 提交于 7月 28, 2020

Gcc report warning as follows:

arch/powerpc/platforms/powernv/pci-sriov.c:602:25: warning:
 variable 'phb' set but not used [-Wunused-but-set-variable]
  602 |  struct pnv_phb        *phb;
      |                         ^~~

This variable is not used, so this commit removing it.
Reported-by: NHulk Robot <hulkci@huawei.com>
Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
Acked-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727171112.2781-1-weiyongjun1@huawei.com

cf1ae052

powerpc: Use fallthrough pseudo-keyword · 5e66a0cb

由 Gustavo A. R. Silva 提交于 7月 27, 2020

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727224201.GA10133@embeddedor

5e66a0cb

26 7月, 2020 24 次提交

powerpc/powernv/pci.h: delete duplicated word · 86052e40

由 Randy Dunlap 提交于 7月 25, 2020

Drop the repeated word "for".
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200726003809.20454-10-rdunlap@infradead.org

86052e40

powerpc/powernv/sriov: Remove vfs_expanded · 84d8505e

由 Oliver O'Halloran 提交于 7月 22, 2020

Previously iov->vfs_expanded was used for two purposes.

1) To work out how much we need to multiple the per-VF BAR size to figure
   out the total space required for the IOV BAR.

2) To indicate that IOV is not usable with this device (vfs_expanded == 0).

We don't really need the field for either since the multiple in 1) is
always the number PEs supported by the PHB. Similarly, we don't really need
it in 2) either since the IOV data field will be NULL if we can't use IOV
with the device.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-16-oohall@gmail.com

84d8505e

powerpc/powernv/sriov: Make single PE mode a per-BAR setting · 4c51f3e1

由 Oliver O'Halloran 提交于 7月 22, 2020

Using single PE BARs to map an SR-IOV BAR is really a choice about what
strategy to use when mapping a BAR. It doesn't make much sense for this to
be a global setting since a device might have one large BAR which needs to
be mapped with single PE windows and another smaller BAR that can be mapped
with a regular segmented window. Make the segmented vs single decision a
per-BAR setting and clean up the logic that decides which mode to use.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-15-oohall@gmail.com

4c51f3e1

powerpc/powernv/sriov: Refactor M64 BAR setup · a0be516f

由 Oliver O'Halloran 提交于 7月 22, 2020

Split up the logic so that we have one branch that handles setting up a
segmented window and another that handles setting up single PE windows for
each VF.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-14-oohall@gmail.com

a0be516f

powerpc/powernv/sriov: Move M64 BAR allocation into a helper · 39efc03e

由 Oliver O'Halloran 提交于 7月 22, 2020

I want to refactor the loop this code is currently inside of. Hoist it on
out.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-13-oohall@gmail.com

39efc03e

powerpc/powernv/sriov: De-indent setup and teardown · 052da31d

由 Oliver O'Halloran 提交于 7月 22, 2020

Remove the IODA2 PHB checks. We already assume IODA2 in several places so
there's not much point in wrapping most of the setup and teardown process
in an if block.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-12-oohall@gmail.com

052da31d

powerpc/powernv/sriov: Drop iov->pe_num_map[] · d29a2488

由 Oliver O'Halloran 提交于 7月 22, 2020

Currently the iov->pe_num_map[] does one of two things depending on
whether single PE mode is being used or not. When it is, this contains an
array which maps a vf_index to the corresponding PE number. When single PE
mode is not being used this contains a scalar which is the base PE for the
set of enabled VFs (for for VFn is base + n).

The array was necessary because when calling pnv_ioda_alloc_pe() there is
no guarantee that the allocated PEs would be contigious. We can now
allocate contigious blocks of PEs so this is no longer an issue. This
allows us to drop the if (single_mode) {} .. else {} block scattered
through the SR-IOV code which is a nice clean up.

This also fixes a bug in pnv_pci_sriov_disable() which is the non-atomic
bitmap_clear() to manipulate the PE allocation map. Other users of the map
assume it will be accessed with atomic ops.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-11-oohall@gmail.com

d29a2488

powerpc/powernv/pci: Refactor pnv_ioda_alloc_pe() · a4bc676e

由 Oliver O'Halloran 提交于 7月 22, 2020

Rework the PE allocation logic to allow allocating blocks of PEs rather
than individually. We'll use this to allocate contigious blocks of PEs for
the SR-IOVs.

This patch also adds code to pnv_ioda_alloc_pe() and pnv_ioda_reserve_pe() to
use the existing, but unused, phb->pe_alloc_mutex. Currently these functions
use atomic bit ops to release a currently allocated PE number. However,
the pnv_ioda_alloc_pe() wants to have exclusive access to the bit map while
scanning for hole large enough to accomodate the allocation size.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-10-oohall@gmail.com

a4bc676e

powerpc/powernv/sriov: Factor out M64 BAR setup · a610d35c

由 Oliver O'Halloran 提交于 7月 22, 2020

The sequence required to use the single PE BAR mode is kinda janky and
requires a little explanation. The API was designed with P7-IOC style
windows where the setup process is something like:

1. Configure the window start / end address
2. Enable the window
3. Map the segments of each window to the PE

For Single PE BARs the process is:

1. Set the PE for segment zero on a disabled window
2. Set the range
3. Enable the window

Move the OPAL calls into their own helper functions where the quirks can be
contained.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-9-oohall@gmail.com

a610d35c

powerpc/powernv/sriov: Simplify used window tracking · ad9add52

由 Oliver O'Halloran 提交于 7月 22, 2020

No need for the multi-dimensional arrays, just use a bitmap.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-8-oohall@gmail.com

ad9add52

powerpc/powernv/sriov: Rename truncate_iov · fac248f8

由 Oliver O'Halloran 提交于 7月 22, 2020

This prevents SR-IOV being used by making the SR-IOV BAR resources
unallocatable. Rename it to reflect what it actually does.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-7-oohall@gmail.com

fac248f8

powerpc/powernv/sriov: Explain how SR-IOV works on PowerNV · ff79e11a

由 Oliver O'Halloran 提交于 7月 22, 2020

SR-IOV support on PowerNV is a byzantine maze of hooks. I have no idea
how anyone is supposed to know how it works except through a lot of
stuffering. Write up some docs about the overall story to help out
the next sucker^Wperson who needs to tinker with it.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-6-oohall@gmail.com

ff79e11a

powerpc/powernv/sriov: Move SR-IOV into a separate file · 37b59ef0

由 Oliver O'Halloran 提交于 7月 22, 2020

pci-ioda.c is getting a bit unwieldly due to the amount of stuff jammed in
there. The SR-IOV support can be extracted easily enough and is mostly
standalone, so move it into a separate file.

This patch also moves the PowerNV SR-IOV specific fields from pci_dn and
moves them into a platform specific structure. I'm not sure how they ended
up in there in the first place, but leaking platform specifics into common
code has proven to be a terrible idea so far so lets stop doing that.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-5-oohall@gmail.com

37b59ef0

powerpc/powernv/pci: Initialise M64 for IODA1 as a 1-1 window · 36963365

由 Oliver O'Halloran 提交于 7月 22, 2020

We pre-configure the m64 window for IODA1 as a 1-1 segment-PE mapping,
similar to PHB3. Currently the actual mapping of segments occurs in
pnv_ioda_pick_m64_pe(), but we can move it into pnv_ioda1_init_m64() and
drop the IODA1 specific code paths in the PE setup / teardown.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-4-oohall@gmail.com

36963365

powerpc/powernv/pci: Add explicit tracking of the DMA setup state · 01e12629

由 Oliver O'Halloran 提交于 7月 22, 2020

There's an optimisation in the PE setup which skips performing DMA
setup for a PE if we only have bridges in a PE. The assumption being
that only "real" devices will DMA to system memory, which is probably
fair. However, if we start off with only bridge devices in a PE then
add a non-bridge device the new device won't be able to use DMA because
we never configured it.

Fix this (admittedly pretty weird) edge case by tracking whether we've done
the DMA setup for the PE or not. If a non-bridge device is added to the PE
(via rescan or hotplug, or whatever) we can set up DMA on demand.

This also means the only remaining user of the old "DMA Weight" code is
the IODA1 DMA setup code that it was originally added for, which is good.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-3-oohall@gmail.com

01e12629

powerpc/powernv/pci: Always tear down DMA windows on PE release · 7a52ffab

由 Oliver O'Halloran 提交于 7月 22, 2020

Currently we have these two functions:

	pnv_pci_ioda2_release_dma_pe(), and
	pnv_pci_ioda2_release_pe_dma()

The first is used when tearing down VF PEs and the other is used for normal
devices. There's very little difference between the two though. The latter
(non-VF) will skip a call to pnv_pci_ioda2_unset_window() unless
CONFIG_IOMMU_API=y is set. There's no real point in doing this so fold the
two together.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-2-oohall@gmail.com

7a52ffab

powerpc/powernv/pci: Add pci_bus_to_pnvhb() helper · 5609ffdd

由 Oliver O'Halloran 提交于 7月 22, 2020

Add a helper to go from a pci_bus structure to the pnv_phb that hosts
that bus. There's a lot of instances of the following pattern:

	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
	struct pnv_phb *phb = hose->private_data;

Without any other uses of the pci_controller inside the function. This
is hard to read since it requires you to memorise the contents of the
private data fields and kind of error prone since it involves blindly
assigning a void pointer. Add a helper to make it more concise and
explicit.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200722065715.1432738-1-oohall@gmail.com

5609ffdd

powerpc/eeh: Move PE tree setup into the platform · a131bfc6

由 Oliver O'Halloran 提交于 7月 25, 2020

The EEH core has a concept of a "PE tree" to support PowerNV. The PE tree
follows the PCI bus structures because a reset asserted on an upstream
bridge will be propagated to the downstream bridges. On pseries there's a
1-1 correspondence between what the guest sees are a PHB and a PE so the
"tree" is really just a single node.

Current the EEH core is reponsible for setting up this PE tree which it
does by traversing the pci_dn tree. The structure of the pci_dn tree
matches the bus tree on PowerNV which leads to the PE tree being "correct"
this setup method doesn't make a whole lot of sense and it's actively
confusing for the pseries case where it doesn't really do anything.

We want to remove the dependence on pci_dn anyway so this patch move
choosing where to insert a new PE into the platform code rather than
being part of the generic EEH code. For PowerNV this simplifies the
tree building logic and removes the use of pci_dn. For pseries we
keep the existing logic. I'm not really convinced it does anything
due to the 1-1 PE-to-PHB correspondence so every device under that
PHB should be in the same PE, but I'd rather not remove it entirely
until we've had a chance to look at it more deeply.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200725081231.39076-14-oohall@gmail.com

a131bfc6

powerpc/eeh: Rename eeh_{add_to|remove_from}_parent_pe() · d923ab7a

由 Oliver O'Halloran 提交于 7月 25, 2020

The naming of eeh_{add_to|remove_from}_parent_pe() doesn't really reflect
what they actually do. If the PE referred to be edev->pe_config_addr
already exists under that PHB then the edev is added to that PE. However,
if the PE doesn't exist the a new one is created for the edev.

The bulk of the implementation of eeh_add_to_parent_pe() covers that
second case. Similarly, most of eeh_remove_from_parent_pe() is
determining when it's safe to delete a PE.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200725081231.39076-12-oohall@gmail.com

d923ab7a

powerpc/eeh: Remove class code field from edev · 768a4284

由 Oliver O'Halloran 提交于 7月 25, 2020

The edev->class_code field is never referenced anywhere except for the
platform specific probe functions. The same information is available in
the pci_dev for PowerNV and in the pci_dn on pseries so we can remove
the field.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200725081231.39076-11-oohall@gmail.com

768a4284

powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config() · 17d2a487

由 Oliver O'Halloran 提交于 7月 25, 2020

Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference
a specific device rather than pci_dn. No functional changes.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200725081231.39076-9-oohall@gmail.com

17d2a487

powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config() · 0c2c7652

由 Oliver O'Halloran 提交于 7月 25, 2020

0c2c7652

powerpc/eeh: Remove VF config space restoration · 21b43bd5

由 Oliver O'Halloran 提交于 7月 25, 2020

There's a bunch of strange things about this code. First up is that none of
the fields being written to are functional for a VF. The SR-IOV
specification lists then as "Reserved, but OS should preserve" so writing
new values to them doesn't do anything and is clearly wrong from a
correctness perspective.

However, since VFs are designed to be managed by the OS there is an
argument to be made that we should be saving and restoring some parts of
config space. We already sort of do that by saving the first 64 bytes of
config space in the eeh_dev (see eeh_dev->config_space[]). This is
inadequate since it doesn't even consider saving and restoring the PCI
capability structures. However, this is a problem with EEH in general and
that needs to be fixed for non-VF devices too.

There's no real reason to keep around this around so delete it.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200725081231.39076-6-oohall@gmail.com

21b43bd5

powerpc/eeh: Kill off eeh_ops->get_pe_addr() · a40db934

由 Oliver O'Halloran 提交于 7月 25, 2020

This is used in precisely one place which is in pseries specific platform
code.  There's no need to have the callback in eeh_ops since the platform
chooses the EEH PE addresses anyway. The PowerNV implementation has always
been a stub too so remove it.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200725081231.39076-5-oohall@gmail.com

a40db934

23 7月, 2020 3 次提交

powerpc/powernv/idle: Exclude mfspr on HID1, 4, 5 on P9 and above · 5c92fb1b

由 Pratik Rajesh Sampat 提交于 7月 21, 2020

POWER9 onwards the support for the registers HID1, HID4, HID5 has been
receded.
Although mfspr on the above registers worked in Power9, In Power10
simulator is unrecognized. Moving their assignment under the
check for machines lower than Power9
Signed-off-by: NPratik Rajesh Sampat <psampat@linux.ibm.com>
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200721153708.89057-4-psampat@linux.ibm.com

5c92fb1b

powerpc/powernv/idle: Rename pnv_first_spr_loss_level variable · dcbbfa6b

由 Pratik Rajesh Sampat 提交于 7月 21, 2020

Replace the variable name from using "pnv_first_spr_loss_level" to
"deep_spr_loss_state".

pnv_first_spr_loss_level is supposed to be the earliest state that
has OPAL_PM_LOSE_FULL_CONTEXT set, in other places the kernel uses the
"deep" states as terminology. Hence renaming the variable to be coherent
to its semantics.
Signed-off-by: NPratik Rajesh Sampat <psampat@linux.ibm.com>
Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200721153708.89057-3-psampat@linux.ibm.com

dcbbfa6b

powerpc/powernv/idle: Replace CPU feature check with PVR check · 8747bf36

由 Pratik Rajesh Sampat 提交于 7月 21, 2020

The POWER9 idle driver contains implementation-specific details that
means it is not suitable to run on any processor that implements ISA
v3.0 (e.g., POWER10), so only init the driver when running on a
POWER9.
Signed-off-by: NPratik Rajesh Sampat <psampat@linux.ibm.com>
[mpe: Use updated change log from Nick]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200721153708.89057-2-psampat@linux.ibm.com

8747bf36

22 7月, 2020 1 次提交

powerpc/perf: BHRB control to disable BHRB logic when not used · 1cade527

由 Athira Rajeev 提交于 7月 17, 2020

PowerISA v3.1 has few updates for the Branch History Rolling
Buffer(BHRB).

BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
bit, namely "BHRB Recording Disable (BHRBRD)". This field controls
whether BHRB entries are written when BHRB recording is enabled by
other bits. This patch implements support for this BHRB disable bit.
By setting 0b1 to this bit will disable the BHRB and by setting 0b0 to
this bit will have BHRB enabled. This addresses backward
compatibility (for older OS), since this bit will be cleared and
hardware will be writing to BHRB by default.

This patch addresses changes to set MMCRA (BHRBRD) at boot for
power10 (there by the core will run faster) and enable this feature
only on runtime ie, on explicit need from user. Also save/restore
MMCRA in the restore path of state-loss idle state to make sure we
keep BHRB disabled if it was not enabled on request at runtime.
Signed-off-by: NAthira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-12-git-send-email-atrajeev@linux.vnet.ibm.com

1cade527

20 7月, 2020 1 次提交

powerpc/mm/radix: Create separate mappings for hot-plugged memory · af9d00e9

由 Aneesh Kumar K.V 提交于 7月 09, 2020

To enable memory unplug without splitting kernel page table
mapping, we force the max mapping size to the LMB size. LMB
size is the unit in which hypervisor will do memory add/remove
operation.

Pseries systems supports max LMB size of 256MB. Hence on pseries,
we now end up mapping memory with 2M page size instead of 1G. To improve
that we want hypervisor to hint the kernel about the hotplug
memory range. That was added that as part of

commit b6eca183 ("powerpc/kernel: Enables memory
hot-remove after reboot on pseries guests")

But PowerVM doesn't provide that hint yet. Once we get PowerVM
updated, we can then force the 2M mapping only to hot-pluggable
memory region using memblock_is_hotpluggable(). Till then
let's depend on LMB size for finding the mapping page size
for linear range.

With this change KVM guest will also be doing linear mapping with
2M page size.

The actual TLB benefit of mapping guest page table entries with
hugepage size can only be materialized if the partition scoped
entries are also using the same or higher page size. A guest using
1G hugetlbfs backing guest memory can have a performance impact with
the above change.
Signed-off-by: NBharata B Rao <bharata@linux.ibm.com>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Fold in fix from Aneesh spotted by lkp@intel.com]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200709131925.922266-5-aneesh.kumar@linux.ibm.com

af9d00e9

15 7月, 2020 3 次提交

powerpc/vas: Report proper error code for address translation failure · 6068e1a4

由 Haren Myneni 提交于 7月 10, 2020

P9 DD2 NX workbook (Table 4-36) says DMA controller uses CC=5
internally for translation fault handling. NX reserves CC=250 for
OS to notify user space when NX encounters address translation
failure on the request buffer. Not an issue in earlier releases
as NX does not get faults on kernel addresses.

This patch defines CSB_CC_FAULT_ADDRESS(250) and updates CSB.CC with
this proper error code for user space.

Fixes: c96c4436 ("powerpc/vas: Update CSB and notify process for fault CRBs")
Signed-off-by: NHaren Myneni <haren@linux.ibm.com>
[mpe: Added Fixes tag and fix typo in comment]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/019fd53e7538c6f8f332d175df74b1815ef5aa8c.camel@linux.ibm.com

6068e1a4

powerpc/powernv: Move pnv_ioda_setup_bus_dma under CONFIG_IOMMU_API · e3417fae

由 Oliver O'Halloran 提交于 7月 05, 2020

pnv_ioda_setup_bus_dma() is only used when a passed through PE is
returned to the host. If the kernel is built without IOMMU support
this is dead code. Move it under the #ifdef with the rest of the
IOMMU API support.
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200705133557.443607-2-oohall@gmail.com

e3417fae

powerpc/powernv: Make pnv_pci_sriov_enable() and friends static · 93eacd94

由 Oliver O'Halloran 提交于 7月 05, 2020

The kernel test robot noticed these are non-static which causes Clang to
print some warnings. These are called via ppc_md function pointers so
there's no need for them to be non-static.
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200705133557.443607-1-oohall@gmail.com

93eacd94

22 6月, 2020 1 次提交

powerpc/powernv/ioda: Return correct error if TCE level allocation failed · 5f202c1a

由 Alexey Kardashevskiy 提交于 6月 17, 2020

The iommu_table_ops::xchg_no_kill() callback updates TCE. It is quite
possible that not entire table is allocated if it is huge and multilevel
so xchg may also allocate subtables. If failed, it returns H_HARDWARE
for failed allocation and H_TOO_HARD if it needs it but cannot do because
the alloc parameter is "false" (set when called with MMU=off to force
retry with MMU=on).

The problem is that having separate errors only matters in real mode
(MMU=off) but the only caller with alloc="false" does not check the exact
error code and simply returns H_TOO_HARD; and for every other mode
alloc is "true". Also, the function is also called from the ioctl()
handler of the VFIO SPAPR TCE IOMMU subdriver which does not expect
hypervisor error codes (H_xxx) and will expose them to the userspace.

This converts wrong error codes to -ENOMEM.
Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200617003835.48831-1-aik@ozlabs.ru

5f202c1a

11 6月, 2020 1 次提交

kernel: better document the use_mm/unuse_mm API contract · f5678e7f

由 Christoph Hellwig 提交于 6月 10, 2020

Switch the function documentation to kerneldoc comments, and add
WARN_ON_ONCE asserts that the calling thread is a kernel thread and does
not have ->mm set (or has ->mm set in the case of unuse_mm).

Also give the functions a kthread_ prefix to better document the use case.

[hch@lst.de: fix a comment typo, cover the newly merged use_mm/unuse_mm caller in vfio]
  Link: http://lkml.kernel.org/r/20200416053158.586887-3-hch@lst.de
[sfr@canb.auug.org.au: powerpc/vas: fix up for {un}use_mm() rename]
  Link: http://lkml.kernel.org/r/20200422163935.5aa93ba5@canb.auug.org.auSigned-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Tested-by: NJens Axboe <axboe@kernel.dk>
Reviewed-by: NJens Axboe <axboe@kernel.dk>
Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [usb]
Acked-by: NHaren Myneni <haren@linux.ibm.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Link: http://lkml.kernel.org/r/20200404094101.672954-6-hch@lst.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f5678e7f

28 5月, 2020 2 次提交

powerpc/powernv/pci: Sprinkle around some WARN_ON()s · 6ae8aedf

由 Oliver O'Halloran 提交于 4月 17, 2020

pnv_pci_ioda_configure_bus() should now only ever be called when a device is
added to the bus so add a WARN_ON() to the empty bus check. Similarly,
pnv_pci_ioda_setup_bus_PE() should only ever be called for an unconfigured PE,
so add a WARN_ON() for that case too.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200417073508.30356-5-oohall@gmail.com

6ae8aedf

powerpc/powernv/pci: Reserve the root bus PE during init · 718d249a

由 Oliver O'Halloran 提交于 4月 17, 2020

Doing it once during boot rather than doing it on the fly and drop the janky
populated logic.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200417073508.30356-4-oohall@gmail.com

718d249a

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功