1. 14 Jan 2022: 1 commit
  2. 15 Nov 2021: 1 commit
  3. 16 Jul 2021: 4 commits
  4. 22 Nov 2020: 1 commit
    •
      irqchip/gic-v3-its: Unconditionally save/restore the ITS state on suspend · 74cde1a5
      Authored by Xu Qiang
      On systems without HW-based collections (i.e. anything except GIC-500),
      we rely on firmware to perform the ITS save/restore. This doesn't
      really work, as although FW can properly save everything, it cannot
      fully restore the state of the command queue (the read-side is reset
      to the head of the queue). This results in the ITS consuming previously
      processed commands, potentially corrupting the state.
      
      Instead, let's always save the ITS state on suspend, disabling it in the
      process, and restore the full state on resume. This saves us from broken
      FW as long as it doesn't enable the ITS by itself (for which we can't do
      anything).
      
      This amounts to simply dropping the ITS_FLAGS_SAVE_SUSPEND_STATE.
      Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
      [maz: added warning on resume, rewrote commit message]
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20201107104226.14282-1-xuqiang36@huawei.com
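      The save-and-disable-on-suspend, restore-on-resume flow can be sketched in plain C. This is an illustrative model only: the GITS_* names follow the architecture, but the structs and helpers are stand-ins, not the kernel driver's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GITS_CTLR_ENABLE (1u << 0)

struct its_regs {
    uint32_t ctlr;
    uint64_t cbaser;
    uint64_t cwriter;   /* write-side pointer of the command queue */
};

struct its_saved {
    uint32_t ctlr;
    uint64_t cbaser;
    uint64_t cwriter;
};

/* Suspend: save the state unconditionally, then disable the ITS so
 * nothing can consume previously processed commands behind our back. */
static void its_suspend(struct its_regs *its, struct its_saved *s)
{
    s->ctlr = its->ctlr;
    s->cbaser = its->cbaser;
    s->cwriter = its->cwriter;
    its->ctlr &= ~GITS_CTLR_ENABLE;
}

/* Resume: warn if firmware re-enabled the ITS (we can't do anything
 * about that), then restore the full state, command queue included. */
static int its_resume(struct its_regs *its, const struct its_saved *s)
{
    int fw_enabled = its->ctlr & GITS_CTLR_ENABLE;

    if (fw_enabled)
        fprintf(stderr, "WARN: firmware enabled the ITS on resume\n");

    its->cbaser = s->cbaser;
    its->cwriter = s->cwriter;
    its->ctlr = s->ctlr;
    return fw_enabled ? -1 : 0;
}
```

      Restoring the queue pointers ourselves is what the broken firmware path could not do: it reset the read side to the head of the queue.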
  5. 14 Oct 2020: 1 commit
    •
      memblock: implement for_each_reserved_mem_region() using __next_mem_region() · 9f3d5eaa
      Authored by Mike Rapoport
      Iteration over memblock.reserved with for_each_reserved_mem_region() used
      __next_reserved_mem_region() that implemented a subset of
      __next_mem_region().
      
      Use __for_each_mem_range() and, essentially, __next_mem_region() with
      appropriate parameters to reduce code duplication.
      
      While on it, rename for_each_reserved_mem_region() to
      for_each_reserved_mem_range() for consistency.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-17-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
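      The deduplication pattern the commit applies can be modeled in a few lines: one generic region stepper, with the specialized reserved-only iterator defined on top of it. The names below are made up for the sketch and are not memblock's real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region { uint64_t base, size; };
struct region_array { const struct region *regions; size_t cnt; };

/* One generic stepper, parameterized by the array to walk; it
 * advances *idx itself and returns NULL at the end. */
static const struct region *next_region(const struct region_array *a,
                                        size_t *idx)
{
    if (*idx >= a->cnt)
        return NULL;
    return &a->regions[(*idx)++];
}

/* The specialized iterator is just the generic one with the reserved
 * array plugged in -- no duplicated walking logic to maintain. */
#define for_each_reserved_mem_range(a, i, r) \
    for ((i) = 0; ((r) = next_region((a), &(i))) != NULL; )
```

      This mirrors the commit's intent: `__next_reserved_mem_region()` duplicated a subset of `__next_mem_region()`, so the reserved walk is re-expressed in terms of the general one.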
  6. 24 Sep 2020: 1 commit
    •
      irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory · 95ac5bf4
      Authored by Jonathan Cameron
      Note this crash is present before any of the patches in this series, but
      as explained below it is highly unlikely that anyone is shipping firmware
      that triggers it. Tests were done using an overridden SRAT.
      
      On ARM64, the gic-v3 driver directly parses SRAT to locate GIC Interrupt
      Translation Service (ITS) Affinity Structures. This is done much later
      in the boot than the SRAT parsing passes that identify proximity domains.
      
      As a result, an ITS placed in a proximity domain that is not defined by
      another SRAT structure will result in a NUMA node that is not completely
      configured and a crash.
      
      ITS [mem 0x202100000-0x20211ffff]
      ITS@0x0000000202100000: Using ITS number 0
      Unable to handle kernel paging request at virtual address 0000000000001a08
      ...
      
      Call trace:
        __alloc_pages_nodemask+0xe8/0x338
        alloc_pages_node.constprop.0+0x34/0x40
        its_probe_one+0x2f8/0xb18
        gic_acpi_parse_madt_its+0x108/0x150
        acpi_table_parse_entries_array+0x17c/0x264
        acpi_table_parse_entries+0x48/0x6c
        acpi_table_parse_madt+0x30/0x3c
        its_init+0x1c4/0x644
        gic_init_bases+0x4b8/0x4ec
        gic_acpi_init+0x134/0x264
        acpi_match_madt+0x4c/0x84
        acpi_table_parse_entries_array+0x17c/0x264
        acpi_table_parse_entries+0x48/0x6c
        acpi_table_parse_madt+0x30/0x3c
        __acpi_probe_device_table+0x8c/0xe8
        irqchip_init+0x3c/0x48
        init_IRQ+0xcc/0x100
        start_kernel+0x33c/0x548
      
      ACPI 6.3 allows any set of Affinity Structures in SRAT to define a proximity
      domain.  However, as we do not see this crash, we can conclude that no
      firmware is currently placing an ITS in a node that is separate from
      those containing memory and / or processors.
      
      We could modify the SRAT parsing behavior to identify the existence
      of Proximity Domains unique to the ITS structures, and handle them as
      a special case of a generic initiator (once support for those merges).
      
      This patch avoids the complexity that would be needed to handle this corner
      case by not allowing the ITS entry parsing code to instantiate new NUMA
      nodes.  If one is encountered that does not already exist, NUMA_NO_NODE
      is assigned and a warning is printed, just as if the value had exceeded
      the number of allowed NUMA nodes.
      
      "SRAT: Invalid NUMA node -1 in ITS affinity"
      
      Whilst this does not provide the full flexibility allowed by ACPI,
      it does fix the problem.  We can revisit a more sophisticated solution if
      needed by future platforms.
      
      The change simply replaces acpi_map_pxm_to_node() with pxm_to_node(),
      reflecting the fact that a new mapping is not created.
      Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
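      The shape of the fix, create-on-demand versus lookup-only, can be sketched as follows. The table, encoding, and bounds here are illustrative stand-ins, not the kernel's actual pxm bookkeeping.

```c
#include <assert.h>

#define MAX_PXM 8
#define NUMA_NO_NODE (-1)

static int pxm_node_plus1[MAX_PXM]; /* 0 = no mapping yet */
static int nr_nodes;

/* Creates a node on first sight of a proximity domain; this is what
 * the early SRAT parsing passes are allowed to do. */
static int map_pxm_to_node(int pxm)
{
    if (pxm < 0 || pxm >= MAX_PXM)
        return NUMA_NO_NODE;
    if (!pxm_node_plus1[pxm])
        pxm_node_plus1[pxm] = ++nr_nodes;
    return pxm_node_plus1[pxm] - 1;
}

/* Lookup only: safe to call from the late ITS affinity parsing,
 * because it can never hand back a half-configured new node. */
static int pxm_to_node(int pxm)
{
    if (pxm < 0 || pxm >= MAX_PXM || !pxm_node_plus1[pxm])
        return NUMA_NO_NODE;
    return pxm_node_plus1[pxm] - 1;
}
```

      An ITS in a proximity domain no other SRAT structure defined then gets NUMA_NO_NODE and a warning, instead of a crash in the allocator.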
  7. 07 Sep 2020: 1 commit
  8. 24 Aug 2020: 1 commit
  9. 27 Jul 2020: 3 commits
    •
      genirq/affinity: Make affinity setting if activated opt-in · f0c7baca
      Authored by Thomas Gleixner
      John reported that on a RK3288 system the perf per CPU interrupts are all
      affine to CPU0 and provided the analysis:
      
       "It looks like what happens is that because the interrupts are not per-CPU
        in the hardware, armpmu_request_irq() calls irq_force_affinity() while
        the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
        IRQF_NOBALANCING.  
      
        Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
        irq_setup_affinity() which returns early because IRQF_PERCPU and
        IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."
      
      This was broken by the recent commit which blocked interrupt affinity
      setting in hardware before activation of the interrupt. While this works
      in general, it does not work for this particular case. Contrary to the
      initial analysis, not all interrupt chip drivers implement an activate
      callback, so the safe cure is to make the deferred interrupt affinity
      setting at activation time opt-in.
      
      Implement the necessary core logic and make the two irqchip implementations
      for which this is required opt-in. In hindsight this would have been the
      right thing to do, but ...
      
      Fixes: baedb87d ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
      Reported-by: John Keeping <john@metanate.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
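      The opt-in mechanism can be sketched like this: affinity changes on an inactive interrupt are deferred to activation only when the chip sets an explicit flag, and everyone else keeps the old immediate behavior. The flag name, struct layout, and helpers below are hypothetical, chosen for the sketch rather than taken from the genirq internals.

```c
#include <assert.h>
#include <stdbool.h>

#define AFFINITY_ON_ACTIVATE_OPTIN 0x1  /* hypothetical opt-in flag */

struct irq_chip { unsigned flags; };
struct irq_data {
    struct irq_chip *chip;
    bool activated;
    int cpu;            /* affinity currently programmed in hardware */
    int pending_cpu;    /* deferred target, -1 if none */
};

static int irq_set_affinity(struct irq_data *d, int cpu)
{
    if (!d->activated && (d->chip->flags & AFFINITY_ON_ACTIVATE_OPTIN)) {
        /* Opted in: remember the target, program HW at activation. */
        d->pending_cpu = cpu;
        return 0;
    }
    d->cpu = cpu;       /* default: program the hardware immediately */
    return 0;
}

static void irq_activate(struct irq_data *d)
{
    d->activated = true;
    if (d->pending_cpu >= 0) {
        d->cpu = d->pending_cpu;    /* apply the deferred affinity */
        d->pending_cpu = -1;
    }
}
```

      Chips without an activate callback thus never lose an affinity write, which is exactly the case the blanket deferral broke.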
    •
      irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table() · d1bd7e0b
      Authored by Zenghui Yu
      Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled
      box, I get the following kernel splat:
      
      [    0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567
      [    0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
      [    0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23
      [    0.053770] Call trace:
      [    0.053774]  dump_backtrace+0x0/0x218
      [    0.053775]  show_stack+0x2c/0x38
      [    0.053777]  dump_stack+0xc4/0x10c
      [    0.053779]  ___might_sleep+0xfc/0x140
      [    0.053780]  __might_sleep+0x58/0x90
      [    0.053782]  slab_pre_alloc_hook+0x7c/0x90
      [    0.053783]  kmem_cache_alloc_trace+0x60/0x2f0
      [    0.053785]  its_cpu_init+0x6f4/0xe40
      [    0.053786]  gic_starting_cpu+0x24/0x38
      [    0.053788]  cpuhp_invoke_callback+0xa0/0x710
      [    0.053789]  notify_cpu_starting+0xcc/0xd8
      [    0.053790]  secondary_start_kernel+0x148/0x200
      
       # ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40
      its_cpu_init+0x6f4/0xe40:
      allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818
      (inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138
      (inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166
      
      It turns out that we are allocating memory with GFP_KERNEL (which may
      sleep) from within the CPU hotplug notifier, which is indeed an atomic
      context. Bad things may happen on a system with more than a single
      CommonLPIAff group. Avoid it by turning this into an atomic allocation.
      
      Fixes: 5e516846 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation")
      Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20200630133746.816-1-yuzenghui@huawei.com
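      The rule the fix enforces can be modeled in a few lines: CPU hotplug "starting" callbacks run with interrupts off, where a GFP_KERNEL allocation may sleep and trips DEBUG_ATOMIC_SLEEP. The context flag and allocator below are an illustrative model, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

enum gfp { GFP_KERNEL, GFP_ATOMIC };

static bool irqs_disabled_ctx;   /* models in_atomic()/irqs_disabled() */
static bool slept_in_atomic;     /* models the BUG splat firing */

static void *kmalloc_model(unsigned sz, enum gfp flags)
{
    static char pool[256];
    (void)sz;
    /* GFP_KERNEL may sleep: in atomic context that's the bug. */
    if (flags == GFP_KERNEL && irqs_disabled_ctx)
        slept_in_atomic = true;  /* "sleeping function called..." */
    return pool;
}

/* Fixed allocate_vpe_l1_table() shape: it is reached from the hotplug
 * notifier, so it must allocate with GFP_ATOMIC. */
static void *allocate_vpe_l1_table_model(void)
{
    return kmalloc_model(128, GFP_ATOMIC);
}
```

      The real fix is exactly this one-flag change at the allocation site; the surrounding model only exists to make the constraint testable.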
    •
      irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR · 3af9571c
      Authored by Zenghui Yu
      The GICv4.1 spec tells us that it's CONSTRAINED UNPREDICTABLE to issue a
      register-based invalidation operation for a vPEID not mapped to that RD,
      or another RD within the same CommonLPIAff group.
      
      To follow this rule, commit f3a05921 ("irqchip/gic-v4.1: Ensure mutual
      exclusion between vPE affinity change and RD access") tried to address the
      race between the RD accesses and the vPE affinity change, but somehow
      forgot to take GICR_INVALLR into account. Let's take the vpe_lock before
      evaluating vpe->col_idx to fix it.
      
      Fixes: f3a05921 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access")
      Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20200720092328.708-1-yuzenghui@huawei.com
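      The locking rule can be sketched as follows: vpe->col_idx names the owning redistributor and may change under a concurrent affinity move, so it must only be read, and the INVALLR write issued, while holding the vPE lock. The lock is modeled as a flag here; structs and helpers are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

struct vpe {
    bool locked;     /* models raw_spin_lock(&vpe->vpe_lock) */
    int col_idx;     /* owning redistributor, changes on affinity moves */
};

static int last_rd_written = -1;

static void vpe_lock(struct vpe *v)   { v->locked = true; }
static void vpe_unlock(struct vpe *v) { v->locked = false; }

/* Fixed flow: evaluate col_idx and write INVALLR under vpe_lock, so
 * an affinity change cannot redirect the write to the wrong RD (which
 * the spec calls CONSTRAINED UNPREDICTABLE). */
static void its_vpe_invall(struct vpe *v)
{
    vpe_lock(v);
    assert(v->locked);          /* the invariant the fix establishes */
    last_rd_written = v->col_idx;
    vpe_unlock(v);
}
```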
  10. 23 Jun 2020: 1 commit
    •
      KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbell · a3f574cd
      Authored by Marc Zyngier
      When making a vPE non-resident because it has hit a blocking WFI,
      the doorbell can fire at any time after the write to the RD.
      Crucially, it can fire right between the write to GICR_VPENDBASER
      and the write to the pending_last field in the its_vpe structure.
      
      This means that we would overwrite pending_last with stale data,
      and potentially not wakeup until some unrelated event (such as
      a timer interrupt) puts the vPE back on the CPU.
      
      GICv4 isn't affected by this as we actively mask the doorbell on
      entering the guest, while GICv4.1 automatically manages doorbell
      delivery without any hypervisor-driven masking.
      
      Use the vpe_lock to synchronize such updates, which solves the
      problem altogether.
      
      Fixes: ae699ad3 ("irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer")
      Reported-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
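      The race and its cure can be sketched like this: both the doorbell handler and the non-residency path serialize on the vPE lock, and the non-residency path only ever raises pending_last, so a doorbell that fired right after the GICR_VPENDBASER write can no longer be clobbered with stale data. The lock is modeled as a flag; all names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

struct its_vpe {
    bool locked;        /* models vpe_lock */
    bool pending_last;
};

static void vpe_lock(struct its_vpe *v)
{
    assert(!v->locked); /* no interleaving while held */
    v->locked = true;
}
static void vpe_unlock(struct its_vpe *v) { v->locked = false; }

/* Doorbell handler: the vPE has a pending interrupt, wake it. */
static void doorbell_irq(struct its_vpe *v)
{
    vpe_lock(v);
    v->pending_last = true;
    vpe_unlock(v);
}

/* Non-residency path: reads PendingLast from the RD under the same
 * lock, and only ORs it in, never overwriting a doorbell's update. */
static void make_non_resident(struct its_vpe *v, bool rd_pending_last)
{
    vpe_lock(v);
    v->pending_last |= rd_pending_last;
    vpe_unlock(v);
}
```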
  11. 21 Jun 2020: 1 commit
  12. 20 May 2020: 1 commit
  13. 18 May 2020: 1 commit
  14. 16 Apr 2020: 2 commits
    •
      irqchip/gic-v4.1: Update effective affinity of virtual SGIs · 4b2dfe1e
      Authored by Marc Zyngier
      Although the vSGIs are not directly visible to the host, they still
      get moved around, by CPU hotplug for example. This results in
      the kernel moaning on the console, such as:
      
        genirq: irq_chip GICv4.1-sgi did not update eff. affinity mask of irq 38
      
      Updating the effective affinity on set_affinity() fixes it.
      Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
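      The missing step has a simple shape: a set_affinity callback must record the CPU it actually chose in the effective-affinity mask, or the core warns that the mask was not updated. Masks are plain bitmasks here and all names are illustrative, not the irqchip API.

```c
#include <assert.h>

struct irq_data {
    unsigned long affinity;       /* requested mask (bit per CPU) */
    unsigned long effective;      /* what the chip actually programmed */
};

static int first_cpu(unsigned long mask)
{
    int cpu = 0;
    if (!mask)
        return -1;
    while (!(mask & (1ul << cpu)))
        cpu++;
    return cpu;
}

static int sgi_set_affinity(struct irq_data *d, unsigned long mask)
{
    int cpu = first_cpu(mask);    /* pick one target CPU */
    if (cpu < 0)
        return -1;
    d->affinity = mask;
    d->effective = 1ul << cpu;    /* the update the fix adds */
    return 0;
}
```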
    •
      irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling · 96806229
      Authored by Marc Zyngier
      When a vPE is made resident, the GIC starts parsing the virtual pending
      table to deliver pending interrupts. This takes place asynchronously,
      and can at times take a long while. Long enough that the vcpu enters
      the guest and hits WFI before any interrupt has been signaled yet.
      The vcpu then exits, blocks, and now gets a doorbell. Rinse, repeat.
      
      In order to avoid the above, an (optional on GICv4, mandatory on v4.1)
      feature allows the GIC to feedback to the hypervisor whether it is
      done parsing the VPT by clearing the GICR_VPENDBASER.Dirty bit.
      The hypervisor can then wait until the GIC is ready before actually
      running the vPE.
      
      Plug in the detection code as well as polling on vPE schedule. While
      at it, tidy up the kernel message that displays the GICv4 optional
      features.
      Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
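      The wait described above amounts to polling GICR_VPENDBASER until the GIC clears the Dirty bit, bounded by a timeout. The register accessor below is a stand-in for an MMIO read against a fake redistributor, and the bit position is illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define VPENDBASER_DIRTY (1ull << 60)  /* illustrative bit position */

/* Fake redistributor: Dirty reads as set for N more polls, modeling
 * the GIC still parsing the virtual pending table. */
static int polls_until_clean;
static uint64_t read_vpendbaser(void)
{
    if (polls_until_clean > 0) {
        polls_until_clean--;
        return VPENDBASER_DIRTY;
    }
    return 0;
}

/* Hypervisor side: returns 0 once the VPT parse finished, -1 on
 * timeout. Only then is the vPE actually run. */
static int wait_for_vpt_parse(int max_polls)
{
    while (max_polls--) {
        if (!(read_vpendbaser() & VPENDBASER_DIRTY))
            return 0;
    }
    return -1;
}
```

      Waiting here breaks the WFI/doorbell ping-pong: the vcpu never enters the guest before pending interrupts can be signaled.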
  15. 24 Mar 2020: 6 commits
  16. 21 Mar 2020: 5 commits
  17. 19 Mar 2020: 2 commits
  18. 16 Mar 2020: 2 commits
    •
      irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependency · 7809f701
      Authored by Marc Zyngier
      On a very heavily loaded D05 with GICv4, I managed to trigger the
      following lockdep splat:
      
      [ 6022.598864] ======================================================
      [ 6022.605031] WARNING: possible circular locking dependency detected
      [ 6022.611200] 5.6.0-rc4-00026-geee7c7b0f498 #680 Tainted: G            E
      [ 6022.618061] ------------------------------------------------------
      [ 6022.624227] qemu-system-aar/7569 is trying to acquire lock:
      [ 6022.629789] ffff042f97606808 (&p->pi_lock){-.-.}, at: try_to_wake_up+0x54/0x7a0
      [ 6022.637102]
      [ 6022.637102] but task is already holding lock:
      [ 6022.642921] ffff002fae424cf0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x5c/0x98
      [ 6022.651350]
      [ 6022.651350] which lock already depends on the new lock.
      [ 6022.651350]
      [ 6022.659512]
      [ 6022.659512] the existing dependency chain (in reverse order) is:
      [ 6022.666980]
      [ 6022.666980] -> #2 (&irq_desc_lock_class){-.-.}:
      [ 6022.672983]        _raw_spin_lock_irqsave+0x50/0x78
      [ 6022.677848]        __irq_get_desc_lock+0x5c/0x98
      [ 6022.682453]        irq_set_vcpu_affinity+0x40/0xc0
      [ 6022.687236]        its_make_vpe_non_resident+0x6c/0xb8
      [ 6022.692364]        vgic_v4_put+0x54/0x70
      [ 6022.696273]        vgic_v3_put+0x20/0xd8
      [ 6022.700183]        kvm_vgic_put+0x30/0x48
      [ 6022.704182]        kvm_arch_vcpu_put+0x34/0x50
      [ 6022.708614]        kvm_sched_out+0x34/0x50
      [ 6022.712700]        __schedule+0x4bc/0x7f8
      [ 6022.716697]        schedule+0x50/0xd8
      [ 6022.720347]        kvm_arch_vcpu_ioctl_run+0x5f0/0x978
      [ 6022.725473]        kvm_vcpu_ioctl+0x3d4/0x8f8
      [ 6022.729820]        ksys_ioctl+0x90/0xd0
      [ 6022.733642]        __arm64_sys_ioctl+0x24/0x30
      [ 6022.738074]        el0_svc_common.constprop.3+0xa8/0x1e8
      [ 6022.743373]        do_el0_svc+0x28/0x88
      [ 6022.747198]        el0_svc+0x14/0x40
      [ 6022.750761]        el0_sync_handler+0x124/0x2b8
      [ 6022.755278]        el0_sync+0x140/0x180
      [ 6022.759100]
      [ 6022.759100] -> #1 (&rq->lock){-.-.}:
      [ 6022.764143]        _raw_spin_lock+0x38/0x50
      [ 6022.768314]        task_fork_fair+0x40/0x128
      [ 6022.772572]        sched_fork+0xe0/0x210
      [ 6022.776484]        copy_process+0x8c4/0x18d8
      [ 6022.780742]        _do_fork+0x88/0x6d8
      [ 6022.784478]        kernel_thread+0x64/0x88
      [ 6022.788563]        rest_init+0x30/0x270
      [ 6022.792390]        arch_call_rest_init+0x14/0x1c
      [ 6022.796995]        start_kernel+0x498/0x4c4
      [ 6022.801164]
      [ 6022.801164] -> #0 (&p->pi_lock){-.-.}:
      [ 6022.806382]        __lock_acquire+0xdd8/0x15c8
      [ 6022.810813]        lock_acquire+0xd0/0x218
      [ 6022.814896]        _raw_spin_lock_irqsave+0x50/0x78
      [ 6022.819761]        try_to_wake_up+0x54/0x7a0
      [ 6022.824018]        wake_up_process+0x1c/0x28
      [ 6022.828276]        wakeup_softirqd+0x38/0x40
      [ 6022.832533]        __tasklet_schedule_common+0xc4/0xf0
      [ 6022.837658]        __tasklet_schedule+0x24/0x30
      [ 6022.842176]        check_irq_resend+0xc8/0x158
      [ 6022.846609]        irq_startup+0x74/0x128
      [ 6022.850606]        __enable_irq+0x6c/0x78
      [ 6022.854602]        enable_irq+0x54/0xa0
      [ 6022.858431]        its_make_vpe_non_resident+0xa4/0xb8
      [ 6022.863557]        vgic_v4_put+0x54/0x70
      [ 6022.867469]        kvm_arch_vcpu_blocking+0x28/0x38
      [ 6022.872336]        kvm_vcpu_block+0x48/0x490
      [ 6022.876594]        kvm_handle_wfx+0x18c/0x310
      [ 6022.880938]        handle_exit+0x138/0x198
      [ 6022.885022]        kvm_arch_vcpu_ioctl_run+0x4d4/0x978
      [ 6022.890148]        kvm_vcpu_ioctl+0x3d4/0x8f8
      [ 6022.894494]        ksys_ioctl+0x90/0xd0
      [ 6022.898317]        __arm64_sys_ioctl+0x24/0x30
      [ 6022.902748]        el0_svc_common.constprop.3+0xa8/0x1e8
      [ 6022.908046]        do_el0_svc+0x28/0x88
      [ 6022.911871]        el0_svc+0x14/0x40
      [ 6022.915434]        el0_sync_handler+0x124/0x2b8
      [ 6022.919951]        el0_sync+0x140/0x180
      [ 6022.923773]
      [ 6022.923773] other info that might help us debug this:
      [ 6022.923773]
      [ 6022.931762] Chain exists of:
      [ 6022.931762]   &p->pi_lock --> &rq->lock --> &irq_desc_lock_class
      [ 6022.931762]
      [ 6022.942101]  Possible unsafe locking scenario:
      [ 6022.942101]
      [ 6022.948007]        CPU0                    CPU1
      [ 6022.952523]        ----                    ----
      [ 6022.957039]   lock(&irq_desc_lock_class);
      [ 6022.961036]                                lock(&rq->lock);
      [ 6022.966595]                                lock(&irq_desc_lock_class);
      [ 6022.973109]   lock(&p->pi_lock);
      [ 6022.976324]
      [ 6022.976324]  *** DEADLOCK ***
      
      This is happening because we have a pending doorbell that requires
      retrigger. As SW retriggering is done in a tasklet, we trigger the
      circular dependency above.
      
      The easy cop-out is to provide a retrigger callback that doesn't
      require acquiring any extra lock.
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20200310184921.23552-5-maz@kernel.org
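      The cop-out has this shape: if the chip provides an irq_retrigger callback, the resend path uses it directly instead of scheduling the software-resend tasklet, which is what dragged scheduler locks (via wakeup_softirqd) into the irq_desc lock chain. Struct layout and names below are illustrative, not the genirq internals.

```c
#include <assert.h>
#include <stdbool.h>

struct irq_data;
struct irq_chip {
    int (*irq_retrigger)(struct irq_data *d);
};
struct irq_data { struct irq_chip *chip; };

static bool tasklet_scheduled;
static bool hw_retriggered;

/* The new callback: pokes the hardware to re-deliver the interrupt
 * without acquiring any extra lock. */
static int its_irq_retrigger(struct irq_data *d)
{
    (void)d;
    hw_retriggered = true;
    return 1;   /* handled in hardware */
}

static void check_irq_resend(struct irq_data *d)
{
    if (d->chip->irq_retrigger && d->chip->irq_retrigger(d))
        return;                 /* hardware did it for us */
    tasklet_scheduled = true;   /* fallback: SW resend via tasklet,
                                 * the path that triggers the splat */
}
```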
    •
      irqchip/gic-v3-its: Probe ITS page size for all GITS_BASERn registers · d5df9dc9
      Authored by Marc Zyngier
      The GICv3 ITS driver assumes that once it has latched on a page size for
      a given BASER register, it can use the same page size as the maximum
      page size for all subsequent BASER registers.
      
      Although it worked so far, nothing in the architecture guarantees this,
      and Nianyao Tang hit this problem on some undisclosed implementation.
      
      Let's bite the bullet and probe the supported page size on all BASER
      registers before starting to populate the tables. This simplifies the
      setup a bit, at the expense of a few additional MMIO accesses.
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Reported-by: Nianyao Tang <tangnianyao@huawei.com>
      Tested-by: Nianyao Tang <tangnianyao@huawei.com>
      Link: https://lore.kernel.org/r/1584089195-63897-1-git-send-email-zhangshaokun@hisilicon.com
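      The probing change can be sketched as follows: instead of latching the first BASER's page size and assuming it for the rest, every GITS_BASERn is probed up front, trying the largest size first. The "register" here is modeled as a per-BASER maximum; the real driver writes the size and reads it back to see whether the implementation accepted it.

```c
#include <assert.h>

#define NR_BASERS 8
static const int sizes[] = { 65536, 16384, 4096 };  /* 64K, 16K, 4K */

/* Model of each register's capability; stands in for the
 * write-then-read-back probe against real hardware. */
static int baser_max_supported[NR_BASERS];

static int probe_baser_psz(int n)
{
    for (unsigned i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
        if (sizes[i] <= baser_max_supported[n])
            return sizes[i];    /* largest size this BASER accepts */
    return -1;
}

/* Probe all BASERs before populating any table, so every table uses a
 * page size its own register is known to support. */
static int probe_all(int psz_out[NR_BASERS])
{
    for (int n = 0; n < NR_BASERS; n++) {
        psz_out[n] = probe_baser_psz(n);
        if (psz_out[n] < 0)
            return -1;
    }
    return 0;
}
```

      A few extra MMIO accesses at init time buy correctness on implementations where the registers genuinely differ.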
  19. 08 Mar 2020: 1 commit
  20. 10 Feb 2020: 1 commit
  21. 08 Feb 2020: 3 commits