Commits · 09bb74816497ef8d8c1ff9ab51d2be14935da1f8 · openeuler / Kernel

27 Dec, 2019 40 commits

arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3 · 09bb7481

Authored by Will Deacon on Aug 07, 2018

mainline inclusion
from mainline-v4.20-rc1
commit 8f04e8e6
category: feature
bugzilla: 20806
CVE: NA

-------------------------------------------------

On CPUs with support for PSTATE.SSBS, the kernel can toggle the SSBD
state without needing to call into firmware.

This patch hooks into the existing SSBD infrastructure so that SSBS is
used on CPUs that support it, but it's all made horribly complicated by
the very real possibility of big/little systems that don't uniformly
provide the new capability.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Conflicts:
  arch/arm64/kernel/process.c
  arch/arm64/kernel/ssbd.c
  arch/arm64/kernel/cpufeature.c
[yyl: adjust context]
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

09bb7481

arm64: ssbd: Drop #ifdefs for PR_SPEC_STORE_BYPASS · 1438602c

Authored by Will Deacon on Jun 15, 2018

mainline inclusion
from mainline-v4.20-rc1
commit 2d1b2a91d56b19636b740ea70c8399d1df249f20
category: feature
bugzilla: 20806
CVE: NA

-------------------------------------------------

Now that we're all merged nicely into mainline, there's no need to check
to see if PR_SPEC_STORE_BYPASS is defined.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

1438602c

arm64: cpufeature: Detect SSBS and advertise to userspace · be185032

Authored by Will Deacon on Jun 15, 2018

mainline inclusion
from mainline-v4.20-rc1
commit d71be2b6c0e19180b5f80a6d42039cc074a693a2
category: feature
bugzilla: 20806
CVE: NA

-------------------------------------------------

Armv8.5 introduces a new PSTATE bit known as Speculative Store Bypass
Safe (SSBS) which can be used as a mitigation against Spectre variant 4.

Additionally, a CPU may provide instructions to manipulate PSTATE.SSBS
directly, so that userspace can toggle the SSBS control without trapping
to the kernel.

This patch probes for the existence of SSBS and advertises the new instructions
to userspace if they exist.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Conflicts:
  arch/arm64/kernel/cpufeature.c
  arch/arm64/include/asm/cpucaps.h
[yyl: adjust context]
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

be185032

arm64: Fix silly typo in comment · 9f3c5929

Authored by Will Deacon on Jun 15, 2018

mainline inclusion
from mainline-v4.20-rc1
commit ca7f686a
category: feature
bugzilla: 20806
CVE: NA

-------------------------------------------------

I was passing through and figuered I'd fix this up:

	featuer -> feature

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

9f3c5929

arm64/neon: Disable -Wincompatible-pointer-types when building with Clang · e9fabe0c

Authored by Nathan Chancellor on Aug 31, 2019

mainline inclusion
from mainline-5.0
commit 0738c8b5
category: bugfix
bugzilla: 11024
CVE: NA

-------------------------------------------------

After commit cc9f8349 ("arm64: crypto: add NEON accelerated XOR
implementation"), Clang builds for arm64 started failing with the
following error message.

arch/arm64/lib/xor-neon.c:58:28: error: incompatible pointer types
assigning to 'const unsigned long *' from 'uint64_t *' (aka 'unsigned
long long *') [-Werror,-Wincompatible-pointer-types]
                v3 = veorq_u64(vld1q_u64(dp1 +  6), vld1q_u64(dp2 + 6));
                                         ^~~~~~~~
/usr/lib/llvm-9/lib/clang/9.0.0/include/arm_neon.h:7538:47: note:
expanded from macro 'vld1q_u64'
  __ret = (uint64x2_t) __builtin_neon_vld1q_v(__p0, 51); \
                                              ^~~~

There has been quite a bit of debate and triage that has gone into
figuring out what the proper fix is, viewable at the link below, which
is still ongoing. Ard suggested disabling this warning with Clang with a
pragma so no neon code will have this type of error. While this is not
at all an ideal solution, this build error is the only thing preventing
KernelCI from having successful arm64 defconfig and allmodconfig builds
on linux-next. Getting continuous integration running is more important
so new warnings/errors or boot failures can be caught and fixed quickly.

Link: https://github.com/ClangBuiltLinux/linux/issues/283
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 0738c8b5)
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

e9fabe0c

arm64: crypto: add NEON accelerated XOR implementation · 2b99d865

Authored by Jackie Liu on Aug 31, 2019

mainline inclusion
from mainline-5.0-rc1
commit: cc9f8349
category: feature
feature: NEON accelerated XOR
bugzilla: 11024
CVE: NA

--------------------------------------------------

This is a NEON acceleration method that can improve
performance by approximately 20%. I got the following
data from the centos 7.5 on Huawei's HISI1616 chip:

[ 93.837726] xor: measuring software checksum speed
[ 93.874039]   8regs  : 7123.200 MB/sec
[ 93.914038]   32regs : 7180.300 MB/sec
[ 93.954043]   arm64_neon: 9856.000 MB/sec
[ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec)

I believe this code can bring some optimization to
all arm64 platforms. Thanks to Ard Biesheuvel for his suggestions.
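
For orientation, the operation being accelerated is a word-wise XOR of one buffer into another. The following is only a scalar reference sketch, not the NEON code from the patch; the name `xor_2_scalar` is hypothetical, and it assumes `bytes` is a multiple of `sizeof(unsigned long)`.

```c
#include <stddef.h>

/*
 * Scalar reference (hypothetical name): XOR the contents of p2 into p1,
 * one unsigned long at a time. Assumes bytes is a multiple of
 * sizeof(unsigned long). A NEON version would process wider vectors
 * per iteration, but the result is the same.
 */
void xor_2_scalar(size_t bytes, unsigned long *p1, const unsigned long *p2)
{
	size_t words = bytes / sizeof(unsigned long);

	for (size_t i = 0; i < words; i++)
		p1[i] ^= p2[i];
}
```

Calling `xor_2_scalar(sizeof(a), a, b)` on two equally sized arrays leaves `a[i] ^ b[i]` in `a`.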

Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

2b99d865

arm64/neon: add workaround for ambiguous C99 stdint.h types · 28777f93

Authored by Jackie Liu on Aug 31, 2019

mainline inclusion
from mainline-5.0-rc1
commit: 21e28547
category: feature
feature: NEON accelerated XOR
bugzilla: 11024
CVE: NA

--------------------------------------------------

In a way similar to ARM commit 09096f6a ("ARM: 7822/1: add workaround
for ambiguous C99 stdint.h types"), this patch redefines the macros that
are used in stdint.h so its definitions of uint64_t and int64_t are
compatible with those of the kernel.

This patch comes from: https://patchwork.kernel.org/patch/3540001/
Written by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

We mark this file as a private file so we don't have to override asm/types.h.

Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

28777f93

arm64/lib: improve CRC32 performance for deep pipelines · fce68364

Authored by Ard Biesheuvel on Aug 31, 2019

mainline inclusion
from mainline-5.0
commit: efdb25efc7645b326cd5eb82be5feeabe167c24e
category: perf
bugzilla: 20886
CVE: NA

lib/crc32test result:

[root@localhost build]# rmmod crc32test && insmod lib/crc32test.ko &&
dmesg | grep cycles
[83170.153209] CPU7: use cycles 26243990
[83183.122137] CPU7: use cycles 26151290
[83309.691628] CPU7: use cycles 26122830
[83312.415559] CPU7: use cycles 26232600
[83313.191479] CPU8: use cycles 26082350

rmmod crc32test && insmod lib/crc32test.ko && dmesg | grep cycles
[ 1023.539931] CPU25: use cycles 12256730
[ 1024.850360] CPU24: use cycles 12249680
[ 1025.463622] CPU25: use cycles 12253330
[ 1025.862925] CPU25: use cycles 12269720
[ 1026.376038] CPU26: use cycles 12222480

Based on 13702:
arm64/lib: improve CRC32 performance for deep pipelines
crypto: arm64/crc32 - remove PMULL based CRC32 driver
arm64/lib: add accelerated crc32 routines
arm64: cpufeature: add feature for CRC32 instructions
lib/crc32: make core crc32() routines weak so they can be overridden

----------------------------------------------

Improve the performance of the crc32() asm routines by getting rid of
most of the branches and small sized loads on the common path.

Instead, use a branchless code path involving overlapping 16 byte
loads to process the first (length % 32) bytes, and process the
remainder using a loop that processes 32 bytes at a time.
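
The split described above (a head of `length % 32` bytes, then whole 32-byte blocks) can be sketched in portable C. This is only an illustration of the control flow, assuming a bit-at-a-time CRC update in place of the CRC32 instructions and overlapping 16-byte loads used by the real asm; both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-at-a-time CRC-32 update using the reflected polynomial 0xEDB88320.
 * No pre- or post-inversion is applied; the caller supplies the seed.
 */
uint32_t crc32_le_bytes(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return crc;
}

/*
 * Same result, but structured like the commit text: consume the first
 * (len % 32) bytes, then loop over the remainder 32 bytes at a time.
 */
uint32_t crc32_le_split(uint32_t crc, const unsigned char *p, size_t len)
{
	size_t head = len % 32;

	crc = crc32_le_bytes(crc, p, head);
	p += head;
	len -= head;

	for (; len; len -= 32, p += 32)
		crc = crc32_le_bytes(crc, p, 32);

	return crc;
}
```

After the head is consumed, the remaining length is always a multiple of 32, so the main loop needs no tail handling.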

Tested using the following test program:

  #include <stdlib.h>

  extern void crc32_le(unsigned short, char const*, int);

  int main(void)
  {
    static const char buf[4096];

    srand(20181126);

    for (int i = 0; i < 100 * 1000 * 1000; i++)
      crc32_le(0, buf, rand() % 1024);

    return 0;
  }

On Cortex-A53 and Cortex-A57, the performance regresses but only very
slightly. On Cortex-A72 however, the performance improves from

  $ time ./crc32

  real  0m10.149s
  user  0m10.149s
  sys   0m0.000s

to

  $ time ./crc32

  real  0m7.915s
  user  0m7.915s
  sys   0m0.000s

Cc: Rui Sun <sunrui26@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

fce68364

crypto: arm64/crc32 - remove PMULL based CRC32 driver · 0e584ca5

Authored by Ard Biesheuvel on Aug 31, 2019

mainline inclusion
from mainline-4.20-rc1
commit: 598b7d41
category: feature
feature: accelerated crc32 routines
bugzilla: 13702
CVE: NA

--------------------------------------------------

Now that the scalar fallbacks have been moved out of this driver into
the core crc32()/crc32c() routines, we are left with a CRC32 crypto API
driver for arm64 that is based only on 64x64 polynomial multiplication,
which is an optional instruction in the ARMv8 architecture, and is less
and less likely to be available on cores that do not also implement the
CRC32 instructions, given that those are mandatory in the architecture
as of ARMv8.1.

Since the scalar instructions do not require the special handling that
SIMD instructions do, and since they turn out to be considerably faster
on some cores (Cortex-A53) as well, there is really no point in keeping
this code around so let's just remove it.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0e584ca5

arm64/lib: add accelerated crc32 routines · 11fa09f2

Authored by Ard Biesheuvel on Aug 31, 2019

mainline inclusion
from mainline-4.20-rc1
commit: 7481cddf29ede204b475facc40e6f65459939881
category: feature
feature: accelerated crc32 routines
bugzilla: 13702
CVE: NA

--------------------------------------------------

Unlike crc32c(), which is wired up to the crypto API internally so the
optimal driver is selected based on the platform's capabilities,
crc32_le() is implemented as a library function using a slice-by-8 table
based C implementation. Even though few of the call sites may be
bottlenecks, calling a time variant implementation with a non-negligible
D-cache footprint is a bit of a waste, given that ARMv8.1 and up mandates
support for the CRC32 instructions that were optional in ARMv8.0, but are
already widely available, even on the Cortex-A53 based Raspberry Pi.

So implement routines that use these instructions if available, and fall
back to the existing generic routines otherwise. The selection is based
on alternatives patching.

Note that this unconditionally selects CONFIG_CRC32 as a builtin. Since
CRC32 is relied upon by core functionality such as CONFIG_OF_FLATTREE,
this just codifies the status quo.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

11fa09f2

arm64: cpufeature: add feature for CRC32 instructions · 3a8984a9

Authored by Ard Biesheuvel on Aug 31, 2019

mainline inclusion
from mainline-4.20-rc1
commit: 86d0dd34eafffbc76a81aba6ae2d71927d3835a8
category: feature
feature: accelerated crc32 routines
bugzilla: 13702
CVE: NA

--------------------------------------------------

Add a CRC32 feature bit and wire it up to the CPU id register so we
will be able to use alternatives patching for CRC32 operations.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Conflicts:
	arch/arm64/include/asm/cpucaps.h
	arch/arm64/kernel/cpufeature.c
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3a8984a9

config: enable CONFIG_KTASK in hulk_defconfig and storage_ci_defconfig · 58622cdd

Authored by Hongbo Yao on Aug 30, 2019

hulk inclusion
category: feature
bugzilla: 13228
CVE: NA
---------------------------

Enable CONFIG_KTASK in hulk_defconfig and storage_ci_defconfig.

Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

58622cdd

mm: change locked_vm's type from unsigned long to atomic_long_t · 53f4e528

Authored by Daniel Jordan on Aug 14, 2019

hulk inclusion
category: feature
bugzilla: 13228
CVE: NA
---------------------------

Currently, mmap_sem must be held as writer to modify the locked_vm field
in mm_struct.

This creates a bottleneck when multithreading VFIO page pinning because
each thread holds the mmap_sem as reader for the majority of the pinning
time but also takes mmap_sem as writer regularly, for short times, when
modifying locked_vm.

The problem gets worse when other workloads compete for CPU with ktask
threads doing page pinning because the other workloads force ktask
threads that hold mmap_sem as writer off the CPU, blocking ktask threads
trying to get mmap_sem as reader for an excessively long time (the
mmap_sem reader wait time grows linearly with the thread count).

Requiring mmap_sem for locked_vm also abuses mmap_sem by making it
protect data that could be synchronized separately.

So, decouple locked_vm from mmap_sem by making locked_vm an
atomic_long_t.  locked_vm's old type was unsigned long and changing it
to a signed type makes it lose half its capacity, but that's only a
concern for 32-bit systems and LONG_MAX * PAGE_SIZE is 8T on x86 in that
case, so there's headroom.
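
As a rough userspace illustration of the idea, a lock-free page counter and the 32-bit capacity arithmetic might look like the sketch below. It uses C11 `<stdatomic.h>` in place of the kernel's `atomic_long_t` API, and the names `locked_vm_add` and `locked_vm_capacity_32bit` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the locked_vm field of mm_struct. */
static atomic_long locked_vm;

/*
 * Add npages (negative to subtract) without taking any lock and
 * return the new total, similar in spirit to an atomic add-return.
 */
long locked_vm_add(long npages)
{
	return atomic_fetch_add(&locked_vm, npages) + npages;
}

/*
 * Bytes that a signed 32-bit page counter can describe:
 * LONG_MAX on a 32-bit system (2^31 - 1) times the page size.
 */
uint64_t locked_vm_capacity_32bit(uint64_t page_size)
{
	return (uint64_t)INT32_MAX * page_size;
}
```

With 4 KiB pages the capacity is (2^31 - 1) * 4096 bytes, which is one page short of 8 TiB and matches the 8T figure quoted above.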

Now that mmap_sem is not taken as writer here, ktask threads holding
mmap_sem as reader can run more often.  Performance results appear later
in the series.

On powerpc, this was cross-compiled-tested only.

[XXX Can send separately.]

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Tested-by: Hongbo Yao <yaohongbo@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

53f4e528

arm64: set all the CPU as present for dtb booting with 'CONFIG_ACPI' enabled · 015f6d7e

Authored by Xiongfeng Wang on Aug 28, 2019

hulk inclusion
category: bugfix
bugzilla: 20799
CVE: NA
---------------------------

The patch named in the Fixes tag below did not consider the case where
the system boots from a device tree while 'CONFIG_ACPI' is enabled. In
this case, we also need to mark all CPUs as present.

Fixes: 280637f70ab5 ("arm64: mark all the GICC nodes in MADT as possible cpu")
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

015f6d7e

arm64/numa: Report correct memblock range for the dummy node · 65246909

Authored by Anshuman Khandual on Jul 24, 2019

mainline inclusion
from mainline-v4.20-rc5
commit 77cfe950
category: bugfix
bugzilla: 5611
CVE: NA

The dummy node ID is marked into all memory ranges on the system. So the
dummy node really extends the entire memblock.memory. Hence report correct
extent information for the dummy node using memblock range helper functions
instead of the range [0LLU, PFN_PHYS(max_pfn) - 1)].

Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms")
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

65246909

bpf, arm64: fix getting subprog addr from aux for calls · f7dcdf17

Authored by Daniel Borkmann on Jul 24, 2019

mainline inclusion
from mainline-v4.20-rc5
commit 8c11ea5c
category: bugfix
bugzilla: 5654
CVE: NA

The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF
to BPF call offset can be too far away from core kernel in that relative
encoding into imm is not sufficient and could potentially be truncated,
see also fd045f6c ("arm64: add support for module PLTs") which adds
spill-over space for module_alloc() and therefore bpf_jit_binary_alloc().
Therefore, use the recently added bpf_jit_get_func_addr() helper for
properly fetching the address through prog->aux->func[off]->bpf_func
instead. This also has the benefit to optimize normal helper calls since
their address can use the optimized emission. Tested on Cavium ThunderX
CN8890.

Fixes: db496944 ("bpf: arm64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f7dcdf17

bpf, ppc64: generalize fetching subprog into bpf_jit_get_func_addr · 51778776

Authored by Daniel Borkmann on Jul 24, 2019

mainline inclusion
from mainline-v4.20-rc5
commit e2c95a61
category: bugfix
bugzilla: 5654
CVE: NA

Make fetching of the BPF call address from ppc64 JIT generic. ppc64
was using a slightly different variant rather than through the insns'
imm field encoding as the target address would not fit into that space.
Therefore, the target subprog number was encoded into the insns' offset
and fetched through fp->aux->func[off]->bpf_func instead. Given there
are other JITs with this issue and the mechanism of fetching the address
is JIT-generic, move it into the core as a helper instead. On the JIT
side, we get information on whether the retrieved address is a fixed
one, that is, not changing through JIT passes, or a dynamic one. For
the former, JITs can optimize their imm emission because this doesn't
change jump offsets throughout JIT process.
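
The fixed-versus-dynamic distinction can be modelled in plain C. The sketch below is a simplified userspace model, not the kernel helper itself: the structs, `PSEUDO_CALL`, `model_get_func_addr` and the `call_base` parameter are all hypothetical stand-ins, and the real helper's handling of intermediate JIT passes is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PSEUDO_CALL 1	/* stand-in for the BPF-to-BPF call marker in src_reg */

/* Hypothetical, heavily reduced instruction and program descriptions. */
struct model_insn {
	uint8_t src_reg;
	int16_t off;	/* subprog index for BPF-to-BPF calls */
	int32_t imm;	/* offset from the call base for helper calls */
};

struct model_prog {
	const uint64_t *func_addrs;	/* JITed address of each subprog */
	uint32_t func_cnt;
};

/*
 * Resolve the call target. Helper calls are "fixed": the address is the
 * call base plus imm and does not change between JIT passes. Subprog
 * calls are "dynamic": the address is looked up by index. Returns 0 on
 * success and -1 if the subprog index is out of range.
 */
int model_get_func_addr(const struct model_prog *prog,
			const struct model_insn *insn,
			uint64_t call_base, uint64_t *addr, bool *fixed)
{
	*fixed = insn->src_reg != PSEUDO_CALL;
	if (*fixed) {
		*addr = call_base + (int64_t)insn->imm;
		return 0;
	}

	if (!prog->func_addrs || insn->off < 0 ||
	    (uint32_t)insn->off >= prog->func_cnt)
		return -1;

	*addr = prog->func_addrs[insn->off];
	return 0;
}
```

A JIT back end could use the `fixed` flag to choose a shorter immediate encoding when the address is known not to move.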

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Tested-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

51778776

pcie: enable hisi pcie dfx driver · 3cbc9ab3

Authored by liuyanshi on Aug 23, 2019

driver inclusion
category: feature
bugzilla: NA
CVE: NA

Enable the hisi pcie debug driver for hiarmtool.

Signed-off-by: liuyanshi <liuyanshi@huawei.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3cbc9ab3

arm64: Add CPU hotplug support · 2875d85b

Authored by Xiongfeng Wang on Aug 22, 2019

hulk inclusion
category: feature
bugzilla: 20208
CVE: NA
---------------------------

To support CPU hotplug, we need to implement 'acpi_(un)map_cpu()' and
'arch_(un)register_cpu()' for ARM64. These functions are called in
'acpi_processor_hotadd_init()/acpi_processor_remove()' when the CPU is hot
added into or hot removed from the system.

Note: This patch only supports core hotplug and does not support socket
hotplug because we don't support live configuration of the GIC.

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

2875d85b

arm64: mark all the GICC nodes in MADT as possible cpu · fbcc048a

Authored by Xiongfeng Wang on Aug 22, 2019

hulk inclusion
category: feature
bugzilla: 20208
CVE: NA
---------------------------

We set 'cpu_possible_mask' based on the enabled GICC node in MADT. If
the GICC node is disabled, we will skip initializing the kernel data
structure for that CPU.

To support CPU hotplug, we need to initialize some CPU-related data
structures in advance. This patch marks all the GICC nodes as possible
CPUs and only the enabled GICC nodes as present CPUs.
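
The resulting policy can be sketched with two bitmasks. This is a toy model only: `struct gicc_entry`, `struct cpu_masks` and `scan_gicc` are hypothetical, the entry layout is reduced to the flags word, and `MADT_ENABLED` is assumed to correspond to bit 0 of the GICC flags.

```c
#include <stddef.h>
#include <stdint.h>

#define MADT_ENABLED 1u	/* assumed: bit 0 of the GICC flags means "enabled" */

/* Hypothetical, heavily reduced view of a MADT GICC entry. */
struct gicc_entry {
	uint32_t flags;
};

struct cpu_masks {
	uint64_t possible;	/* every CPU described by a GICC entry */
	uint64_t present;	/* only CPUs whose entry is enabled */
};

/* Walk up to 64 GICC entries and build both masks. */
struct cpu_masks scan_gicc(const struct gicc_entry *e, size_t n)
{
	struct cpu_masks m = {0, 0};

	for (size_t cpu = 0; cpu < n && cpu < 64; cpu++) {
		m.possible |= 1ULL << cpu;
		if (e[cpu].flags & MADT_ENABLED)
			m.present |= 1ULL << cpu;
	}
	return m;
}
```

A disabled entry still gets a bit in `possible`, which is what lets per-CPU data be set up before the CPU is hot-added.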

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

fbcc048a

KVM: Fix leak vCPU's VMCS value into other pCPU · 867fd9a1

Authored by Wanpeng Li on Aug 05, 2019

commit 17e433b54393a6269acbcb792da97791fe1592d8 upstream.

After commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts), a
five-year-old bug is exposed. When running the ebizzy benchmark in three
80-vCPU VMs on one 80-pCPU Skylake server, many rcu_sched stall warnings
appear in the VMs after stress testing:

 INFO: rcu_sched detected stalls on CPUs/tasks: { 4 41 57 62 77} (detected by 15, t=60004 jiffies, g=899, c=898, q=15073)
 Call Trace:
   flush_tlb_mm_range+0x68/0x140
   tlb_flush_mmu.part.75+0x37/0xe0
   tlb_finish_mmu+0x55/0x60
   zap_page_range+0x142/0x190
   SyS_madvise+0x3cd/0x9c0
   system_call_fastpath+0x1c/0x21

swait_active() remains true until finish_swait() is called in
kvm_vcpu_block(), so voluntarily preempted vCPUs are taken into account
by the kvm_vcpu_on_spin() loop. This greatly increases the probability
that kvm_arch_vcpu_runnable(vcpu) is checked and returns true. When APICv
is enabled, the yield-candidate vCPU's VMCS RVI field then leaks (via
vmx_sync_pir_to_irr()) into the current VMCS of the vCPU that is
spinning on a taken lock.

This patch fixes it by checking conservatively a subset of events.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Marc Zyngier <Marc.Zyngier@arm.com>
Cc: stable@vger.kernel.org
Fixes: 98f4a146 (KVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop)
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

867fd9a1

x86/purgatory: Do not use __builtin_memcpy and __builtin_memset · f01cb9fc

Authored by Nick Desaulniers on Aug 07, 2019

commit 4ce97317f41d38584fb93578e922fcd19e535f5b upstream.

Implementing memcpy and memset in terms of __builtin_memcpy and
__builtin_memset is problematic.

GCC at -O2 will replace calls to the builtins with calls to memcpy and
memset (but will generate an inline implementation at -Os).  Clang will
replace the builtins with these calls regardless of optimization level.

$ llvm-objdump -dr arch/x86/purgatory/string.o | tail

0000000000000339 memcpy:
     339: 48 b8 00 00 00 00 00 00 00 00 movabsq $0, %rax
                000000000000033b:  R_X86_64_64  memcpy
     343: ff e0                         jmpq    *%rax

0000000000000345 memset:
     345: 48 b8 00 00 00 00 00 00 00 00 movabsq $0, %rax
                0000000000000347:  R_X86_64_64  memset
     34f: ff e0

Such code results in infinite recursion at runtime. This is observed
when doing kexec.

Instead, reuse an implementation from arch/x86/boot/compressed/string.c.
This requires implementing a stub function for warn(). Also, Clang may
lower memcmp's that compare against 0 to bcmp's, so add a small definition,
too. See also: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
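
To make the contrast concrete, byte-wise routines that never call a builtin might look like the sketch below. This is not the code from arch/x86/boot/compressed/string.c; the `plain_` names are hypothetical, chosen to avoid clashing with libc in a userspace build. Note that a compiler may still turn such loops back into library calls unless the file is built freestanding, which is an assumption about the build flags rather than something shown here.

```c
#include <stddef.h>

/* Copy n bytes from src to dst, one byte at a time. */
void *plain_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Fill n bytes at dst with the byte value c. */
void *plain_memset(void *dst, int c, size_t n)
{
	unsigned char *d = dst;

	while (n--)
		*d++ = (unsigned char)c;
	return dst;
}

/* Return 0 if the two regions are equal, non-zero otherwise. */
int plain_bcmp(const void *a, const void *b, size_t n)
{
	const unsigned char *x = a, *y = b;

	while (n--)
		if (*x++ != *y++)
			return 1;
	return 0;
}
```

Unlike memcmp, the bcmp-style routine only reports equality, so it does not need to say which region sorts first.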

Fixes: 8fc5b4d4 ("purgatory: core purgatory functionality")
Reported-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Debugged-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Debugged-by: Manoj Gupta <manojgupta@google.com>
Suggested-by: Alistair Delva <adelva@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Cc: stable@vger.kernel.org
Link: https://bugs.chromium.org/p/chromium/issues/detail?id=984056
Link: https://lkml.kernel.org/r/20190807221539.94583-1-ndesaulniers@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f01cb9fc

s390/dma: provide proper ARCH_ZONE_DMA_BITS value · f60afbe9

Authored by Halil Pasic on Jul 24, 2019

[ Upstream commit 1a2dcff8 ]

On s390 ZONE_DMA is up to 2G, i.e. ARCH_ZONE_DMA_BITS should be 31 bits.
The current value is 24 and makes __dma_direct_alloc_pages() take a
wrong turn first (but __dma_direct_alloc_pages() recovers then).

Let's correct ARCH_ZONE_DMA_BITS value and avoid wrong turns.
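
The arithmetic behind the two values is easy to check. The sketch below defines a mask macro in the same shape as the kernel's `DMA_BIT_MASK(n)` and a hypothetical helper, `zone_dma_size`, that converts a bit count into the zone size in bytes.

```c
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n): the low n bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Size in bytes of a zone covering addresses [0, 2^bits).
 * Hypothetical helper; only meaningful for bits < 64.
 */
uint64_t zone_dma_size(unsigned int bits)
{
	return (uint64_t)DMA_BIT_MASK(bits) + 1;
}
```

With 31 bits the zone spans 2 GiB, while the old value of 24 bits spans only 16 MiB.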

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Petr Tesarik <ptesarik@suse.cz>
Fixes: c61e9637 ("dma-direct: add support for allocation from ZONE_DMA and ZONE_DMA32")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f60afbe9

ARM: dts: bcm: bcm47094: add missing #cells for mdio-bus-mux · bbe0fb8a

Authored by Arnd Bergmann on Jul 22, 2019

[ Upstream commit 3a9d2569 ]

The mdio-bus-mux has no #address-cells/#size-cells property,
which causes a few dtc warnings:

arch/arm/boot/dts/bcm47094-linksys-panamera.dts:129.4-18: Warning (reg_format): /mdio-bus-mux/mdio@200:reg: property has invalid length (4 bytes) (#address-cells == 2, #size-cells == 1)
arch/arm/boot/dts/bcm47094-linksys-panamera.dtb: Warning (pci_device_bus_num): Failed prerequisite 'reg_format'
arch/arm/boot/dts/bcm47094-linksys-panamera.dtb: Warning (i2c_bus_reg): Failed prerequisite 'reg_format'
arch/arm/boot/dts/bcm47094-linksys-panamera.dtb: Warning (spi_bus_reg): Failed prerequisite 'reg_format'
arch/arm/boot/dts/bcm47094-linksys-panamera.dts:128.22-132.5: Warning (avoid_default_addr_size): /mdio-bus-mux/mdio@200: Relying on default #address-cells value
arch/arm/boot/dts/bcm47094-linksys-panamera.dts:128.22-132.5: Warning (avoid_default_addr_size): /mdio-bus-mux/mdio@200: Relying on default #size-cells value

Add the normal cell numbers.

Link: https://lore.kernel.org/r/20190722145618.1155492-1-arnd@arndb.de
Fixes: 2bebdfcd ("ARM: dts: BCM5301X: Add support for Linksys EA9500")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

bbe0fb8a

ARM: davinci: fix sleep.S build error on ARMv4 · fbaa1bfb

Authored by Arnd Bergmann on Jul 22, 2019

[ Upstream commit d64b212e ]

When building a multiplatform kernel that includes armv4 support,
the default target CPU does not support the blx instruction,
which leads to a build failure:

arch/arm/mach-davinci/sleep.S: Assembler messages:
arch/arm/mach-davinci/sleep.S:56: Error: selected processor does not support `blx ip' in ARM mode

Add a .arch statement in the sources to make this file build.

Link: https://lore.kernel.org/r/20190722145211.1154785-1-arnd@arndb.de
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

fbaa1bfb

x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS · 4967fd33

Authored by Nick Desaulniers on Aug 07, 2019

commit b059f801a937d164e03b33c1848bb3dca67c0b04 upstream.

KBUILD_CFLAGS is very carefully built up in the top level Makefile,
particularly when cross compiling or using different build tools.
Resetting KBUILD_CFLAGS via := assignment is an antipattern.

The comment above the reset mentions that -pg is problematic. Other
Makefiles use `CFLAGS_REMOVE_file.o = $(CC_FLAGS_FTRACE)` when
CONFIG_FUNCTION_TRACER is set. Prefer that pattern to wiping out all of
the important KBUILD_CFLAGS then manually having to re-add them. Seems
also that __stack_chk_fail references are generated when using
CONFIG_STACKPROTECTOR or CONFIG_STACKPROTECTOR_STRONG.

Fixes: 8fc5b4d4 ("purgatory: core purgatory functionality")
Reported-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190807221539.94583-2-ndesaulniers@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4967fd33

x86/mm: Sync also unmappings in vmalloc_sync_all() · 220dc4c0

Authored by Joerg Roedel on Jul 19, 2019

commit 8e998fc24de47c55b47a887f6c95ab91acd4a720 upstream.

With huge-page ioremap areas the unmappings also need to be synced between
all page-tables. Otherwise it can cause data corruption when a region is
unmapped and later re-used.

Make the vmalloc_sync_one() function ready to sync unmappings and make sure
vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD
is found.

Fixes: 5d72b4fb ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

220dc4c0

x86/mm: Check for pfn instead of page in vmalloc_sync_one() · 4a314305

Authored by Joerg Roedel on Jul 19, 2019

commit 51b75b5b563a2637f9d8dc5bd02a31b2ff9e5ea0 upstream.

Do not require a struct page for the mapped memory location because it
might not exist. This can happen when an ioremapped region is mapped with
2MB pages.

Fixes: 5d72b4fb ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4a314305

config: disable CONFIG_ISCSI_IBFT by default · c5f7f11c

Committed by Yang Yingliang on Aug 20, 2019

hulk inclusion
category: bugfix
bugzilla: 4979
CVE: NA

-------------------------------------------------

Disable CONFIG_ISCSI_IBFT by default.

c5f7f11c

arm_pmu: acpi: spe: Add initial MADT/SPE probing · 09436382

Committed by Jeremy Linton on Aug 19, 2019

mainline inclusion
from mainline-5.3-rc1
commit d24a0c7099b3
category: feature
bugzilla: 16072
CVE: NA
---------------------------

ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPI's. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.

Tested-by: Hanjun Gou <gouhanjun@huawei.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

09436382

Revert "arm_pmu: acpi: spe: Add initial MADT/SPE probing" · d44538a7

Committed by Hongbo Yao on Aug 19, 2019

hulk inclusion
category: feature
bugzilla: 16072
CVE: NA
---------------------------

This reverts commit 556b16f5ad7e910c3784bb02b33c2af6ca9c9a4b.
In Linux 5.3.0, SPE ACPI enablement has been upstreamed. SPE patches
in hulk-4.19 are the old version, and they need to be reverted
to the mainline version.

Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d44538a7

arm64: tlbflush: Ensure start/end of address range are aligned to stride · bcfdec50

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-5.2
commit: 01d57485fcdb9f9101a10a18e32d5f8b023cab86
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

Since commit 3d65b6bbc01e ("arm64: tlbi: Set MAX_TLBI_OPS to
PTRS_PER_PTE"), we resort to per-ASID invalidation when attempting to
perform more than PTRS_PER_PTE invalidation instructions in a single
call to __flush_tlb_range(). Whilst this is beneficial, the mmu_gather
code does not ensure that the end address of the range is rounded-up
to the stride when freeing intermediate page tables in pXX_free_tlb(),
which defeats our range checking.

Align the bounds passed into __flush_tlb_range().
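
The alignment described above can be illustrated with a small user-space sketch. This is not the kernel patch itself: the helper names `align_down`, `align_up` and `tlbi_ops_for_range` are hypothetical, and the stride is assumed to be a power of two.

```c
#include <stdint.h>

/* Hypothetical helper: round addr down to a power-of-two stride. */
static inline uint64_t align_down(uint64_t addr, uint64_t stride)
{
	return addr & ~(stride - 1);
}

/* Hypothetical helper: round addr up to a power-of-two stride. */
static inline uint64_t align_up(uint64_t addr, uint64_t stride)
{
	return (addr + stride - 1) & ~(stride - 1);
}

/*
 * Number of per-stride invalidations needed to cover [start, end)
 * once both bounds have been aligned to the stride.
 */
static inline uint64_t tlbi_ops_for_range(uint64_t start, uint64_t end,
					  uint64_t stride)
{
	start = align_down(start, stride);
	end = align_up(end, stride);
	return (end - start) / stride;
}
```

With a 2MB stride, a range that ends only one 4k page past a 2MB boundary is rounded up to the next boundary, so any size check against the stride sees a whole number of stride-sized units.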

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

bcfdec50

arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE · c19a7104

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-4.21
commit: 3d65b6bbc01ecece8142e62a8a5f1d48ba41a240
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

In order to reduce the possibility of soft lock-ups, we bound the
maximum number of TLBI operations performed by a single call to
flush_tlb_range() to an arbitrary constant of 1024.

Whilst this does the job of avoiding lock-ups, we can actually be a bit
smarter by defining this as PTRS_PER_PTE. Due to the structure of our
page tables, using PTRS_PER_PTE means that an outer loop calling
flush_tlb_range() for entire table entries will end up performing just a
single TLBI operation for each entry. As an example, mremap()ing a 1GB
range mapped using 4k pages now requires only 512 TLBI operations when
moving the page tables as opposed to 262144 operations (512*512) when
using the current threshold of 1024.
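
The arithmetic in the example above can be checked with a rough user-space model. It assumes 4k pages with 512 entries per PTE table, an outer loop that flushes one full PTE table (2MB) per call, and a fallback to a single per-ASID invalidation once the page count reaches the threshold. All `SKETCH_*` names and both functions are hypothetical and are not kernel identifiers.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL
#define SKETCH_PTRS_PER_PTE	512ULL
#define SKETCH_PMD_SIZE		(SKETCH_PAGE_SIZE * SKETCH_PTRS_PER_PTE)

/*
 * TLBI operations issued by one modelled flush call covering `bytes`.
 * Assumption: once the page count reaches max_ops, the call falls back
 * to a single per-ASID invalidation instead of one TLBI per page.
 */
static uint64_t tlbi_ops_per_call(uint64_t bytes, uint64_t max_ops)
{
	uint64_t pages = bytes / SKETCH_PAGE_SIZE;

	return pages >= max_ops ? 1 : pages;
}

/* Total operations when the caller flushes one PTE table per call. */
static uint64_t tlbi_ops_total(uint64_t range_bytes, uint64_t max_ops)
{
	uint64_t calls = range_bytes / SKETCH_PMD_SIZE;

	return calls * tlbi_ops_per_call(SKETCH_PMD_SIZE, max_ops);
}
```

Under these assumptions, a 1GB range gives 512 calls. A threshold of 1024 leaves each call issuing 512 per-page operations (262144 in total), while a threshold of `SKETCH_PTRS_PER_PTE` collapses each call to one operation (512 in total).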

Cc: Joel Fernandes <joel@joelfernandes.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

c19a7104

arm64: mm: Don't wait for completion of TLB invalidation when page aging · 0cd4b474

Committed by Alex Van Brunt on Aug 13, 2019

mainline inclusion
from mainline-4.21
commit: 3403e56b
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

When transitioning a PTE from young to old as part of page aging, we
can avoid waiting for the TLB invalidation to complete and therefore
drop the subsequent DSB instruction. Whilst this opens up a race with
page reclaim, where a PTE in active use via a stale, young TLB entry
does not update the underlying descriptor, the worst thing that happens
is that the page is reclaimed and then immediately faulted back in.

Given that we have a DSB in our context-switch path, the window for a
spurious reclaim is fairly limited and eliding the barrier claims to
boost NVMe/SSD accesses by over 10% on some platforms.

A similar optimisation was made for x86 in commit b13b1d2d ("x86/mm:
In the PTE swapout page reclaim case clear the accessed bit instead of
flushing the TLB").

Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
[will: rewrote patch]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0cd4b474

arm64: tlb: Rewrite stale comment in asm/tlbflush.h · d45ff356

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-4.20-rc1
commit: 7f08872774eb971693ba79eeb2d4db364c9f5bfb
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

Peter Z asked me to justify the barrier usage in asm/tlbflush.h, but
actually that whole block comment needs to be rewritten.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d45ff356

arm64: tlb: Avoid synchronous TLBIs when freeing page tables · f51b9fe4

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-4.20-rc1
commit: ace8cb754539077ed75f3f15b77b2b51b5b7a431
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

By selecting HAVE_RCU_TABLE_INVALIDATE, we can rely on tlb_flush() being
called if we fail to batch table pages for freeing. This in turn allows
us to postpone walk-cache invalidation until tlb_finish_mmu(), which
avoids lots of unnecessary DSBs and means we can shoot down the ASID if
the range is large enough.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f51b9fe4

arm64: tlb: Adjust stride and type of TLBI according to mmu_gather · 0943e8a3

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-4.20-rc1
commit: f270ab88fdf205be1a7a46ccb61f4a343be543a2
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

Now that the core mmu_gather code keeps track of both the levels of page
table cleared and also whether or not these entries correspond to
intermediate entries, we can use this in our tlb_flush() callback to
reduce the number of invalidations we issue as well as their scope.
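
As a rough illustration of the idea, the sketch below models how a flush callback might derive the invalidation stride and scope from the tracked state. The `gather_state` struct, its field names and the `SKETCH_*` shifts (4k pages assumed) are hypothetical stand-ins for this example, not the kernel's `mmu_gather` definition.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT	12	/* 4k pages (assumption) */
#define SKETCH_PMD_SHIFT	21	/* 2MB per PMD entry */
#define SKETCH_PUD_SHIFT	30	/* 1GB per PUD entry */

/* Hypothetical model of the state tracked while unmapping. */
struct gather_state {
	bool cleared_ptes;	/* last-level entries were cleared */
	bool cleared_pmds;	/* PMD-level entries were cleared */
	bool cleared_puds;	/* PUD-level entries were cleared */
	bool freed_tables;	/* intermediate page tables were freed */
};

/* Pick the smallest level that was cleared as the invalidation stride. */
static unsigned int unmap_shift(const struct gather_state *g)
{
	if (g->cleared_ptes)
		return SKETCH_PAGE_SHIFT;
	if (g->cleared_pmds)
		return SKETCH_PMD_SHIFT;
	if (g->cleared_puds)
		return SKETCH_PUD_SHIFT;
	return SKETCH_PAGE_SHIFT;	/* conservative default */
}

/* Leaf-only invalidation is enough when no tables were freed. */
static bool last_level_only(const struct gather_state *g)
{
	return !g->freed_tables;
}
```

In this model, unmapping only 2MB block mappings yields a 2MB stride with leaf-only invalidation, whereas freeing a page table forces the wider, non-leaf form.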

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0943e8a3

arm64: tlb: Remove redundant !CONFIG_HAVE_RCU_TABLE_FREE code · 37a73907

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-4.20-rc1
commit: 07212cd47efef1df971e2289a89c087a5f6f6a2b
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

If there's one thing the RCU-based table freeing doesn't need, it's more
ifdeffery.

Remove the redundant !CONFIG_HAVE_RCU_TABLE_FREE code, since this option
is unconditionally selected in our Kconfig.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

37a73907

arm64: tlbflush: Allow stride to be specified for __flush_tlb_range() · a730f628

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-4.20-rc1
commit: 67a902ac598dca056366a7342f401aa6f605072f
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

When we are unmapping intermediate page-table entries or huge pages, we
don't need to issue a TLBI instruction for every PAGE_SIZE chunk in the
VA range being unmapped.

Allow the invalidation stride to be passed to __flush_tlb_range(), and
adjust our "just nuke the ASID" heuristic to take this into account.
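
A minimal sketch of the stride-aware heuristic, as a user-space model rather than the patch itself. The `flush_plan` struct, `plan_flush()` and the threshold `SKETCH_MAX_TLBI_OPS` (set here to 1024) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_TLBI_OPS	1024ULL	/* assumed threshold */

/* Hypothetical result type: what the modelled flush would do. */
struct flush_plan {
	bool whole_asid;	/* true: invalidate the entire ASID */
	uint64_t tlbi_ops;	/* number of invalidation operations */
};

/*
 * Decide how to invalidate [start, end) given the stride, i.e. the
 * amount of address space covered by each unmapped entry.
 */
static struct flush_plan plan_flush(uint64_t start, uint64_t end,
				    uint64_t stride)
{
	struct flush_plan plan = { false, 0 };

	/* Too many operations for this stride: nuke the ASID instead. */
	if ((end - start) > SKETCH_MAX_TLBI_OPS * stride) {
		plan.whole_asid = true;
		plan.tlbi_ops = 1;
		return plan;
	}

	/* One operation per stride-sized chunk, rounding up. */
	plan.tlbi_ops = (end - start + stride - 1) / stride;
	return plan;
}
```

In this model, unmapping one 2MB huge page costs 512 operations with a 4k stride but a single operation with a 2MB stride, and an 8MB range that would exceed the threshold at 4k granularity needs only four operations at 2MB granularity.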

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

a730f628

arm64: tlb: Justify non-leaf invalidation in flush_tlb_range() · fe23372e

Committed by Will Deacon on Aug 13, 2019

mainline inclusion
from mainline-4.20-rc1
commit: d8289d3a5854a2a0ae144bff106a78738fe63050
category: feature
feature: Reduce synchronous TLB invalidation on ARM64
bugzilla: NA
CVE: NA

--------------------------------------------------

Add a comment to explain why we can't get away with last-level
invalidation in flush_tlb_range().

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

fe23372e

openeuler / Kernel · synced successfully over 1 year ago