- 01 6月, 2018 5 次提交
-
-
由 Marc Zyngier 提交于
We're about to need the mitigation state in various parts of the kernel in order to do the right thing for userspace and guests. Let's expose an accessor that will let other subsystems know about the state. Reviewed-by: NJulien Grall <julien.grall@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Marc Zyngier 提交于
On a system where the firmware implements ARCH_WORKAROUND_2, it may be useful to either permanently enable or disable the workaround for cases where the user decides that they'd rather not get a trap overhead, and keep the mitigation permanently on or off instead of switching it on exception entry/exit. In any case, default to the mitigation being enabled. Reviewed-by: NJulien Grall <julien.grall@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Marc Zyngier 提交于
As for Spectre variant-2, we rely on SMCCC 1.1 to provide the discovery mechanism for detecting the SSBD mitigation. A new capability is also allocated for that purpose, and a config option. Reviewed-by: NJulien Grall <julien.grall@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Marc Zyngier 提交于
In a heterogeneous system, we can end up with both affected and unaffected CPUs. Let's check their status before calling into the firmware. Reviewed-by: NJulien Grall <julien.grall@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Marc Zyngier 提交于
In order for the kernel to protect itself, let's call the SSBD mitigation implemented by the higher exception level (either hypervisor or firmware) on each transition between userspace and kernel. We must take the PSCI conduit into account in order to target the right exception level, hence the introduction of a runtime patching callback. Reviewed-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NJulien Grall <julien.grall@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 23 5月, 2018 3 次提交
-
-
由 Mark Rutland 提交于
In do_page_fault(), we handle some kernel faults early, and simply die() with a message. For faults handled later, we dump the faulting address, decode the ESR, walk the page tables, and perform a number of steps to ensure that this data is reported. Let's unify the handling of fatal kernel faults with a new die_kernel_fault() helper, handling all of these details. This is largely the same as the existing logic in __do_kernel_fault(), except that addresses are consistently padded to 16 hex characters, as would be expected for a 64-bit address. The messages currently logged in do_page_fault are adjusted to fit into the die_kernel_fault() message template. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Mark Rutland 提交于
The naming of is_permission_fault() makes it sound like it should return true for permission faults from EL0, but by design, it only does so for faults from EL1. Let's make this clear by dropping el1 in the name, as we do for is_el1_instruction_abort(). Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
Now that we're seeing CPUs shipping with LSE atomics, default them to 'on' in Kconfig. CPUs without the instructions will continue to use LDXR/STXR-based sequences, but they will be placed out-of-line by the compiler. Acked-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 18 5月, 2018 8 次提交
-
-
由 Dave Martin 提交于
Writes to ZCR_EL1 are self-synchronising, and so may be expensive in typical implementations. This patch adopts the approach used for costly system register writes elsewhere in the kernel: the system register write is suppressed if it would not change the stored value. Since the common case will be that of switching between tasks that use the same vector length as one another, prediction hit rates on the conditional branch should be reasonably good, with lower expected amortised cost than the unconditional execution of a heavyweight self-synchronising instruction. Signed-off-by: NDave Martin <Dave.Martin@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jeremy Linton 提交于
Now that we have an accurate view of the physical topology we need to represent it correctly to the scheduler. Generally MC should equal the LLC in the system, but there are a number of special cases that need to be dealt with. In the case of NUMA in socket, we need to assure that the sched domain we build for the MC layer isn't larger than the DIE above it. Similarly for LLC's that might exist in cross socket interconnect or directory hardware we need to assure that MC is shrunk to the socket or NUMA node. This patch builds a sibling mask for the LLC, and then picks the smallest of LLC, socket siblings, or NUMA node siblings, which gives us the behavior described above. This is ever so slightly different than the similar alternative where we look for a cache layer less than or equal to the socket/NUMA siblings. The logic to pick the MC layer affects all arm64 machines, but only changes the behavior for DT/MPIDR systems if the NUMA domain is smaller than the core siblings (generally set to the cluster). Potentially this fixes a possible bug in DT systems, but really it only affects ACPI systems where the core siblings is correctly set to the socket siblings. Thus all currently available ACPI systems should have MC equal to LLC, including the NUMA in socket machines where the LLC is partitioned between the NUMA nodes. Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: NVijaya Kumar K <vkilari@codeaurora.org> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: NTomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NMorten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jeremy Linton 提交于
Propagate the topology information from the PPTT tree to the cpu_topology array. We can get the thread id and core_id by assuming certain levels of the PPTT tree correspond to those concepts. The package_id is flagged in the tree and can be found by calling find_acpi_cpu_topology_package() which terminates its search when it finds an ACPI node flagged as the physical package. If the tree doesn't contain enough levels to represent all of the requested levels then the root node will be returned for all subsequent levels. Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: NVijaya Kumar K <vkilari@codeaurora.org> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: NTomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NMorten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jeremy Linton 提交于
The cluster concept isn't architecturally defined for arm64. Lets match the name of the arm64 topology field to the kernel macro that uses it. Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: NVijaya Kumar K <vkilari@codeaurora.org> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: NTomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NMorten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jeremy Linton 提交于
The /sys cache entries should support ACPI/PPTT generated cache topology information. For arm64, if ACPI is enabled, determine the max number of cache levels and populate them using the PPTT table if one is available. Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: NVijaya Kumar K <vkilari@codeaurora.org> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: NTomasz Nowicki <Tomasz.Nowicki@cavium.com> Reviewed-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jeremy Linton 提交于
Now that we have a PPTT parser, in preparation for its use on arm64, lets build it. Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: NVijaya Kumar K <vkilari@codeaurora.org> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: NTomasz Nowicki <Tomasz.Nowicki@cavium.com> Reviewed-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jeremy Linton 提交于
Its helpful to be able to lookup the acpi_processor_id associated with a logical cpu. Provide an arm64 helper to do this. Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: NVijaya Kumar K <vkilari@codeaurora.org> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: NTomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jeremy Linton 提交于
The original intent in cacheinfo was that an architecture specific populate_cache_leaves() would probe the hardware and then cache_shared_cpu_map_setup() and cache_override_properties() would provide firmware help to extend/expand upon what was probed. Arm64 was really the only architecture that was working this way, and with the removal of most of the hardware probing logic it became clear that it was possible to simplify the logic a bit. This patch combines the walk of the DT nodes with the code updating the cache size/line_size and nr_sets. cache_override_properties() (which was DT specific) is then removed. The result is that cacheinfo.of_node is no longer used as a temporary place to hold DT references for future calls that update cache properties. That change helps to clarify its one remaining use (matching cacheinfo nodes that represent shared caches) which will be used by the ACPI/PPTT code in the following patches. Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: NVijaya Kumar K <vkilari@codeaurora.org> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: NTomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: NSudeep Holla <sudeep.holla@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 16 5月, 2018 4 次提交
-
-
由 Will Deacon 提交于
When waiting for a cacheline to change state in cmpwait, we may immediately wake-up the first time around the outer loop if the event register was already set (for example, because of the event stream). Avoid these spurious wakeups by explicitly clearing the event register before loading the cacheline and setting the exclusive monitor. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Robin Murphy 提交于
It is probably safe to assume that all Armv8-A implementations have a multiplier whose efficiency is comparable or better than a sequence of three or so register-dependent arithmetic instructions. Select ARCH_HAS_FAST_MULTIPLIER to get ever-so-slightly nicer codegen in the few dusty old corners which care. In a contrived benchmark calling hweight64() in a loop, this does indeed turn out to be a small win overall, with no measurable impact on Cortex-A57 but about 5% performance improvement on Cortex-A53. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NRobin Murphy <robin.murphy@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Vincenzo Frascino 提交于
"make includecheck" detected few duplicated includes in arch/arm64. This patch removes the double inclusions. Signed-off-by: NVincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Masahiro Yamada 提交于
VMLINUX_SYMBOL() is no-op unless CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX is defined. It has ever been selected only by BLACKFIN and METAG. VMLINUX_SYMBOL() is unneeded for ARM64-specific code. Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 15 5月, 2018 1 次提交
-
-
由 Catalin Marinas 提交于
This patch increases the ARCH_DMA_MINALIGN to 128 so that it covers the currently known Cache Writeback Granule (CTR_EL0.CWG) on arm64 and moves the fallback in cache_line_size() from L1_CACHE_BYTES to this constant. In addition, it warns (and taints) if the CWG is larger than ARCH_DMA_MINALIGN as this is not safe with non-coherent DMA. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 11 5月, 2018 1 次提交
-
-
由 Catalin Marinas 提交于
This reverts commit 97303480. Commit 97303480 ("arm64: Increase the max granular size") increased the cache line size to 128 to match Cavium ThunderX, apparently for some performance benefit which could not be confirmed. This change, however, has an impact on the network packet allocation in certain circumstances, requiring slightly over a 4K page with a significant performance degradation. The patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache line). Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 06 5月, 2018 1 次提交
-
-
由 Anthoine Bourgeois 提交于
Since the commit "8003c9ae: add APIC Timer periodic/oneshot mode VMX preemption timer support", a Windows 10 guest has some erratic timer spikes. Here the results on a 150000 times 1ms timer without any load: Before 8003c9ae | After 8003c9ae Max 1834us | 86000us Mean 1100us | 1021us Deviation 59us | 149us Here the results on a 150000 times 1ms timer with a cpu-z stress test: Before 8003c9ae | After 8003c9ae Max 32000us | 140000us Mean 1006us | 1997us Deviation 140us | 11095us The root cause of the problem is starting hrtimer with an expiry time already in the past can take more than 20 milliseconds to trigger the timer function. It can be solved by forward such past timers immediately, rather than submitting them to hrtimer_start(). In case the timer is periodic, update the target expiration and call hrtimer_start with it. v2: Check if the tsc deadline is already expired. Thank you Mika. v3: Execute the past timers immediately rather than submitting them to hrtimer_start(). v4: Rearm the periodic timer with advance_periodic_target_expiration() a simpler version of set_target_expiration(). Thank you Paolo. Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NAnthoine Bourgeois <anthoine.bourgeois@blade-group.com> 8003c9ae ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support") Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 04 5月, 2018 2 次提交
-
-
由 James Morse 提交于
Proxying the cpuif accesses at EL2 makes use of vcpu_data_guest_to_host and co, which check the endianness, which call into vcpu_read_sys_reg... which isn't mapped at EL2 (it was inlined before, and got moved OoL with the VHE optimizations). The result is of course a nice panic. Let's add some specialized cruft to keep the broken platforms that require this hack alive. But, this code used vcpu_data_guest_to_host(), which expected us to write the value to host memory, instead we have trapped the guest's read or write to an mmio-device, and are about to replay it using the host's readl()/writel() which also perform swabbing based on the host endianness. This goes wrong when both host and guest are big-endian, as readl()/writel() will undo the guest's swabbing, causing the big-endian value to be written to device-memory. What needs doing? A big-endian guest will have pre-swabbed data before storing, undo this. If its necessary for the host, writel() will re-swab it. For a read a big-endian guest expects to swab the data after the load. The hosts's readl() will correct for host endianness, giving us the device-memory's value in the register. For a big-endian guest, swab it as if we'd only done the load. For a little-endian guest, nothing needs doing as readl()/writel() leave the correct device-memory value in registers. Tested on Juno with that rarest of things: a big-endian 64K host. Based on a patch from Marc Zyngier. Reported-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Fixes: bf8feb39 ("arm64: KVM: vgic-v2: Add GICV access from HYP") Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 James Morse 提交于
A typo in kvm_vcpu_set_be()'s call: | vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr) causes us to use the 32bit register value as an index into the sys_reg[] array, and sail off the end of the linear map when we try to bring up big-endian secondaries. | Unable to handle kernel paging request at virtual address ffff80098b982c00 | Mem abort info: | ESR = 0x96000045 | Exception class = DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | Data abort info: | ISV = 0, ISS = 0x00000045 | CM = 0, WnR = 1 | swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a | [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000 | Internal error: Oops: 96000045 [#1] PREEMPT SMP | Modules linked in: | CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323 | Hardware name: ARM Juno development board (r1) (DT) | pstate: 60000005 (nZCv daif -PAN -UAO) | pc : vcpu_write_sys_reg+0x50/0x134 | lr : vcpu_write_sys_reg+0x50/0x134 | Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b) | Call trace: | vcpu_write_sys_reg+0x50/0x134 | kvm_psci_vcpu_on+0x14c/0x150 | kvm_psci_0_2_call+0x244/0x2a4 | kvm_hvc_call_handler+0x1cc/0x258 | handle_hvc+0x20/0x3c | handle_exit+0x130/0x1ec | kvm_arch_vcpu_ioctl_run+0x340/0x614 | kvm_vcpu_ioctl+0x4d0/0x840 | do_vfs_ioctl+0xc8/0x8d0 | ksys_ioctl+0x78/0xa8 | sys_ioctl+0xc/0x18 | el0_svc_naked+0x30/0x34 | Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8) |---[ end trace 4b4a4f9628596602 ]--- Fix the order of the arguments. Fixes: 8d404c4c ("KVM: arm64: Rewrite system register accessors to read/write functions") CC: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 03 5月, 2018 4 次提交
-
-
由 Helge Deller 提交于
Fix three section mismatches: 1) Section mismatch in reference from the function ioread8() to the function .init.text:pcibios_init_bridge() 2) Section mismatch in reference from the function free_initmem() to the function .init.text:map_pages() 3) Section mismatch in reference from the function ccio_ioc_init() to the function .init.text:count_parisc_driver() Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 Helge Deller 提交于
Fix two section mismatches in drivers.c: 1) Section mismatch in reference from the function alloc_tree_node() to the function .init.text:create_tree_node(). 2) Section mismatch in reference from the function walk_native_bus() to the function .init.text:alloc_pa_dev(). Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 Daniel Borkmann 提交于
The JIT logic in jit_subprogs() is as follows: for all subprogs we allocate a bpf_prog_alloc(), populate it (prog->is_func = 1 here), and pass it to bpf_int_jit_compile(). If a failure occurred during JIT and prog->jited is not set, then we bail out from attempting to JIT the whole program, and punt to the interpreter instead. In case JITing went successful, we fixup BPF call offsets and do another pass to bpf_int_jit_compile() (extra_pass is true at that point) to complete JITing calls. Given that requires to pass JIT context around addrs and jit_data from x86 JIT are freed in the extra_pass in bpf_int_jit_compile() when calls are involved (if not, they can be freed immediately). However, if in the original pass, the JIT image didn't converge then we leak addrs and jit_data since image itself is NULL, the prog->is_func is set and extra_pass is false in that case, meaning both will become unreachable and are never cleaned up, therefore we need to free as well on !image. Only x64 JIT is affected. Fixes: 1c2a088a ("bpf: x64: add JIT support for multi-function programs") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Daniel Borkmann 提交于
While reviewing x64 JIT code, I noticed that we leak the prior allocated JIT image in the case where proglen != oldproglen during the JIT passes. Prior to the commit e0ee9c12 ("x86: bpf_jit: fix two bugs in eBPF JIT compiler") we would just break out of the loop, and using the image as the JITed prog since it could only shrink in size anyway. After e0ee9c12, we would bail out to out_addrs label where we free addrs and jit_data but not the image coming from bpf_jit_binary_alloc(). Fixes: e0ee9c12 ("x86: bpf_jit: fix two bugs in eBPF JIT compiler") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 02 5月, 2018 5 次提交
-
-
由 Thomas Gleixner 提交于
The recent commt which addresses the x86_phys_bits corruption with encrypted memory on CPUID reload after a microcode update lost the reload of CPUID_8000_0008_EBX as well. As a consequence IBRS and IBRS_FW are not longer detected Restore the behaviour by bringing the reload of CPUID_8000_0008_EBX back. This restore has a twist due to the convoluted way the cpuid analysis works: CPUID_8000_0008_EBX is used by AMD to enumerate IBRB, IBRS, STIBP. On Intel EBX is not used. But the speculation control code sets the AMD bits when running on Intel depending on the Intel specific speculation control bits. This was done to use the same bits for alternatives. The change which moved the 8000_0008 evaluation out of get_cpu_cap() broke this nasty scheme due to ordering. So that on Intel the store to CPUID_8000_0008_EBX clears the IBRB, IBRS, STIBP bits which had been set before by software. So the actual CPUID_8000_0008_EBX needs to go back to the place where it was and the phys/virt address space calculation cannot touch it. In hindsight this should have used completely synthetic bits for IBRB, IBRS, STIBP instead of reusing the AMD bits, but that's for 4.18. /me needs to find time to cleanup that steaming pile of ... Fixes: d94a155c ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption") Reported-by: NJörg Otte <jrg.otte@gmail.com> Reported-by: NTim Chen <tim.c.chen@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NJörg Otte <jrg.otte@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: kirill.shutemov@linux.intel.com Cc: Borislav Petkov <bp@alien8.de Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1805021043510.1668@nanos.tec.linutronix.de
-
由 Peter Zijlstra 提交于
mark_tsc_unstable() also needs to affect tsc_early, Now that clocksource_mark_unstable() can be used on a clocksource irrespective of its registration state, use it on both tsc_early and tsc. This does however require cs->list to be initialized empty, otherwise it cannot tell the registation state before registation. Fixes: aa83c457 ("x86/tsc: Introduce early tsc clocksource") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NDiego Viola <diego.viola@gmail.com> Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: rui.zhang@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180430100344.533326547@infradead.org
-
由 Peter Zijlstra 提交于
Don't leave the tsc-early clocksource registered if it errors out early. This was reported by Diego, who on his Core2 era machine got TSC invalidated while it was running with tsc-early (due to C-states). This results in keeping tsc-early with very bad effects. Reported-and-Tested-by: NDiego Viola <diego.viola@gmail.com> Fixes: aa83c457 ("x86/tsc: Introduce early tsc clocksource") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: diego.viola@gmail.com Cc: rui.zhang@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180430100344.350507853@infradead.org
-
由 Arnd Bergmann 提交于
This is needed to link ipv6 as a loadable module, which in turn happens in allmodconfig. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Arnd Bergmann 提交于
We already have memcpy_toio(), but not memset_io(), so let's add the obvious version to allow building an allmodconfig kernel without errors like drivers/gpu/drm/ttm/ttm_bo_util.c: In function 'ttm_bo_move_memcpy': drivers/gpu/drm/ttm/ttm_bo_util.c:390:3: error: implicit declaration of function 'memset_io' [-Werror=implicit-function-declaration] Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
- 01 5月, 2018 2 次提交
-
-
由 Arvind Yadav 提交于
Never directly free @dev after calling device_register(), even if it returned an error. Always use put_device() to give up the reference initialized. Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rob Gardner 提交于
The license text in both oradax files mistakenly specifies "version 3" of the GNU General Public License. This is corrected to specify "version 2". Signed-off-by: NRob Gardner <rob.gardner@oracle.com> Signed-off-by: NJonathan Helman <jonathan.helman@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 4月, 2018 1 次提交
-
-
由 KarimAllah Ahmed 提交于
Move DISABLE_EXITS KVM capability bits to the UAPI just like the rest of capabilities. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NKarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 27 4月, 2018 3 次提交
-
-
由 Junaid Shahid 提交于
Currently, KVM flushes the TLB after a change to the APIC access page address or the APIC mode when EPT mode is enabled. However, even in shadow paging mode, a TLB flush is needed if VPIDs are being used, as specified in the Intel SDM Section 29.4.5. So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will flush if either EPT or VPIDs are in use. Signed-off-by: NJunaid Shahid <junaids@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Andy Lutomirski 提交于
32-bit user code that uses int $80 doesn't care about r8-r11. There is, however, some 64-bit user code that intentionally uses int $0x80 to invoke 32-bit system calls. From what I've seen, basically all such code assumes that r8-r15 are all preserved, but the kernel clobbers r8-r11. Since I doubt that there's any code that depends on int $0x80 zeroing r8-r11, change the kernel to preserve them. I suspect that very little user code is broken by the old clobber, since r8-r11 are only rarely allocated by gcc, and they're clobbered by function calls, so they only way we'd see a problem is if the same function that invokes int $0x80 also spills something important to one of these registers. The current behavior seems to date back to the historical commit "[PATCH] x86-64 merge for 2.6.4". Before that, all regs were preserved. I can't find any explanation of why this change was made. Update the test_syscall_vdso_32 testcase as well to verify the new behavior, and it strengthens the test to make sure that the kernel doesn't accidentally permute r8..r15. Suggested-by: NDenys Vlasenko <dvlasenk@redhat.com> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org
-
由 Arnd Bergmann 提交于
A bugfix broke the x32 shmid64_ds and msqid64_ds data structure layout (as seen from user space) a few years ago: Originally, __BITS_PER_LONG was defined as 64 on x32, so we did not have padding after the 64-bit __kernel_time_t fields, After __BITS_PER_LONG got changed to 32, applications would observe extra padding. In other parts of the uapi headers we seem to have a mix of those expecting either 32 or 64 on x32 applications, so we can't easily revert the path that broke these two structures. Instead, this patch decouples x32 from the other architectures and moves it back into arch specific headers, partially reverting the even older commit 73a2d096 ("x86: remove all now-duplicate header files"). It's not clear whether this ever made any difference, since at least glibc carries its own (correct) copy of both of these header files, so possibly no application has ever observed the definitions here. Based on a suggestion from H.J. Lu, I tried out the tool from https://github.com/hjl-tools/linux-header to find other such bugs, which pointed out the same bug in statfs(), which also has a separate (correct) copy in glibc. Fixes: f4b4aae1 ("x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: "H . J . Lu" <hjl.tools@gmail.com> Cc: Jeffrey Walton <noloader@gmail.com> Cc: stable@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20180424212013.3967461-1-arnd@arndb.de
-