1. 31 October 2013, 1 commit
    • B
      kvm: Add KVM_GET_EMULATED_CPUID · 9c15bb1d
      Committed by Borislav Petkov
      Add a kvm ioctl that reports which system functionality kvm emulates.
      The format used is that of CPUID, and we return the corresponding CPUID
      bits set for the functionality we emulate.
      
      Make sure ->padding is passed in clean from userspace so that we can
      use it for something in the future, once the ioctl is cast in stone.
      
      s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
      it.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9c15bb1d
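      For reference, a minimal userspace sketch of querying the new ioctl
      (assumes a <linux/kvm.h> that already defines KVM_GET_EMULATED_CPUID;
      error handling trimmed):

      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/ioctl.h>
      #include <linux/kvm.h>

      int main(void)
      {
              int kvm = open("/dev/kvm", O_RDWR);
              if (kvm < 0) {
                      perror("open /dev/kvm");
                      return 1;
              }

              /* Room for up to 64 entries; KVM trims nent on return.  calloc()
               * also leaves the padding fields zeroed, which the ioctl expects
               * to be passed in clean. */
              int nent = 64;
              struct kvm_cpuid2 *cpuid =
                      calloc(1, sizeof(*cpuid) + nent * sizeof(struct kvm_cpuid_entry2));
              cpuid->nent = nent;

              if (ioctl(kvm, KVM_GET_EMULATED_CPUID, cpuid) < 0) {
                      perror("KVM_GET_EMULATED_CPUID");
                      return 1;
              }

              /* Each returned entry is a CPUID leaf with only the emulated bits set. */
              for (unsigned int i = 0; i < cpuid->nent; i++)
                      printf("leaf %#x index %#x: eax=%#x ebx=%#x ecx=%#x edx=%#x\n",
                             cpuid->entries[i].function, cpuid->entries[i].index,
                             cpuid->entries[i].eax, cpuid->entries[i].ebx,
                             cpuid->entries[i].ecx, cpuid->entries[i].edx);
              return 0;
      }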
  2. 28 October 2013, 3 commits
  3. 18 October 2013, 4 commits
    • C
      KVM: ARM: Transparent huge page (THP) support · 9b5fdb97
      Committed by Christoffer Dall
      Support transparent huge pages in KVM/ARM and KVM/ARM64.  The
      transparent_hugepage_adjust code is not very pretty, but this is also
      how it is solved on x86 and seems to be simply an artifact of how THPs
      behave.  This should eventually be shared across architectures if
      possible, but that can always be changed down the road.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      9b5fdb97
    • C
      KVM: ARM: Support hugetlbfs backed huge pages · ad361f09
      Committed by Christoffer Dall
      Support huge pages in KVM/ARM and KVM/ARM64.  The pud_huge check on
      the unmap path may feel a bit silly, as the pud_huge check is always
      defined to false, but the compiler should be smart about this.
      
      Note: this deals only with VMAs marked as huge that are allocated by
      users through hugetlbfs.  Transparent huge pages can only be detected
      by looking at the underlying pages (or the page tables themselves),
      and this patch so far simply maps these page by page in the Stage-2
      page tables.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      ad361f09
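      A hedged sketch of the distinction this patch relies on in the Stage-2
      fault path (is_vm_hugetlb_page() is the stock mm helper; the rest is
      simplified and illustrative, not the upstream function):

      /* hugetlbfs-backed VMAs are flagged VM_HUGETLB and get a single
       * Stage-2 PMD (section) mapping; everything else, including THP-backed
       * anonymous memory, is mapped page by page for now. */
      static bool fault_backed_by_hugetlbfs(struct vm_area_struct *vma)
      {
              return vma && is_vm_hugetlb_page(vma);
      }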
    • C
      KVM: ARM: Update comments for kvm_handle_wfi · 86ed81aa
      Committed by Christoffer Dall
      Update the comments to reflect what is really going on, and add the TWE
      bit to the comments in kvm_arm.h.
      
      Also rename the function to kvm_handle_wfx, as is done on arm64, for
      consistency and uber-correctness.
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      86ed81aa
    • M
      ARM: KVM: Yield CPU when vcpu executes a WFE · 58d5ec8f
      Committed by Marc Zyngier
      On an (even slightly) oversubscribed system, spinlocks quickly become
      a bottleneck, as some vcpus are spinning, waiting for a lock to be
      released, while the vcpu holding the lock may not be running at all.
      
      This creates contention, and the observed slowdown is 40x for
      hackbench. No, this isn't a typo.
      
      The solution is to trap blocking WFEs and tell KVM that we're
      now spinning. This ensures that other vcpus will get a scheduling
      boost, allowing the lock to be released more quickly. Also, using
      CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
      when the VM is severely overcommitted.
      
      Quick test to estimate the performance: hackbench 1 process 1000
      
      2xA15 host (baseline):	1.843s
      
      2xA15 guest w/o patch:	2.083s
      4xA15 guest w/o patch:	80.212s
      8xA15 guest w/o patch:	Could not be bothered to find out
      
      2xA15 guest w/ patch:	2.102s
      4xA15 guest w/ patch:	3.205s
      8xA15 guest w/ patch:	6.887s
      
      So we go from a 40x degradation to 1.5x in the 2x overcommit case,
      which is vaguely more acceptable.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      58d5ec8f
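      A hedged sketch of the trap handler this leads to (simplified from the
      arch/arm KVM code of this era; not the exact upstream function):

      static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
      {
              if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE) {
                      /* Blocking WFE trapped via HCR.TWE: the guest is most
                       * likely spinning on a lock held by another vcpu, so
                       * ask KVM for a directed yield towards the holder. */
                      kvm_vcpu_on_spin(vcpu);
              } else {
                      /* Plain WFI: block until an interrupt is pending. */
                      kvm_vcpu_block(vcpu);
              }
              return 1;
      }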
  4. 15 October 2013, 3 commits
  5. 14 October 2013, 9 commits
  6. 13 October 2013, 10 commits
  7. 12 October 2013, 1 commit
    • V
      ARC: Ignore ptrace SETREGSET request for synthetic register "stop_pc" · 5b242828
      Committed by Vineet Gupta
      The ARCompact TRAP_S insn, used for breakpoints, commits before the
      exception is taken (updating the architectural PC), so ptregs->ret
      contains the next PC and not the breakpoint PC itself. This is
      different from other restartable exceptions such as TLB Miss, where
      ptregs->ret holds the exact faulting PC. gdb needs to know the exact
      PC, hence the ARC ptrace GETREGSET provides the synthetic @stop_pc,
      which returns ptregs->ret or EFA depending on the situation.
      
      However, writing stop_pc (a SETREGSET request), which updates
      ptregs->ret, doesn't make sense, since stop_pc doesn't always
      correspond to that register, as described above.
      
      This was not an issue so far, since user_regs->ret and
      user_regs->stop_pc had the same value, and having both write to
      ptregs->ret was needless, but NOT broken, hence not observed.
      
      With gdb "jump", the two diverge, and the update of ptregs->ret from
      user_regs->ret is immediately overwritten by stop_pc; this patch fixes
      that by ignoring the SETREGSET write to stop_pc.
      Reported-by: Anton Kolesov <akolesov@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      5b242828
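      For context, a minimal debugger-side sketch of reading the synthetic
      stop_pc through GETREGSET (assumes an ARC target whose user_regs_struct
      exposes a stop_pc field as described above, and a tracee already stopped
      under ptrace; error handling trimmed):

      #include <stdio.h>
      #include <sys/ptrace.h>
      #include <sys/types.h>
      #include <sys/uio.h>
      #include <linux/elf.h>    /* NT_PRSTATUS */
      #include <asm/ptrace.h>   /* ARC struct user_regs_struct */

      static long read_stop_pc(pid_t pid)
      {
              struct user_regs_struct regs;
              struct iovec iov = { .iov_base = &regs, .iov_len = sizeof(regs) };

              if (ptrace(PTRACE_GETREGSET, pid, (void *)NT_PRSTATUS, &iov) < 0) {
                      perror("PTRACE_GETREGSET");
                      return -1;
              }
              /* stop_pc is the "where did we stop" value: ptregs->ret or EFA,
               * depending on the exception, as explained above. */
              return regs.stop_pc;
      }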
  8. 11 October 2013, 2 commits
  9. 10 October 2013, 4 commits
    • F
      xen: Fix possible user space selector corruption · 7cde9b27
      Committed by Frediano Ziglio
      Due to the way the kernel is initialized under Xen, it is possible that
      the ring1 selector used by the kernel for the boot cpu ends up being
      copied to userspace, leading to a segmentation fault in userspace.
      
      The Xen code in the kernel initializes non-boot cpus with the correct
      selectors (ds and es set to __USER_DS), but the boot cpu keeps the
      ring1 selectors passed by Xen.  On task context switch (switch_to) we
      assume that ds, es and cs already point to __USER_DS and __KERNEL_CS,
      so these selectors are not changed.
      
      If the processor is an Intel part supporting the sysenter instruction,
      sysenter/sysexit is used, so ds and es are not restored when switching
      back from kernel to userspace.  If the selectors point to ring1 instead
      of __USER_DS, the userspace code will crash on its first memory access
      attempt (to be precise, Xen, in the emulated iret used to do sysexit,
      will detect this and set ds and es to zero, which leads to a GPF anyway).
      
      Now, if a userspace process calls into the kernel using sysenter and
      gets rescheduled (for me it happened with a specific init calling
      wait4), it can happen that ds and es are left set to the ring1 selector.
      
      This is quite hard to detect because after a while these selectors get
      fixed up (__USER_DS seems sticky).
      
      Bisecting the code, commit 7076aada appears to be the first one to have
      this issue.
      Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
      7cde9b27
    • B
      kvm: ppc: booke: check range page invalidation progress on page setup · 40fde70d
      Committed by Bharat Bhushan
      When the MM code is invalidating a range of pages, it calls the KVM
      kvm_mmu_notifier_invalidate_range_start() notifier function, which calls
      kvm_unmap_hva_range(), which arranges to flush all the TLBs for guest pages.
      However, the Linux PTEs for the range being flushed are still valid at
      that point.  We are not supposed to establish any new references to pages
      in the range until the ...range_end() notifier gets called.
      The PPC-specific KVM code doesn't get any explicit notification of that;
      instead, we are supposed to use mmu_notifier_retry() to test whether we
      are or have been inside a range flush notifier pair while we have been
      referencing a page.
      
      This patch calls mmu_notifier_retry() while mapping the guest page to
      ensure we are not referencing a page while a range invalidation is in
      progress.
      
      This call is made inside a region locked with kvm->mmu_lock, which is
      the same lock taken by the KVM MMU notifier functions, thus ensuring
      that no new notification can proceed while we are in the locked region.
      Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
      Acked-by: Alexander Graf <agraf@suse.de>
      [Backported to 3.12 - Paolo]
      Reviewed-by: Bharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      40fde70d
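      A hedged sketch of the generic mmu_notifier_retry() idiom being applied
      here (the shape of the common KVM pattern, not the exact booke code; the
      final mapping step is left as a comment):

      unsigned long mmu_seq;
      pfn_t pfn;

      mmu_seq = kvm->mmu_notifier_seq;
      smp_rmb();                    /* read mmu_seq before looking up the pfn */

      pfn = gfn_to_pfn(kvm, gfn);   /* may sleep; takes a reference on the page */

      spin_lock(&kvm->mmu_lock);
      if (mmu_notifier_retry(kvm, mmu_seq)) {
              /* A range invalidation started (or ran) after mmu_seq was
               * sampled: drop the reference and let the fault be retried
               * instead of installing a possibly stale mapping. */
              spin_unlock(&kvm->mmu_lock);
              kvm_release_pfn_clean(pfn);
      } else {
              /* ... install the guest TLB entry here, still under mmu_lock ... */
              spin_unlock(&kvm->mmu_lock);
      }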
    • P
      KVM: PPC: Book3S HV: Fix typo in saving DSCR · cfc86025
      Committed by Paul Mackerras
      This fixes a typo in the code that saves the guest DSCR (Data Stream
      Control Register) into the kvm_vcpu_arch struct on guest exit.  The
      effect of the typo was that the DSCR value was saved in the wrong place,
      so changes to the DSCR by the guest didn't persist across guest exit
      and entry, and some host kernel memory got corrupted.
      
      Cc: stable@vger.kernel.org [v3.1+]
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cfc86025
    • G
      KVM: nVMX: fix shadow on EPT · d0d538b9
      Committed by Gleb Natapov
      Commit 72f85795 broke shadow on EPT. This patch reverts it and fixes
      PAE on nEPT (which the reverted commit had fixed) in another way.
      
      Shadow on EPT is currently broken because, while L1 builds a shadow
      page table for L2 (which is PAE while L2 is in real mode), it never
      loads L2's GUEST_PDPTR[0-3].  They do not need to be loaded, because
      without nested virtualization the hardware does this during guest
      entry if EPT is disabled; but in our case L0 emulates L2's vmentry
      while EPT is enabled, so we cannot rely on vmcs12->guest_pdptr[0-3]
      to contain up-to-date values, and we need to re-read the PDPTEs from
      L2 memory.  This is what kvm_set_cr3() is doing, but by clearing the
      cache bits during L2 vmentry we drop the values that kvm_set_cr3()
      read from memory.
      
      So why does the same code not work for PAE on nEPT?  kvm_set_cr3()
      reads the PDPTEs into vcpu->arch.walk_mmu->pdptrs[].  walk_mmu points
      to vcpu->arch.nested_mmu while a nested guest is running, but
      ept_load_pdptrs() uses vcpu->arch.mmu, which contains incorrect
      values.  Fix that by using walk_mmu in ept_(load|save)_pdptrs.
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Tested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d0d538b9
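      For reference, a hedged sketch of what the fixed ept_load_pdptrs() ends
      up looking like, reconstructed from the description above (the save side
      is switched to walk_mmu in the same way):

      static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
      {
              /* walk_mmu points at arch.nested_mmu while a nested guest runs,
               * which is where kvm_set_cr3() stored the PDPTEs it re-read
               * from L2 memory. */
              struct kvm_mmu *mmu = vcpu->arch.walk_mmu;

              if (!test_bit(VCPU_EXREG_PDPTR,
                            (unsigned long *)&vcpu->arch.regs_dirty))
                      return;

              if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
                      vmcs_write64(GUEST_PDPTR0, mmu->pdptrs[0]);
                      vmcs_write64(GUEST_PDPTR1, mmu->pdptrs[1]);
                      vmcs_write64(GUEST_PDPTR2, mmu->pdptrs[2]);
                      vmcs_write64(GUEST_PDPTR3, mmu->pdptrs[3]);
              }
      }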
  10. 09 October 2013, 3 commits