提交 · 25d92081ae2ff9858fa733621ef8e91d30fec9d0 · openanolis / cloud-kernel

07 8月, 2013 12 次提交

nEPT: Add nEPT violation/misconfigration support · 25d92081

由 Yang Zhang 提交于 8月 06, 2013

Inject nEPT fault to L1 guest. This patch is original from Xinhao.
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NJun Nakajima <jun.nakajima@intel.com>
Signed-off-by: NXinhao Xu <xinhao.xu@intel.com>
Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

25d92081

nEPT: correctly check if remote tlb flush is needed for shadowed EPT tables · 53166229

由 Gleb Natapov 提交于 8月 05, 2013

need_remote_flush() assumes that shadow page is in PT64 format, but
with addition of nested EPT this is no longer always true. Fix it by
bits definitions that depend on host shadow page type.
Reported-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

53166229

nEPT: Redefine EPT-specific link_shadow_page() · 7a1638ce

由 Yang Zhang 提交于 8月 05, 2013

Since nEPT doesn't support A/D bit, so we should not set those bit
when build shadow page table.
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7a1638ce

nEPT: Add EPT tables support to paging_tmpl.h · 37406aaa

由 Nadav Har'El 提交于 8月 05, 2013

This is the first patch in a series which adds nested EPT support to KVM's
nested VMX. Nested EPT means emulating EPT for an L1 guest so that L1 can use
EPT when running a nested guest L2. When L1 uses EPT, it allows the L2 guest
to set its own cr3 and take its own page faults without either of L0 or L1
getting involved. This often significanlty improves L2's performance over the
previous two alternatives (shadow page tables over EPT, and shadow page
tables over shadow page tables).

This patch adds EPT support to paging_tmpl.h.

paging_tmpl.h contains the code for reading and writing page tables. The code
for 32-bit and 64-bit tables is very similar, but not identical, so
paging_tmpl.h is #include'd twice in mmu.c, once with PTTTYPE=32 and once
with PTTYPE=64, and this generates the two sets of similar functions.

There are subtle but important differences between the format of EPT tables
and that of ordinary x86 64-bit page tables, so for nested EPT we need a
third set of functions to read the guest EPT table and to write the shadow
EPT table.

So this patch adds third PTTYPE, PTTYPE_EPT, which creates functions (prefixed
with "EPT") which correctly read and write EPT tables.
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NNadav Har'El <nyh@il.ibm.com>
Signed-off-by: NJun Nakajima <jun.nakajima@intel.com>
Signed-off-by: NXinhao Xu <xinhao.xu@intel.com>
Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

37406aaa

nEPT: Support shadow paging for guest paging without A/D bits · 61719a8f

由 Gleb Natapov 提交于 8月 05, 2013

Some guest paging modes do not support A/D bits. Add support for such
modes in shadow page code. For such modes PT_GUEST_DIRTY_MASK,
PT_GUEST_ACCESSED_MASK, PT_GUEST_DIRTY_SHIFT and PT_GUEST_ACCESSED_SHIFT
should be set to zero.
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

61719a8f

nEPT: make guest's A/D bits depends on guest's paging mode · d8089bac

由 Gleb Natapov 提交于 8月 05, 2013

This patch makes guest A/D bits definition to be dependable on paging
mode, so when EPT support will be added it will be able to define them
differently.
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d8089bac

nEPT: Move common code to paging_tmpl.h · 0ad805a0

由 Nadav Har'El 提交于 8月 05, 2013

For preparation, we just move gpte_access(), prefetch_invalid_gpte(),
s_rsvd_bits_set(), protect_clean_gpte() and is_dirty_gpte() from mmu.c
to paging_tmpl.h.
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NNadav Har'El <nyh@il.ibm.com>
Signed-off-by: NJun Nakajima <jun.nakajima@intel.com>
Signed-off-by: NXinhao Xu <xinhao.xu@intel.com>
Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: NJun Nakajima <jun.nakajima@intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0ad805a0

nEPT: Fix wrong test in kvm_set_cr3 · b7e91450

由 Nadav Har'El 提交于 8月 05, 2013

kvm_set_cr3() attempts to check if the new cr3 is a valid guest physical
address. The problem is that with nested EPT, cr3 is an *L2* physical
address, not an L1 physical address as this test expects.

As the comment above this test explains, it isn't necessary, and doesn't
correspond to anything a real processor would do. So this patch removes it.

Note that this wrong test could have also theoretically caused problems
in nested NPT, not just in nested EPT. However, in practice, the problem
was avoided: nested_svm_vmexit()/vmrun() do not call kvm_set_cr3 in the
nested NPT case, and instead set the vmcb (and arch.cr3) directly, thus
circumventing the problem. Additional potential calls to the buggy function
are avoided in that we don't trap cr3 modifications when nested NPT is
enabled. However, because in nested VMX we did want to use kvm_set_cr3()
(as requested in Avi Kivity's review of the original nested VMX patches),
we can't avoid this problem and need to fix it.
Reviewed-by: NOrit Wasserman <owasserm@redhat.com>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NNadav Har'El <nyh@il.ibm.com>
Signed-off-by: NJun Nakajima <jun.nakajima@intel.com>
Signed-off-by: NXinhao Xu <xinhao.xu@intel.com>
Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b7e91450

nEPT: Fix cr3 handling in nested exit and entry · 3633cfc3

由 Nadav Har'El 提交于 8月 05, 2013

The existing code for handling cr3 and related VMCS fields during nested
exit and entry wasn't correct in all cases:

If L2 is allowed to control cr3 (and this is indeed the case in nested EPT),
during nested exit we must copy the modified cr3 from vmcs02 to vmcs12, and
we forgot to do so. This patch adds this copy.

If L0 isn't controlling cr3 when running L2 (i.e., L0 is using EPT), and
whoever does control cr3 (L1 or L2) is using PAE, the processor might have
saved PDPTEs and we should also save them in vmcs12 (and restore later).
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: NOrit Wasserman <owasserm@redhat.com>
Signed-off-by: NNadav Har'El <nyh@il.ibm.com>
Signed-off-by: NJun Nakajima <jun.nakajima@intel.com>
Signed-off-by: NXinhao Xu <xinhao.xu@intel.com>
Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3633cfc3

nEPT: Support LOAD_IA32_EFER entry/exit controls for L1 · 8049d651

由 Nadav Har'El 提交于 8月 05, 2013

Recent KVM, since http://kerneltrap.org/mailarchive/linux-kvm/2010/5/2/6261577
switch the EFER MSR when EPT is used and the host and guest have different
NX bits. So if we add support for nested EPT (L1 guest using EPT to run L2)
and want to be able to run recent KVM as L1, we need to allow L1 to use this
EFER switching feature.

To do this EFER switching, KVM uses VM_ENTRY/EXIT_LOAD_IA32_EFER if available,
and if it isn't, it uses the generic VM_ENTRY/EXIT_MSR_LOAD. This patch adds
support for the former (the latter is still unsupported).

Nested entry and exit emulation (prepare_vmcs_02 and load_vmcs12_host_state,
respectively) already handled VM_ENTRY/EXIT_LOAD_IA32_EFER correctly. So all
that's left to do in this patch is to properly advertise this feature to L1.

Note that vmcs12's VM_ENTRY/EXIT_LOAD_IA32_EFER are emulated by L0, by using
vmx_set_efer (which itself sets one of several vmcs02 fields), so we always
support this feature, regardless of whether the host supports it.
Reviewed-by: NOrit Wasserman <owasserm@redhat.com>
Signed-off-by: NNadav Har'El <nyh@il.ibm.com>
Signed-off-by: NJun Nakajima <jun.nakajima@intel.com>
Signed-off-by: NXinhao Xu <xinhao.xu@intel.com>
Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8049d651

KVM: MMU: fix check the reserved bits on the gpte of L2 · 02766421

由 Xiao Guangrong 提交于 8月 05, 2013

Current code always uses arch.mmu to check the reserved bits on guest gpte
which is valid only for L1 guest, we should use arch.nested_mmu instead when
we translate gva to gpa for the L2 guest

Fix it by using @mmu instead since it is adapted to the current mmu mode
automatically

The bug can be triggered when nested npt is used and L1 guest and L2 guest
use different mmu mode
Reported-by: NJan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

02766421

KVM: nVMX: correctly set tr base on nested vmexit emulation · 205befd9

由 Gleb Natapov 提交于 8月 04, 2013

After commit 21feb4eb tr base is zeroed
during vmexit. Set it to L1's HOST_TR_BASE. This should fix
https://bugzilla.kernel.org/show_bug.cgi?id=60679Reported-by: NYongjie Ren <yongjie.ren@intel.com>
Reviewed-by: NArthur Chunqi Li <yzt356@gmail.com>
Tested-by: NYongjie Ren <yongjie.ren@intel.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

205befd9

29 7月, 2013 4 次提交

nVMX: reset rflags register cache during nested vmentry. · 63fbf59f

由 Gleb Natapov 提交于 7月 28, 2013

During nested vmentry into vm86 mode a vcpu state is found to be incorrect
because rflags does not have VM flag set since it is read from the cache
and has L1's value instead of L2's. If emulate_invalid_guest_state=1 L0
KVM tries to emulate it, but emulation does not work for nVMX and it
never should happen anyway. Fix that by using vmx_set_rflags() to set
rflags during nested vmentry which takes care of updating register cache.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

63fbf59f

KVM: x86: handle singlestep during emulation · 663f4c61

由 Paolo Bonzini 提交于 6月 25, 2013

This lets debugging work better during emulation of invalid
guest state.

This time the check is done after emulation, but before writeback
of the flags; we need to check the flags *before* execution of the
instruction, we cannot check singlestep_rip because the CS base may
have already been modified.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

Conflicts:
	arch/x86/kvm/x86.c

663f4c61

KVM: x86: handle hardware breakpoints during emulation · 4a1e10d5

由 Paolo Bonzini 提交于 5月 30, 2013

This lets debugging work better during emulation of invalid
guest state.

The check is done before emulating the instruction, and (in the case
of guest debugging) reuses EMULATE_DO_MMIO to exit with KVM_EXIT_DEBUG.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4a1e10d5

KVM: x86: rename EMULATE_DO_MMIO · ac0a48c3

由 Paolo Bonzini 提交于 6月 25, 2013

The next patch will reuse it for other userspace exits than MMIO,
namely debug events.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ac0a48c3

25 7月, 2013 2 次提交

KVM: x86: Drop some unused functions from lapic · 9576c4cd

由 Jan Kiszka 提交于 7月 25, 2013

Both have no users anymore.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

9576c4cd

KVM: x86: Simplify __apic_accept_irq · 11f5cc05

由 Jan Kiszka 提交于 7月 25, 2013

If posted interrupts are enabled, we can no longer track if an IRQ was
coalesced based on IRR. So drop this logic also from the classic
software path and simplify apic_test_and_set_irr to apic_set_irr.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

11f5cc05

20 7月, 2013 1 次提交

perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5 · 103af0a9

由 Andi Kleen 提交于 7月 18, 2013

[KVM maintainers:
The underlying support for this is in perf/core now. So please merge
this patch into the KVM tree.]

This is not arch perfmon, but older CPUs will just ignore it. This makes
it possible to do at least some TSX measurements from a KVM guest

v2: Various fixes to address review feedback
v3: Ignore the bits when no CPUID. No #GP. Force raw events with TSX bits.
v4: Use reserved bits for #GP
v5: Remove obsolete argument
Acked-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

103af0a9

18 7月, 2013 10 次提交

KVM: nVMX: Set segment infomation of L1 when L2 exits · 21feb4eb

由 Arthur Chunqi Li 提交于 7月 15, 2013

When L2 exits to L1, segment infomations of L1 are not set correctly.
According to Intel SDM 27.5.2(Loading Host Segment and Descriptor
Table Registers), segment base/limit/access right of L1 should be
set to some designed value when L2 exits to L1. This patch fixes
this.
Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com>
Reviewed-by: NGleb Natapov <gnatapov@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

21feb4eb

remove sched notifier for cross-cpu migrations · e04c5d76

由 Marcelo Tosatti 提交于 7月 10, 2013

Linux as a guest on KVM hypervisor, the only user of the pvclock
vsyscall interface, does not require notification on task migration
because:

1. cpu ID number maps 1:1 to per-CPU pvclock time info.
2. per-CPU pvclock time info is updated if the
   underlying CPU changes.
3. that version is increased whenever underlying CPU
   changes.

Which is sufficient to guarantee nanoseconds counter
is calculated properly.
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Acked-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

e04c5d76

KVM: nVMX: Fix read/write to MSR_IA32_FEATURE_CONTROL · b3897a49

由 Nadav Har'El 提交于 7月 08, 2013

Fix read/write to IA32_FEATURE_CONTROL MSR in nested environment.

This patch simulate this MSR in nested_vmx and the default value is
0x0. BIOS should set it to 0x5 before VMXON. After setting the lock
bit, write to it will cause #GP(0).

Another QEMU patch is also needed to handle emulation of reset
and migration. Reset to vCPU should clear this MSR and migration
should reserve value of it.

This patch is based on Nadav's previous commit.
http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/88478Signed-off-by: NNadav Har'El <nyh@math.technion.ac.il>
Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

b3897a49

KVM: x86: Drop useless cast · 6b61edf7

由 Mathias Krause 提交于 6月 26, 2013

Void pointers don't need no casting, drop it.
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

6b61edf7

KVM: VMX: Use proper types to access const arrays · c2bae893

由 Mathias Krause 提交于 6月 26, 2013

Use a const pointer type instead of casting away the const qualifier
from const arrays. Keep the pointer array on the stack, nonetheless.
Making it static just increases the object size.
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>

c2bae893

KVM: nVMX: Set success rflags when emulate VMXON/VMXOFF in nested virt · a25eb114

由 Arthur Chunqi Li 提交于 7月 04, 2013

Set rflags after successfully emulateing VMXON/VMXOFF in VMX.
Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a25eb114

KVM: nVMX: Change location of 3 functions in vmx.c · 0658fbaa

由 Arthur Chunqi Li 提交于 7月 04, 2013

Move nested_vmx_succeed/nested_vmx_failInvalid/nested_vmx_failValid
ahead of handle_vmon to eliminate double declaration in the same
file
Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0658fbaa

KVM: x86: Avoid zapping mmio sptes twice for generation wraparound · e6dff7d1

由 Takuya Yoshikawa 提交于 7月 04, 2013

Now that kvm_arch_memslots_updated() catches every increment of the
memslots->generation, checking if the mmio generation has reached its
maximum value is enough.
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e6dff7d1

KVM: Introduce kvm_arch_memslots_updated() · e59dbe09

由 Takuya Yoshikawa 提交于 7月 04, 2013

This is called right after the memslots is updated, i.e. when the result
of update_memslots() gets installed in install_new_memslots().  Since
the memslots needs to be updated twice when we delete or move a memslot,
kvm_arch_commit_memory_region() does not correspond to this exactly.

In the following patch, x86 will use this new API to check if the mmio
generation has reached its maximum value, in which case mmio sptes need
to be flushed out.
Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Acked-by: NAlexander Graf <agraf@suse.de>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e59dbe09

KVM: MMU: avoid fast page fault fixing mmio page fault · 1c118b82

由 Xiao Guangrong 提交于 7月 18, 2013

Currently, fast page fault incorrectly tries to fix mmio page fault when
the generation number is invalid (spte.gen != kvm.gen). It then returns
to guest to retry the fault since it sees the last spte is nonpresent.
This causes an infinite loop.

Since fast page fault only works for direct mmu, the issue exists when
1) tdp is enabled. It is only triggered only on AMD host since on Intel host
the mmio page fault is recognized as ept-misconfig whose handler call
fault-page path with error_code = 0

2) guest paging is disabled. Under this case, the issue is hardly discovered
since paging disable is short-lived and the sptes will be invalid after
memslot changed for 150 times

Fix it by filtering out MMIO page faults in page_fault_can_be_fast.
Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1c118b82

12 7月, 2013 1 次提交

perf/x86: Fix incorrect use of do_div() in NMI warning · baf64b85

由 Dave Hansen 提交于 7月 08, 2013

I completely botched understanding the calling conventions of
do_div(). I assumed that do_div() returned the result instead
of realizing that it modifies its argument and returns a
remainder. The side-effect from this would be bogus numbers
for the "msecs" value in the warning messages:

INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.114 msecs

Note, there was a second fix posted by Stephane Eranian for
a separate patch which I also botched:

http://lkml.kernel.org/r/20130704223010.GA30625@quadSigned-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Hansen <dave@sr71.net>
Link: http://lkml.kernel.org/r/20130708214404.B0B6EA66@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

baf64b85

11 7月, 2013 1 次提交

mm: remove free_area_cache · 98d1e64f

由 Michel Lespinasse 提交于 7月 10, 2013

Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.
Signed-off-by: NMichel Lespinasse <walken@google.com>
Acked-by: NRik van Riel <riel@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

98d1e64f

10 7月, 2013 9 次提交

arm: add support for LZ4-compressed kernel · f9b493ac

由 Kyungsik Lee 提交于 7月 08, 2013

Integrates the LZ4 decompression code to the arm pre-boot code.
Signed-off-by: NKyungsik Lee <kyungsik.lee@lge.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Florian Fainelli <florian@openwrt.org>
Cc: Yann Collet <yann.collet.73@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f9b493ac

reboot: move arch/x86 reboot= handling to generic kernel · 1b3a5d02

由 Robin Holt 提交于 7月 08, 2013

Merge together the unicore32, arm, and x86 reboot= command line
parameter handling.
Signed-off-by: NRobin Holt <holt@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NGuan Xuetao <gxt@mprc.pku.edu.cn>
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1b3a5d02

reboot: x86: prepare reboot_mode for moving to generic kernel code · edf2b139

由 Robin Holt 提交于 7月 08, 2013

Prepare for the moving the parsing of reboot= to the generic kernel code
by making reboot_mode into a more generic form.
Signed-off-by: NRobin Holt <holt@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Miguel Boton <mboton.lkml@gmail.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Acked-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

edf2b139

ptrace/x86: flush_ptrace_hw_breakpoint() shoule clear the virtual debug registers · f7da04c9

由 Oleg Nesterov 提交于 7月 08, 2013

flush_ptrace_hw_breakpoint() destroys the counters set by ptrace, but
"leaks" ->debugreg6 and ->ptrace_dr7.

The problem is minor, but still it doesn't look right and flush_thread()
did this until commit 66cb5917 ("hw-breakpoints: use the new wrapper
routines to access debug registers in process/thread code").  Now that
PTRACE_DETACH does flush_ too this makes even more sense.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f7da04c9

ptrace/x86: cleanup ptrace_set_debugreg() · 61e305c7

由 Oleg Nesterov 提交于 7月 08, 2013

ptrace_set_debugreg() is trivial but looks horrible.  Kill the unnecessary
goto's and return's to cleanup the code.

This matches ptrace_get_debugreg() which also needs the trivial whitespace
cleanups.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

61e305c7

ptrace/x86: ptrace_write_dr7() should create bp if !disabled · b87a95ad

由 Oleg Nesterov 提交于 7月 08, 2013

Commit 24f1e32c ("hw-breakpoints: Rewrite the hw-breakpoints layer
on top of perf events") introduced the minor regression.  Before this
commit

	PTRACE_POKEUSER DR7, enableDR0
	PTRACE_POKEUSER DR0, address

was perfectly valid, now PTRACE_POKEUSER(DR7) fails if DR0 was not
previously initialized by PTRACE_POKEUSER(DR0).

Change ptrace_write_dr7() to do ptrace_register_breakpoint(addr => 0) if
!bp && !disabled.

This fixes watchpoint-zeroaddr from ptrace-tests, see

    https://bugzilla.redhat.com/show_bug.cgi?id=660204.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Reported-by: NJan Kratochvil <jan.kratochvil@redhat.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b87a95ad

ptrace/x86: introduce ptrace_register_breakpoint() · 9afe33ad

由 Oleg Nesterov 提交于 7月 08, 2013

No functional changes, preparation.

Extract the "register breakpoint" code from ptrace_get_debugreg() into
the new/generic helper, ptrace_register_breakpoint().  It will have more
users.

The patch also adds another simple helper, ptrace_fill_bp_fields(), to
factor out the arch_bp_generic_fields() logic in register/modify.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9afe33ad

ptrace/x86: dont delay "disable" till second pass in ptrace_write_dr7() · 29a55513

由 Oleg Nesterov 提交于 7月 08, 2013

ptrace_write_dr7() skips ptrace_modify_breakpoint(disabled => true)
unless second_pass, this buys nothing but complicates the code and means
that we always do the main loop twice even if "disabled" was never true.

The comment says:

	Don't unregister the breakpoints right-away,
	unless all register_user_hw_breakpoint()
	requests have succeeded.

Firstly, we do not do register_user_hw_breakpoint(), it was removed by
commit 24f1e32c ("hw-breakpoints: Rewrite the hw-breakpoints layer
on top of perf events").

We are going to restore register_user_hw_breakpoint() (see the next
patch) but this doesn't matter: after commit 44234adc
("hw-breakpoints: Modify breakpoints without unregistering them")
perf_event_disable() can not hurt, hw_breakpoint_del() does not free the
slot.

Remove the "second_pass" check from the main loop and simplify the code.
Since we have to check "bp != NULL" anyway, the patch also removes the
same check in ptrace_modify_breakpoint() and moves the comment into
ptrace_write_dr7().

With this patch the second pass is only needed to restore the saved
old_dr7.  This should never fail, so the patch adds WARN_ON() to catch
the potential problems as Frederic suggested.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

29a55513

ptrace/x86: simplify the "disable" logic in ptrace_write_dr7() · e6a7d607

由 Oleg Nesterov 提交于 7月 08, 2013

ptrace_write_dr7() looks unnecessarily overcomplicated.  We can factor
out ptrace_modify_breakpoint() and do not do "continue" twice, just we
need to pass the proper "disabled" argument to
ptrace_modify_breakpoint().
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e6a7d607

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功