提交 · cf420048b3b2af9ce928d35cc5455c646c9dd2f7 · openeuler / Kernel

15 2月, 2012 1 次提交

x86: Use generic posix_types.h · 07d62021

由 H. Peter Anvin 提交于 2月 07, 2012

Change the x86 architecture to use <asm-generic/posix_types.h>.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1328677745-20121-20-git-send-email-hpa@zytor.com
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>

07d62021

01 2月, 2012 2 次提交

KVM: x86: fix missing checks in syscall emulation · c2226fc9

由 Stephan Bärwolf 提交于 1月 12, 2012

On hosts without this patch, 32bit guests will crash (and 64bit guests
may behave in a wrong way) for example by simply executing following
nasm-demo-application:

    [bits 32]
    global _start
    SECTION .text
    _start: syscall

(I tested it with winxp and linux - both always crashed)

    Disassembly of section .text:

    00000000 <_start>:
       0:   0f 05                   syscall

The reason seems a missing "invalid opcode"-trap (int6) for the
syscall opcode "0f05", which is not available on Intel CPUs
within non-longmodes, as also on some AMD CPUs within legacy-mode.
(depending on CPU vendor, MSR_EFER and cpuid)

Because previous mentioned OSs may not engage corresponding
syscall target-registers (STAR, LSTAR, CSTAR), they remain
NULL and (non trapping) syscalls are leading to multiple
faults and finally crashs.

Depending on the architecture (AMD or Intel) pretended by
guests, various checks according to vendor's documentation
are implemented to overcome the current issue and behave
like the CPUs physical counterparts.

[mtosatti: cleanup/beautify code]
Signed-off-by: NStephan Baerwolf <stephan.baerwolf@tu-ilmenau.de>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

c2226fc9

KVM: x86: extend "struct x86_emulate_ops" with "get_cpuid" · bdb42f5a

由 Stephan Bärwolf 提交于 1月 12, 2012

In order to be able to proceed checks on CPU-specific properties
within the emulator, function "get_cpuid" is introduced.
With "get_cpuid" it is possible to virtually call the guests
"cpuid"-opcode without changing the VM's context.

[mtosatti: cleanup/beautify code]
Signed-off-by: NStephan Baerwolf <stephan.baerwolf@tu-ilmenau.de>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

bdb42f5a

27 1月, 2012 1 次提交

x86: Properly parenthesize cmpxchg() macro arguments · fc395b92

由 Jan Beulich 提交于 1月 26, 2012

Quite oddly, all of the arguments passed through from the top
level macros to the second level which didn't need parentheses
had them, while the only expression (involving a parameter)
needing them didn't.

Very recently I got bitten by the lack thereof when using
something like "array + index" for the first operand, with
"array" being an array more narrow than int.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F2183A9020000780006F3E6@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@elte.hu>

fc395b92

26 1月, 2012 2 次提交

x86/amd: Add missing feature flag for fam15h models 10h-1fh processors · 652847aa

由 Andreas Herrmann 提交于 1月 20, 2012

That is the last one missing for those CPUs.

Others were recently added with commits

 fb215366
 (KVM: expose latest Intel cpu new features (BMI1/BMI2/FMA/AVX2) to guest)

and

 commit 969df4b8
 (x86: Report cpb and eff_freq_ro flags correctly)
Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
Link: http://lkml.kernel.org/r/20120120163823.GC24508@alberich.amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>

652847aa

x86/uv: Fix uv_gpa_to_soc_phys_ram() shift · 5a51467b

由 Russ Anderson 提交于 1月 18, 2012

uv_gpa_to_soc_phys_ram() was inadvertently ignoring the
shift values.  This fix takes the shift into account.
Signed-off-by: NRuss Anderson <rja@sgi.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/20120119020753.GA7228@sgi.comSigned-off-by: NIngo Molnar <mingo@elte.hu>

5a51467b

20 1月, 2012 1 次提交

x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits · 4f2f81a5

由 H. Peter Anvin 提交于 1月 19, 2012

In checkin

  303395ac x86: Generate system call tables and unistd_*.h from tables

the feature macros in <asm/unistd.h> were unified between 32 and 64
bits.  Unfortunately 32 bits requires __ARCH_WANT_SYS_IPC and this was
inadvertently dropped.
Reported-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/CALLzPKbeXN5gdngo8uYYU8mAow=XhrwBFBhKfG811f37BubQOg@mail.gmail.com

4f2f81a5

17 1月, 2012 4 次提交

x86/UV2: Add accounting for BAU strong nacks · b54bd9be

由 Cliff Wickman 提交于 1月 16, 2012

This patch adds separate accounting of UV2 message "strong
nack's" in the BAU statistics.
Signed-off-by: NCliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116212238.GF5767@sgi.comSigned-off-by: NIngo Molnar <mingo@elte.hu>

b54bd9be

x86/UV2: Work around BAU bug · c5d35d39

由 Cliff Wickman 提交于 1月 16, 2012

This patch implements a workaround for a UV2 hardware bug.
The bug is a non-atomic update of a memory-mapped register. When
hardware message delivery and software message acknowledge occur
simultaneously the pending message acknowledge for the arriving
message may be lost.  This causes the sender's message status to
stay busy.

Part of the workaround is to not acknowledge a completed message
until it is verified that no other message is actually using the
resource that is mistakenly recorded in the completed message.

Part of the workaround is to test for long elapsed time in such
a busy condition, then handle it by using a spare sending
descriptor. The stay-busy condition is eventually timed out by
hardware, and then the original sending descriptor can be
re-used. Most of that logic change is in keeping track of the
current descriptor and the state of the spares.

The occurrences of the workaround are added to the BAU
statistics.
Signed-off-by: NCliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116211947.GC5767@sgi.com
Cc: <stable@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c5d35d39

x86/UV2: Fix new UV2 hardware by using native UV2 broadcast mode · da87c937

由 Cliff Wickman 提交于 1月 16, 2012

Update the use of the Broadcast Assist Unit on SGI Altix UV2 to
the use of native UV2 mode on new hardware (not the legacy mode).

UV2 native mode has a different format for a broadcast message.
We also need quick differentiaton between UV1 and UV2.
Signed-off-by: NCliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116211750.GA5767@sgi.com
Cc: <stable@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

da87c937

mce: fix warning messages about static struct mce_device · e032d807

由 Greg Kroah-Hartman 提交于 1月 16, 2012

When suspending, there was a large list of warnings going something like:

	Device 'machinecheck1' does not have a release() function, it is broken and must be fixed

This patch turns the static mce_devices into dynamically allocated, and
properly frees them when they are removed from the system.  It solves
the warning messages on my laptop here.
Reported-by: N"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Tested-by: NDjalal Harouni <tixxdz@opendz.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@amd64.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e032d807

16 1月, 2012 1 次提交

x86: Get rid of dubious one-bit signed bitfield · f1044868

由 Anton Vorontsov 提交于 1月 11, 2012

This very noisy sparse warning appears on almost every file in
the kernel:

  CHECK   init/main.c
  arch/x86/include/asm/thread_info.h:43:55: error: dubious one-bit
  signed bitfield arch/x86/include/asm/thread_info.h:44:46: error:
  dubious one-bit signed bitfield

Sparse is right and this patch changes sig_on_uaccess_error and
uaccess_err flags to unsigned type and thus fixes the warning.
Signed-off-by: NAnton Vorontsov <cbouatmailru@gmail.com>
Acked-by: NAndy Lutomirski <luto@mit.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Dan Carpenter <error27@gmail.com>
Link: http://lkml.kernel.org/r/20120111011146.GA30428@oksana.dev.rtsoft.ruSigned-off-by: NIngo Molnar <mingo@elte.hu>

f1044868

13 1月, 2012 1 次提交

x86: Get rid of 'dubious one-bit signed bitfield' sprase warning · bccd1729

由 Anton Vorontsov 提交于 1月 11, 2012

This very noisy sparse warning appears on almost every file in the
kernel:

  CHECK   init/main.c
  arch/x86/include/asm/thread_info.h:43:55: error: dubious one-bit signed bitfield
  arch/x86/include/asm/thread_info.h:44:46: error: dubious one-bit signed bitfield

This patch changes sig_on_uaccess_error and uaccess_err flags to unsigned
type and thus fixes the warning.
Signed-off-by: NAnton Vorontsov <cbouatmailru@gmail.com>
Acked-by: NAndy Lutomirski <luto@mit.edu>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bccd1729

10 1月, 2012 1 次提交

x86, atomic: atomic64_read() take a const pointer · 8030c36d

由 H. Peter Anvin 提交于 1月 09, 2012

atomic64_read() doesn't actually write anything (as far as the C
environment is concerned... the CPU does actually write but that's an
implementation quirk), so it should take a const pointer.

This does NOT mean that it is safe to use atomic64_read() on an object
in readonly storage (it will trap!)
Reported-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/20120109165859.1879abda.akpm@linux-foundation.org

8030c36d

08 1月, 2012 1 次提交

x86: Move <asm/asm-offsets.h> from trace_syscalls.c to asm/syscall.h · 72142fd4

由 H. Peter Anvin 提交于 1月 07, 2012

This reverts commit d5e553d6, which
caused large numbers of build warnings on PowerPC.

This moves the #include <asm/asm-offsets.h> to <asm/syscall.h>, which
makes some kind of sense since NR_syscalls is syscalls related.
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/20111214181545.6e13bc954cb7ddce9086e861@canb.auug.org.au

72142fd4

07 1月, 2012 4 次提交

x86/PCI: Expand the x86_msi_ops to have a restore MSIs. · 76ccc297

由 Konrad Rzeszutek Wilk 提交于 12月 16, 2011

The MSI restore function will become a function pointer in an
x86_msi_ops struct. It defaults to the implementation in the
io_apic.c and msi.c. We piggyback on the indirection mechanism
introduced by "x86: Introduce x86_msi_ops".

Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>

76ccc297

x86/PCI: amd: factor out MMCONFIG discovery · 24d25dbf

由 Bjorn Helgaas 提交于 1月 05, 2012

This factors out the AMD native MMCONFIG discovery so we can use it
outside amd_bus.c.

amd_bus.c reads AMD MSRs so it can remove the MMCONFIG area from the
PCI resources.  We may also need the MMCONFIG information to work
around BIOS defects in the ACPI MCFG table.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org       # 2.6.34+
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>

24d25dbf

x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus() · 2cd6975a

由 Bjorn Helgaas 提交于 10月 28, 2011

x86 has two kinds of PCI root bus scanning:

(1) ACPI-based, using _CRS resources.  This used pci_create_bus(), not
    pci_scan_bus(), because ACPI hotplug needed to split the
    pci_bus_add_devices() into a separate host bridge .start() method.

    This patch parses the _CRS resources earlier, so we can build a list of
    resources and pass it to pci_create_root_bus().

    Note that as before, we parse the _CRS even if we aren't going to use
    it so we can print it for debugging purposes.

(2) All other, which used either default resources (ioport_resource and
    iomem_resource) or information read from the hardware via amd_bus.c or
    similar.  This used pci_scan_bus().

    This patch converts x86_pci_root_bus_res_quirks() (previously called
    from pcibios_fixup_bus()) to x86_pci_root_bus_resources(), which builds
    a list of resources before we call pci_scan_root_bus().

    We also use x86_pci_root_bus_resources() if we have ACPI but are
    ignoring _CRS.

CC: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>

2cd6975a

PCI: Pull PCI 'latency timer' setup up into the core · 96c55900

由 Myron Stowe 提交于 10月 28, 2011

The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).

There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.

No functional change.
Signed-off-by: NMyron Stowe <myron.stowe@redhat.com>
Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>

96c55900

04 1月, 2012 2 次提交

x86: Fix atomic64_xxx_cx8() functions · ceb7b40b

由 Eric Dumazet 提交于 1月 03, 2012

It appears about all functions in arch/x86/lib/atomic64_cx8_32.S
are wrong in case cmpxchg8b must be restarted, because
LOCK_PREFIX macro defines a label "1" clashing with other local
labels :

1:
	some_instructions
	LOCK_PREFIX
	cmpxchg8b (%ebp)
	jne 1b  / jumps to beginning of LOCK_PREFIX !

A possible fix is to use a magic label "672" in LOCK_PREFIX asm
definition, similar to the "671" one we defined in
LOCK_PREFIX_HERE.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NJan Beulich <JBeulich@suse.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1325608540.2320.103.camel@edumazet-HP-Compaq-6005-Pro-SFF-PCSigned-off-by: NIngo Molnar <mingo@elte.hu>

ceb7b40b

x86: Fix and improve cmpxchg_double{,_local}() · cdcd6298

由 Jan Beulich 提交于 1月 02, 2012

Just like the per-CPU ones they had several
problems/shortcomings:

Only the first memory operand was mentioned in the asm()
operands, and the 2x64-bit version didn't have a memory clobber
while the 2x32-bit one did. The former allowed the compiler to
not recognize the need to re-load the data in case it had it
cached in some register, while the latter was overly
destructive.

The types of the local copies of the old and new values were
incorrect (the types of the pointed-to variables should be used
here, to make sure the respective old/new variable types are
compatible).

The __dummy/__junk variables were pointless, given that local
copies of the inputs already existed (and can hence be used for
discarded outputs).

The 32-bit variant of cmpxchg_double_local() referenced
cmpxchg16b_local().

At once also:

 - change the return value type to what it really is: 'bool'
 - unify 32- and 64-bit variants
 - abstract out the common part of the 'normal' and 'local' variants
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F01F12A020000780006A19B@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@elte.hu>

cdcd6298

27 12月, 2011 12 次提交

KVM: x86 emulator: implement RDPMC (0F 33) · 222d21aa

由 Avi Kivity 提交于 11月 10, 2011

Signed-off-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

222d21aa

KVM: Add generic RDPMC support · 022cd0e8

由 Avi Kivity 提交于 11月 10, 2011

Add a helper function that emulates the RDPMC instruction operation.
Signed-off-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

022cd0e8

KVM: Expose a version 2 architectural PMU to a guests · f5132b01

由 Gleb Natapov 提交于 11月 10, 2011

Use perf_events to emulate an architectural PMU, version 2.

Based on PMU version 1 emulation by Avi Kivity.

[avi: adjust for cpuid.c]
[jan: fix anonymous field initialization for older gcc]
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

f5132b01

KVM: MMU: move the relevant mmu code to mmu.c · e459e322

由 Xiao Guangrong 提交于 11月 28, 2011

Move the mmu code in kvm_arch_vcpu_init() to kvm_mmu_create()
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

e459e322

KVM: expose latest Intel cpu new features (BMI1/BMI2/FMA/AVX2) to guest · fb215366

由 Liu, Jinsong 提交于 11月 28, 2011

Intel latest cpu add 6 new features, refer http://software.intel.com/file/36945
The new feature cpuid listed as below:

1. FMA		CPUID.EAX=01H:ECX.FMA[bit 12]
2. MOVBE	CPUID.EAX=01H:ECX.MOVBE[bit 22]
3. BMI1		CPUID.EAX=07H,ECX=0H:EBX.BMI1[bit 3]
4. AVX2		CPUID.EAX=07H,ECX=0H:EBX.AVX2[bit 5]
5. BMI2		CPUID.EAX=07H,ECX=0H:EBX.BMI2[bit 8]
6. LZCNT	CPUID.EAX=80000001H:ECX.LZCNT[bit 5]

This patch expose these features to guest.
Among them, FMA/MOVBE/LZCNT has already been defined, MOVBE/LZCNT has
already been exposed.

This patch defines BMI1/AVX2/BMI2, and exposes FMA/BMI1/AVX2/BMI2 to guest.
Signed-off-by: NLiu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

fb215366

KVM: introduce KVM_MEM_SLOTS_NUM macro · 93a5cef0

由 Xiao Guangrong 提交于 11月 24, 2011

Introduce KVM_MEM_SLOTS_NUM macro to instead of
KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

93a5cef0

KVM: Optimize dirty logging by rmap_write_protect() · 95d4c16c

由 Takuya Yoshikawa 提交于 11月 14, 2011

Currently, write protecting a slot needs to walk all the shadow pages
and checks ones which have a pte mapping a page in it.

The walk is overly heavy when dirty pages in that slot are not so many
and checking the shadow pages would result in unwanted cache pollution.

To mitigate this problem, we use rmap_write_protect() and check only
the sptes which can be reached from gfns marked in the dirty bitmap
when the number of dirty pages are less than that of shadow pages.

This criterion is reasonable in its meaning and worked well in our test:
write protection became some times faster than before when the ratio of
dirty pages are low and was not worse even when the ratio was near the
criterion.

Note that the locking for this write protection becomes fine grained.
The reason why this is safe is descripted in the comments.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NAvi Kivity <avi@redhat.com>

95d4c16c

KVM: MMU: remove KVM host pv mmu support · fb920458

由 Chris Wright 提交于 11月 01, 2011

The host side pv mmu support has been marked for feature removal in
January 2011.  It's not in use, is slower than shadow or hardware
assisted paging, and a maintenance burden.  It's November 2011, time to
remove it.
Signed-off-by: NChris Wright <chrisw@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

fb920458

KVM: MMU: improve write flooding detected · a30f47cb

由 Xiao Guangrong 提交于 9月 22, 2011

Detecting write-flooding does not work well, when we handle page written, if
the last speculative spte is not accessed, we treat the page is
write-flooding, however, we can speculative spte on many path, such as pte
prefetch, page synced, that means the last speculative spte may be not point
to the written page and the written page can be accessed via other sptes, so
depends on the Accessed bit of the last speculative spte is not enough

Instead of detected page accessed, we can detect whether the spte is accessed
after it is written, if the spte is not accessed but it is written frequently,
we treat is not a page table or it not used for a long time
Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

a30f47cb

KVM: MMU: fast prefetch spte on invlpg path · f57f2ef5

由 Xiao Guangrong 提交于 9月 22, 2011

Fast prefetch spte for the unsync shadow page on invlpg path
Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

f57f2ef5

KVM: MMU: do not mark accessed bit on pte write path · d01f8d5e

由 Xiao Guangrong 提交于 9月 22, 2011

In current code, the accessed bit is always set when page fault occurred,
do not need to set it on pte write path
Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

d01f8d5e

KVM: x86: retry non-page-table writing instructions · 1cb3f3ae

由 Xiao Guangrong 提交于 9月 22, 2011

If the emulation is caused by #PF and it is non-page_table writing instruction,
it means the VM-EXIT is caused by shadow page protected, we can zap the shadow
page and retry this instruction directly

The idea is from Avi
Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

1cb3f3ae

24 12月, 2011 2 次提交

x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS · a31bc327

由 Yinghai Lu 提交于 12月 23, 2011

Currently "nox2apic" boot parameter was not enabling x2apic mode if the cpu,
kernel are all capable of enabling x2apic mode and the OS handover
happened in xapic mode.

However If the bios enabled x2apic prior to OS handover, using "nox2apic"
boot parameter had no effect.

If the boot cpu's apicid is < 255, enable "nox2apic" boot parameter to
disable the x2apic mode setup by the bios. This will enable the kernel to
fallback to xapic mode and bringup only the cpu's which has apic-id < 255.

-v2: fix patch error and two compiling warning
	make disable_x2apic to be __init
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/CAE9FiQUeB-3uxJAMiHsz=uPWoFv5Hg1pVepz7aU6YtqOxMC-=Q@mail.gmail.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

a31bc327

x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping · fb209bd8

由 Yinghai Lu 提交于 12月 21, 2011

On some of the recent Intel SNB platforms, by default bios is pre-enabling
x2apic mode in the cpu with out setting up interrupt-remapping.
This case was resulting in the kernel to panic as the cpu is already in
x2apic mode but the OS was not able to enable interrupt-remapping (which
is a pre-req for using x2apic capability).

On these platforms all the apic-ids are < 255 and the kernel can fallback to
xapic mode if the bios has not enabled interrupt-remapping (which is
mostly the case if the bios has not exported interrupt-remapping tables to the
OS).
Reported-by: NBerck E. Nash <flyboy@gmail.com>
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20111222014632.600418637@sbsiddha-desk.sc.intel.comSigned-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

fb209bd8

23 12月, 2011 1 次提交

percpu: Remove irqsafe_cpu_xxx variants · 933393f5

由 Christoph Lameter 提交于 12月 22, 2011

We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state.  That has no material change for x86
and s390 implementations of this_cpu operations.  However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.

-tj: This is part of on-going percpu API cleanup.  For detailed
     discussion of the subject, please refer to the following thread.

     http://thread.gmane.org/gmane.linux.kernel/1222078Signed-off-by: NChristoph Lameter <cl@linux.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
LKML-Reference: <alpine.DEB.2.00.1112221154380.11787@router.home>

933393f5

22 12月, 2011 3 次提交

cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem · 8a25a2fd

由 Kay Sievers 提交于 12月 21, 2011

This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

8a25a2fd

x86: Add counter when debug stack is used with interrupts enabled · 42181186

由 Steven Rostedt 提交于 12月 16, 2011

Mathieu Desnoyers pointed out a case that can cause issues with
NMIs running on the debug stack:

  int3 -> interrupt -> NMI -> int3

Because the interrupt changes the stack, the NMI will not see that
it preempted the debug stack. Looking deeper at this case,
interrupts only happen when the int3 is from userspace or in
an a location in the exception table (fixup).

  userspace -> int3 -> interurpt -> NMI -> int3

All other int3s that happen in the kernel should be processed
without ever enabling interrupts, as the do_trap() call will
panic the kernel if it is called to process any other location
within the kernel.

Adding a counter around the sections that enable interrupts while
using the debug stack allows the NMI to also check that case.
If the NMI sees that it either interrupted a task using the debug
stack or the debug counter is non-zero, then it will have to
change the IDT table to make the int3 not change stacks (which will
corrupt the stack if it does).

Note, I had to move the debug_usage functions out of processor.h
and into debugreg.h because of the static inlined functions to
inc and dec the debug_usage counter. __get_cpu_var() requires
smp.h which includes processor.h, and would fail to build.

Link: http://lkml.kernel.org/r/1323976535.23971.112.camel@gandalf.stny.rr.comReported-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul Turner <pjt@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

42181186

x86: Keep current stack in NMI breakpoints · 228bdaa9

由 Steven Rostedt 提交于 12月 09, 2011

We want to allow NMI handlers to have breakpoints to be able to
remove stop_machine from ftrace, kprobes and jump_labels. But if
an NMI interrupts a current breakpoint, and then it triggers a
breakpoint itself, it will switch to the breakpoint stack and
corrupt the data on it for the breakpoint processing that it
interrupted.

Instead, have the NMI check if it interrupted breakpoint processing
by checking if the stack that is currently used is a breakpoint
stack. If it is, then load a special IDT that changes the IST
for the debug exception to keep the same stack in kernel context.
When the NMI is done, it puts it back.

This way, if the NMI does trigger a breakpoint, it will keep
using the same stack and not stomp on the breakpoint data for
the breakpoint it interrupted.
Suggested-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

228bdaa9

21 12月, 2011 1 次提交

perf events: Enable raw event support for Intel unhalted_reference_cycles event · cd09c0c4

由 Stephane Eranian 提交于 12月 11, 2011

This patch adds the encoding and definitions necessary for the
unhalted_reference_cycles event avaialble since Intel Core 2 processors.
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1323559734-3488-2-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>

cd09c0c4

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功