1. 10 October 2011, 6 commits
    • perf, x86: Implement IBS initialization · b7169166
      Committed by Robert Richter
      This patch implements IBS feature detection and initialization. The
      code is shared between perf and oprofile. If IBS is available on the
      system, a PMU is set up for perf.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1316597423-25723-3-git-send-email-robert.richter@amd.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
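      IBS availability is advertised via CPUID. As a minimal user-space sketch of
      the detection step (assuming an AMD CPU; the IBS flag is bit 10 of ECX for
      CPUID leaf 0x80000001):

        /* Minimal sketch: detect AMD IBS support from CPUID. */
        #include <stdio.h>
        #include <cpuid.h>

        int main(void)
        {
                unsigned int eax, ebx, ecx, edx;

                /* Leaf 0x80000001, ECX bit 10 is the AMD IBS feature flag. */
                if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
                        return 1;
                printf("IBS %savailable\n", (ecx & (1u << 10)) ? "" : "not ");
                return 0;
        }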
    • perf, x86: Share IBS macros between perf and oprofile · ee5789db
      Committed by Robert Richter
      Move the IBS macros from oprofile to <asm/perf_event.h> to make them
      available to perf. No additional changes.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1316597423-25723-2-git-send-email-robert.richter@amd.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
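      For illustration, the shared definitions are of this shape. The MSR
      addresses and bit positions below follow the AMD64 IBS documentation but
      should be treated as an assumption of this sketch, not a quote of the
      header:

        /* Sketch of the IBS definitions shared via <asm/perf_event.h>.
         * Verify values against the real header before relying on them. */
        #define MSR_AMD64_IBSFETCHCTL   0xc0011030
        #define MSR_AMD64_IBSOPCTL      0xc0011033
        #define IBS_FETCH_ENABLE        (1ULL << 48)   /* enable fetch sampling */
        #define IBS_OP_ENABLE           (1ULL << 17)   /* enable op sampling */
        #define IBS_OP_MAX_CNT          0x0000ffffULL  /* op sample period field */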
    • x86, nmi: Add in logic to handle multiple events and unknown NMIs · b227e233
      Committed by Don Zickus
      Previous patches allow the NMI subsystem to process multiple NMI events
      in one NMI.  As previously discussed, this can cause issues when an event
      triggers another NMI but is processed in the current NMI.  This causes the
      next NMI to go unprocessed and become an 'unknown' NMI.
      
      To handle this, we first have to flag whether or not the NMI handler handled
      more than one event.  If it did, then there exists a chance that
      the next NMI might already have been processed.  Once the NMI is flagged as a
      candidate to be swallowed, we next look for a back-to-back NMI condition.
      
      This is determined by looking at the %rip from pt_regs.  If it is the same
      as in the previous NMI, it is assumed the cpu did not have a chance to jump
      back into a non-NMI context and execute code, and instead handled another NMI.
      
      If both of those conditions are true then we will swallow any unknown NMI.
      
      There still exists a chance that we accidentally swallow a real unknown NMI,
      but for now things seem better.
      
      An optimization has also been added to the nmi notifier routine.  Because x86
      can latch up to one NMI while currently processing an NMI, we don't have to
      worry about executing _all_ the handlers in a standalone NMI.  The idea is that
      if multiple NMIs come in, the second NMI will represent them.  For those
      back-to-back NMI cases, we have the potential to drop NMIs.  Therefore, only
      execute all the handlers in the second half of a detected back-to-back NMI.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
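      A sketch of the swallow heuristic described above; the identifiers are
      illustrative, not the kernel's actual symbols:

        /* Sketch of the back-to-back NMI swallow heuristic. All names are
         * illustrative; report_unknown_nmi() is a hypothetical helper. */
        static unsigned long last_nmi_rip;
        static int swallow_candidate;   /* set when one NMI handled >1 event */

        static void unknown_nmi(struct pt_regs *regs)
        {
                int back_to_back = (regs->ip == last_nmi_rip);

                last_nmi_rip = regs->ip;
                if (swallow_candidate && back_to_back)
                        return;                 /* likely already handled */
                report_unknown_nmi(regs);       /* hypothetical */
        }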
    • x86, nmi: Wire up NMI handlers to new routines · 9c48f1c6
      Committed by Don Zickus
      Just convert all the files that have an nmi handler to the new routines.
      Most of it is straightforward conversion.  A couple of places needed some
      tweaking, like kgdb, which separates the debug notifier from the nmi handler,
      and mce, which removes a call to notify_die.
      
      [Thanks to Ying for finding out the history behind that mce call:
      
      https://lkml.org/lkml/2010/5/27/114
      
      And to Boris for responding that he would like to remove that call because
      of it:
      
      https://lkml.org/lkml/2011/9/21/163]
      
      The things that get converted are the registration/unregistration routines,
      and the nmi handler itself has its args changed, along with code removed
      to check which list it is on (most are on one NMI list, except for kgdb,
      which has both an NMI routine and an NMI Unknown routine).
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Corey Minyard <minyard@acm.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Corey Minyard <minyard@acm.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
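      A conversion then looks roughly like the following; the signatures match
      the API added by this series as best recalled, so treat the details as
      approximate:

        /* Rough shape of a conversion to the new NMI API; approximate. */
        static int my_nmi_handler(unsigned int type, struct pt_regs *regs)
        {
                /* Return nonzero if this NMI was ours. */
                return my_device_check_nmi(regs);       /* hypothetical */
        }

        static int __init my_init(void)
        {
                /* Old: register_die_notifier(&my_nmi_notifier); */
                return register_nmi_handler(NMI_LOCAL, my_nmi_handler, 0,
                                            "my_device");
        }

        static void __exit my_exit(void)
        {
                unregister_nmi_handler(NMI_LOCAL, "my_device");
        }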
    • x86, nmi: Create new NMI handler routines · c9126b2e
      Committed by Don Zickus
      The NMI handlers used to rely on the notifier infrastructure.  This worked
      great until we wanted to support handling multiple events better.
      
      One of the key ideas of the nmi handling is to process _all_ the handlers for
      each NMI.  The reason behind this switch is that NMIs are edge triggered.
      If enough NMIs are triggered, then they could be lost because the cpu can
      only latch at most one NMI (besides the one currently being processed).
      
      In order to deal with this we have decided to process all the NMI handlers
      for each NMI.  This allows the handlers to determine if they received an
      event or not (the ones that cannot determine this will be left to fend
      for themselves on the unknown NMI list).
      
      As a result of this change it is now possible to have an extra NMI that
      was destined to be received for an already processed event.  Because the
      event was processed in the previous NMI, this NMI gets dropped and becomes
      an 'unknown' NMI.  This of course will cause printks that scare people.
      
      However, we prefer to have extra NMIs as opposed to losing NMIs, and as such
      have developed a basic mechanism to catch most of them.  That will be
      a later patch.
      
      To accomplish this idea, I unhooked the nmi handlers from the notifier
      routines and created a new mechanism loosely based on doIRQ.  The reason
      for this is that the notifier routines have a couple of shortcomings.  First, we
      couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
      NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
      handled in each routine (most only handle one; perf can handle more than one).
      Third, I wanted to eventually display which nmi handlers are registered in
      the system in /proc/interrupts to help see who is generating NMIs.
      
      The patch below just implements the new infrastructure but doesn't wire it up
      yet (that is the next patch).  Its design is based on doIRQ structs and the
      atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
      (as the notifier routines have soaked it), but it should be double checked in
      case I copied the code wrong.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
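      A simplified sketch of that design: a per-type list of handler descriptors
      that is always walked in full, with the return values summed (field and
      symbol names approximate the original; RCU and locking are omitted):

        /* Simplified sketch of the nmiaction design; approximate names,
         * RCU/locking omitted, nmi_list[] is a hypothetical list head array. */
        struct nmiaction {
                struct list_head        list;
                nmi_handler_t           handler;
                const char              *name;
        };

        static int nmi_handle(unsigned int type, struct pt_regs *regs)
        {
                struct nmiaction *a;
                int handled = 0;

                /* Walk _all_ handlers so several latched events can be
                 * claimed within a single NMI. */
                list_for_each_entry(a, &nmi_list[type], list)
                        handled += a->handler(type, regs);

                return handled;         /* number of events claimed */
        }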
    • perf, intel: Use GO/HO bits in perf-ctr · 144d31e6
      Committed by Gleb Natapov
      Intel does not have a guest/host-only bit in perf counters like AMD
      does.  To support GO/HO bits, KVM needs to switch EVENTSELn values
      (or PERF_GLOBAL_CTRL if available) at guest entry. If a counter is
      configured to count only in guest mode, it stays disabled in the host,
      but VMX is configured to switch it to an enabled value during guest entry.
      
      This patch adds GO/HO tracking to the Intel perf code and provides an interface
      for KVM to get a list of MSRs that need to be switched at guest entry.
      
      Only cpus with an architectural PMU (v1 or later) are supported by this
      patch.  To my knowledge there are no p6 models with VMX but without an
      architectural PMU, and p4s with VMX are rare; the interface is general
      enough to support them if the need arises.
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317816084-18026-7-git-send-email-gleb@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
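      The KVM-facing interface is essentially a list of (MSR, host value, guest
      value) triples to be swapped at VM entry/exit; a sketch, with names
      following the commit's interface as best recalled:

        /* Sketch of the guest-switch MSR interface; treat as approximate. */
        struct perf_guest_switch_msr {
                unsigned int    msr;    /* MSR index */
                u64             host;   /* value to restore on VM exit */
                u64             guest;  /* value to load on VM entry */
        };

        /* perf fills in the array; KVM feeds it to the VMX MSR switching. */
        struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);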
  2. 06 October 2011, 1 commit
  3. 16 September 2011, 1 commit
    • asm alternatives: remove incorrect alignment notes · a7f934d4
      Committed by Linus Torvalds
      On x86-64, they were just wasteful: with the explicitly added (now
      unnecessary) padding, the size of the alternatives structure was 16
      bytes, and an alignment of 8 bytes didn't hurt much.
      
      However, it was still silly, since the natural size and alignment for
      the structure is actually just 12 bytes, 4-byte aligned since commit
      59e97e4d ("x86: Make alternative instruction pointers relative").
      So removing the padding, and removing the extra alignment is just a good
      idea.
      
      On x86-32, the alignment of 4 bytes was correct, but was incorrectly
      hardcoded as 8 bytes in <asm/alternative-asm.h>.  That header file used
      to be an x86-64 only header file, but various unification efforts
      have made it be used for x86-32 too (i.e. the unification of rwlock and
      rwsem).
      
      That in turn caused x86-32 boot failures, because the extra alignment
      would result in random zero-filled words in the altinstructions section,
      causing oopses early at boot when doing alternative instruction
      replacement.
      
      So just remove all the alignment noise entirely.  It's wrong, and it's
      unnecessary.  The section itself is already properly aligned by the
      linker scripts, and all additions to the section had better be of the
      proper 12-byte format, keeping it aligned.  So if the align directive
      were to ever make a difference, that would be an indication of a serious
      bug to begin with.
      Reported-by: Werner Landgraf <w.landgraf@ru.r>
      Acked-by: Andrew Lutomirski <luto@mit.edu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
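      After 59e97e4d the record is naturally 12 bytes with 4-byte alignment; a
      sketch of the layout (field names approximate the real header):

        /* Sketch of the alternatives record after 59e97e4d; the field names
         * approximate <asm/alternative.h>. */
        struct alt_instr {
                s32     instr_offset;   /* original instruction, relative */
                s32     repl_offset;    /* replacement instruction, relative */
                u16     cpuid;          /* cpuid feature bit */
                u8      instrlen;       /* length of original instruction */
                u8      replacementlen; /* length of replacement */
        };                              /* 4 + 4 + 2 + 1 + 1 = 12 bytes */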
  4. 30 August 2011, 1 commit
    • KVM: Fix instruction size issue in pvclock scaling · 3b217116
      Committed by Duncan Sands
      Commit de2d1a52 ("KVM: Fix register corruption in pvclock_scale_delta")
      introduced a mul instruction that may have only a memory operand; the
      assembler therefore cannot select the correct size:
      
         pvclock.s:229: Error: no instruction mnemonic suffix given and no register
      operands; can't size instruction
      
      In this example the assembly is:
      
               #APP
               mul -48(%rbp) ; shrd $32, %rdx, %rax
               #NO_APP
      
      A simple solution is to use mulq.
      Signed-off-by: Duncan Sands <baldrick@free.fr>
      Signed-off-by: Avi Kivity <avi@redhat.com>
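      With the suffix made explicit, the x86-64 path of pvclock_scale_delta()
      looks roughly like this (reconstructed from the description, not quoted
      from the patch):

        /* Reconstructed sketch of the fixed x86-64 scaling path. Computes
         * (delta * mul_frac) >> 32 using a 64x64->128 multiply. */
        static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac)
        {
                u64 product, tmp;

                /* "mulq", not "mul": the qword size is now explicit even
                 * when the operand is in memory. */
                __asm__ (
                        "mulq %[mul_frac] ; shrd $32, %%rdx, %%rax"
                        : "=a" (product), "=d" (tmp)
                        : "0" (delta), [mul_frac] "rm" ((u64)mul_frac));

                return product;
        }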
  5. 27 August 2011, 1 commit
  6. 17 August 2011, 1 commit
    • xen/x86: replace order-based range checking of M2P table by linear one · ccbcdf7c
      Committed by Jan Beulich
      The order-based approach is not only less efficient (requiring a shift
      and a compare, typical generated code looking like this
      
      	mov	eax, [machine_to_phys_order]
      	mov	ecx, eax
      	shr	ebx, cl
      	test	ebx, ebx
      	jnz	...
      
      whereas a direct check requires just a compare, like in
      
      	cmp	ebx, [machine_to_phys_nr]
      	jae	...
      
      ), but also slightly dangerous in the 32-on-64 case - the element
      address calculation can wrap if the next power of two boundary is
      sufficiently far away from the actual upper limit of the table, and
      hence can result in user space addresses being accessed (with it being
      unknown what may actually be mapped there).
      
      Additionally, the elimination of the mistaken use of fls() here (should
      have been __fls()) fixes a latent issue on x86-64 that would trigger
      if the code was run on a system with memory extending beyond the 44-bit
      boundary.
      
      CC: stable@kernel.org
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      [v1: Based on Jeremy's feedback]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
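      In C terms the check changes from an order-based shift to a direct linear
      compare; a sketch (the machine_to_phys_* symbols are the commit's, the
      surrounding accessor is illustrative):

        /* Sketch of the range-check change inside an M2P lookup. */

        /* Before: order-based; needs shift + compare and can overshoot the
         * real table in the 32-on-64 case. */
        if (unlikely(mfn >> machine_to_phys_order))
                return ~0UL;            /* out of range */

        /* After: one linear compare against the actual table size. */
        if (unlikely(mfn >= machine_to_phys_nr))
                return ~0UL;            /* out of range */

        return machine_to_phys_mapping[mfn];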
  7. 11 August 2011, 2 commits
  8. 05 August 2011, 1 commit
  9. 04 August 2011, 2 commits
  10. 27 July 2011, 5 commits
  11. 24 July 2011, 4 commits
  12. 23 July 2011, 1 commit
  13. 22 July 2011, 4 commits
  14. 21 July 2011, 3 commits
  15. 19 July 2011, 1 commit
  16. 15 July 2011, 4 commits
  17. 14 July 2011, 2 commits
    • KVM guest: Add a pv_ops stub for steal time · 3c404b57
      Committed by Glauber Costa
      This patch adds a function pointer to one of the many paravirt_ops
      structs to allow guests to register a steal time function. Besides
      the steal time function, we also declare two jump_labels. They will be
      used to allow the steal time code to be easily bypassed when not
      in use.
      Signed-off-by: Glauber Costa <glommer@redhat.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Tested-by: Eric B Munson <emunson@mgebm.net>
      CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Anthony Liguori <aliguori@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
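      A sketch of the stub's shape; the names follow the commit description,
      and the jump_label type shown is the 3.x-era one, so treat the details as
      approximate:

        /* Sketch of the steal-time pv_ops hook; approximate. */
        struct pv_time_ops {
                unsigned long long (*sched_clock)(void);
                unsigned long long (*steal_clock)(int cpu);     /* new hook */
        };

        /* Two keys so the accounting paths can be patched out when unused. */
        struct jump_label_key paravirt_steal_enabled;
        struct jump_label_key paravirt_steal_rq_enabled;

        static inline u64 paravirt_steal_clock(int cpu)
        {
                /* The real accessor goes through the PVOP_* call macros. */
                return pv_time_ops.steal_clock(cpu);
        }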
    • KVM: Steal time implementation · c9aaa895
      Committed by Glauber Costa
      To implement steal time, we need the hypervisor to pass the guest
      information about how much time was spent running other processes
      outside the VM, while the vcpu had meaningful work to do - halt
      time does not count.
      
      This information is acquired through the run_delay field of the
      delayacct/schedstats infrastructure, which counts time spent in a
      runqueue but not running.
      
      Steal time is per-cpu information, so the traditional MSR-based
      infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the
      address of the memory area containing the steal time information.
      
      This patch contains the hypervisor part of the steal time infrastructure,
      and can be backported independently of the guest portion.
      
      [avi, yongjie: export delayacct_on, to avoid build failures in some configs]
      Signed-off-by: Glauber Costa <glommer@redhat.com>
      Tested-by: Eric B Munson <emunson@mgebm.net>
      CC: Rik van Riel <riel@redhat.com>
      CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Anthony Liguori <aliguori@us.ibm.com>
      Signed-off-by: Yongjie Ren <yongjie.ren@intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
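      The shared area has a fixed, versioned layout so the guest can read it
      tear-free; a sketch that should match the ABI described by the series,
      though the merged headers are authoritative:

        /* Sketch of the guest/host shared steal-time record; verify against
         * the merged headers before relying on the exact layout. */
        struct kvm_steal_time {
                __u64 steal;    /* accumulated nanoseconds stolen (run_delay) */
                __u32 version;  /* odd while the host is updating the record */
                __u32 flags;
                __u32 pad[12];  /* pads the record to 64 bytes */
        };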