提交 · 9a45f036af363aec1efec08827c825d69c115a9a · openeuler / raspberrypi-kernel

10 5月, 2016 1 次提交

x86/topology: Set x86_max_cores to 1 for CONFIG_SMP=n · 8d415ee2

由 Thomas Gleixner 提交于 5月 10, 2016

Josef reported that the uncore driver trips over with CONFIG_SMP=n because
x86_max_cores is 16 instead of 12.

The reason is, that for SMP=n the extended topology detection is a NOOP and
the cache leaf is used to determine the number of cores. That's wrong in two
aspects:

1) The cache leaf enumerates the maximum addressable number of cores in the
   package, which is obviously not correct

2) UP has no business with topology bits at all.

Make intel_num_cpu_cores() return 1 for CONFIG_SMP=n
Reported-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team <Kernel-team@fb.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/761b4a2a-0332-7954-f030-c6639f949612@fb.com

8d415ee2

22 4月, 2016 1 次提交

x86/cpu/intel: Remove not needed paravirt_enabled() use for F00F work around · fa392794

由 Luis R. Rodriguez 提交于 4月 13, 2016

The X86_BUG_F00F work around is responsible for fixing up the error
generated on attempted F00F exploitation from an OOPS to a SIGILL.

There is no reason why this code should not be allowed to run on
PV guest on a F00F-affected CPU -- it would simply never trigger.
The pv_enabled() check was there only to avoid printing the f00f
workaround, so removing the check is purely a cosmetic change.
Suggested-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andrew.cooper3@citrix.com
Cc: andriy.shevchenko@linux.intel.com
Cc: bigeasy@linutronix.de
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: ffainelli@freebox.fr
Cc: george.dunlap@citrix.com
Cc: glin@suse.com
Cc: jgross@suse.com
Cc: jlee@suse.com
Cc: josh@joshtriplett.org
Cc: julien.grall@linaro.org
Cc: konrad.wilk@oracle.com
Cc: kozerkov@parallels.com
Cc: lenb@kernel.org
Cc: lguest@lists.ozlabs.org
Cc: linux-acpi@vger.kernel.org
Cc: lv.zheng@intel.com
Cc: matt@codeblueprint.co.uk
Cc: mbizon@freebox.fr
Cc: rjw@rjwysocki.net
Cc: robert.moore@intel.com
Cc: rusty@rustcorp.com.au
Cc: tiwai@suse.de
Cc: toshi.kani@hp.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1460592286-300-11-git-send-email-mcgrof@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

fa392794

13 4月, 2016 1 次提交

x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usage · 93984fbd

由 Borislav Petkov 提交于 4月 04, 2016

Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: iommu@lists.linux-foundation.org
Cc: linux-pm@vger.kernel.org
Cc: oprofile-list@lists.sf.net
Link: http://lkml.kernel.org/r/1459801503-15600-8-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

93984fbd

31 3月, 2016 3 次提交

x86/cpufeature: Remove cpu_has_pge · c109bf95

由 Borislav Petkov 提交于 3月 29, 2016

Use static_cpu_has() in __flush_tlb_all() due to the time-sensitivity of
this one.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459266123-21878-10-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

c109bf95

x86/cpufeature: Remove cpu_has_xmm2 · 054efb64

由 Borislav Petkov 提交于 3月 29, 2016

Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Link: http://lkml.kernel.org/r/1459266123-21878-8-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

054efb64

x86/cpufeature: Remove cpu_has_clflush · 906bf7fd

由 Borislav Petkov 提交于 3月 29, 2016

Use the fast variant in the DRM code.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dri-devel@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Link: http://lkml.kernel.org/r/1459266123-21878-7-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

906bf7fd

29 2月, 2016 1 次提交

x86/topology: Create logical package id · 1f12e32f

由 Thomas Gleixner 提交于 2月 22, 2016

For per package oriented services we must be able to rely on the number of CPU
packages to be within bounds. Create a tracking facility, which

- calculates the number of possible packages depending on nr_cpu_ids after boot

- makes sure that the package id is within the number of possible packages. If
  the apic id is outside we map it to a logical package id if there is enough
  space available.

Provide interfaces for drivers to query the mapping and do translations from
physcial to logical ids.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.541071755@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

1f12e32f

03 2月, 2016 1 次提交

x86/cpu: Convert printk(KERN_<LEVEL> ...) to pr_<level>(...) · 1b74dde7

由 Chen Yucong 提交于 2月 02, 2016

 - Use the more current logging style pr_<level>(...) instead of the old
   printk(KERN_<LEVEL> ...).

 - Convert pr_warning() to pr_warn().
Signed-off-by: NChen Yucong <slaoub@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454384702-21707-1-git-send-email-slaoub@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

1b74dde7

30 1月, 2016 1 次提交

x86/cpufeature: Carve out X86_FEATURE_* · cd4d09ec

由 Borislav Petkov 提交于 1月 26, 2016

Move them to a separate header and have the following
dependency:

  x86/cpufeatures.h <- x86/processor.h <- x86/cpufeature.h

This makes it easier to use the header in asm code and not
include the whole cpufeature.h and add guards for asm.
Suggested-by: NH. Peter Anvin <hpa@zytor.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1453842730-28463-5-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

cd4d09ec

19 12月, 2015 1 次提交

x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros · 362f924b

由 Borislav Petkov 提交于 12月 07, 2015

Those are stupid and code should use static_cpu_has_safe() or
boot_cpu_has() instead. Kill the least used and unused ones.

The remaining ones need more careful inspection before a conversion can
happen. On the TODO.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de
Cc: David Sterba <dsterba@suse.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

362f924b

07 11月, 2015 1 次提交

x86/cpu/intel: Enable X86_FEATURE_NONSTOP_TSC_S3 for Merrifield · 354dbaa7

由 Andy Shevchenko 提交于 10月 08, 2015

The Intel Merrifield SoC is a successor of the Intel MID line of
SoCs. Let's set the neccessary capability for that chip. See commit
c54fdbb2 (x86: Add cpu capability flag X86_FEATURE_NONSTOP_TSC_S3)
for the details.
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: http://lkml.kernel.org/r/1444319786-36125-1-git-send-email-andriy.shevchenko@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

354dbaa7

21 7月, 2015 1 次提交

x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume · b51ef52d

由 Laura Abbott 提交于 7月 20, 2015

MSR_IA32_ENERGY_PERF_BIAS is lost after suspend/resume:

	x86_energy_perf_policy -r before

	cpu0: 0x0000000000000006
	cpu1: 0x0000000000000006
	cpu2: 0x0000000000000006
	cpu3: 0x0000000000000006
	cpu4: 0x0000000000000006
	cpu5: 0x0000000000000006
	cpu6: 0x0000000000000006
	cpu7: 0x0000000000000006

	after

	cpu0: 0x0000000000000000
	cpu1: 0x0000000000000006
	cpu2: 0x0000000000000006
	cpu3: 0x0000000000000006
	cpu4: 0x0000000000000006
	cpu5: 0x0000000000000006
	cpu6: 0x0000000000000006
	cpu7: 0x0000000000000006

Resulting in inconsistent energy policy settings across CPUs.

This register is set via init_intel() at bootup. During resume,
the secondary CPUs are brought online again and init_intel() is
callled which re-initializes the register. The boot CPU however
never reinitializes the register.

Add a syscore callback to reinitialize the register for the boot CPU.
Signed-off-by: NLaura Abbott <labbott@fedoraproject.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1437428878-4105-1-git-send-email-labbott@fedoraproject.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

b51ef52d

22 2月, 2015 1 次提交

x86/cpu/intel: Fix trivial typo in intel_tlb_table[] · a927792c

由 Yannick Guerrini 提交于 2月 21, 2015

Change 'ssociative' to 'associative'
Signed-off-by: NYannick Guerrini <yguerrini@tomshardware.fr>
Cc: Borislav Petkov <bp@suse.de>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Chris Bainbridge <chris.bainbridge@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Steven Honeyman <stevenhoneyman@gmail.com>
Cc: trivial@kernel.org
Link: http://lkml.kernel.org/r/1424558510-1420-1-git-send-email-yguerrini@tomshardware.frSigned-off-by: NIngo Molnar <mingo@kernel.org>

a927792c

11 1月, 2015 1 次提交

x86, CPU: Fix trivial printk formatting issues with dmesg · f94fe119

由 Steven Honeyman 提交于 11月 05, 2014

dmesg (from util-linux) currently has two methods for reading the kernel
message ring buffer: /dev/kmsg and syslog(2). Since kernel 3.5.0 kmsg
has been the default, which escapes control characters (e.g. new lines)
before they are shown.

This change means that when dmesg is using /dev/kmsg, a 2 line printk
makes the output messy, because the second line does not get a
timestamp.

For example:

[    0.012863] CPU0: Thermal monitoring enabled (TM1)
[    0.012869] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
[    0.012958] Freeing SMP alternatives memory: 28K (ffffffff81d86000 - ffffffff81d8d000)
[    0.014961] dmar: Host address width 39

Because printk.c intentionally escapes control characters, they should
not be there in the first place. This patch fixes two occurrences of
this.
Signed-off-by: NSteven Honeyman <stevenhoneyman@gmail.com>
Link: https://lkml.kernel.org/r/1414856696-8094-1-git-send-email-stevenhoneyman@gmail.com
[ Boris: make cpu_detect_tlb() static, while at it. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>

f94fe119

29 10月, 2014 1 次提交

x86: Don't enable F00F workaround on Intel Quark processors · d4e1a0af

由 Dave Jones 提交于 10月 28, 2014

The Intel Quark processor is a part of family 5, but does not have the
F00F bug present in Pentiums of the same family.

Pentiums were models 0 through 8, Quark is model 9.
Signed-off-by: NDave Jones <davej@redhat.com>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Link: http://lkml.kernel.org/r/20141028175753.GA12743@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

d4e1a0af

08 10月, 2014 1 次提交

x86: Add cpu_detect_cache_sizes to init_intel() add Quark legacy_cache() · aece118e

由 Bryan O'Donoghue 提交于 10月 07, 2014

Intel processors which don't report cache information via cpuid(2)
or cpuid(4) need quirk code in the legacy_cache_size callback to
report this data. For Intel that callback is is intel_size_cache().

This patch enables calling of cpu_detect_cache_sizes() inside of
init_intel() and hence the calling of the legacy_cache callback in
intel_size_cache(). Adding this call will ensure that PIII Tualatin
currently in intel_size_cache() and Quark SoC X1000 being added to
intel_size_cache() in this patch will report their respective cache
sizes.

This model of calling cpu_detect_cache_sizes() is consistent with
AMD/Via/Cirix/Transmeta and Centaur.

Also added is a string to idenitfy the Quark as Quark SoC X1000
giving better and more descriptive output via /proc/cpuinfo

Adding cpu_detect_cache_sizes to init_intel() will enable calling
of intel_size_cache() on Intel processors which currently no code
can reach. Therefore this patch will also re-enable reporting
of PIII Tualatin cache size information as well as add
Quark SoC X1000 support.

Comment text and cache flow logic suggested by Thomas Gleixner
Signed-off-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: davej@redhat.com
Cc: hmh@hmh.eng.br
Link: http://lkml.kernel.org/r/1412641189-12415-3-git-send-email-pure.logic@nexus-software.ieSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

aece118e

24 9月, 2014 1 次提交

x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead · ee1b5b16

由 Bryan O'Donoghue 提交于 9月 24, 2014

Quark x1000 advertises PGE via the standard CPUID method
PGE bits exist in Quark X1000's PTEs. In order to flush
an individual PTE it is necessary to reload CR3 irrespective
of the PTE.PGE bit.

See Quark Core_DevMan_001.pdf section 6.4.11

This bug was fixed in Galileo kernels, unfixed vanilla kernels are expected to
crash and burn on this platform.
Signed-off-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Borislav Petkov <bp@alien8.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1411514784-14885-1-git-send-email-pure.logic@nexus-software.ieSigned-off-by: NIngo Molnar <mingo@kernel.org>

ee1b5b16

31 7月, 2014 1 次提交

x86/mm: Rip out complicated, out-of-date, buggy TLB flushing · e9f4e0a9

由 Dave Hansen 提交于 7月 31, 2014

I think the flush_tlb_mm_range() code that tries to tune the
flush sizes based on the CPU needs to get ripped out for
several reasons:

1. It is obviously buggy. It uses mm->total_vm to judge the
task's footprint in the TLB. It should certainly be using
some measure of RSS, *NOT* ->total_vm since only resident
memory can populate the TLB.
2. Haswell, and several other CPUs are missing from the
intel_tlb_flushall_shift_set() function. Thus, it has been
demonstrated to bitrot quickly in practice.
3. It is plain wrong in my vm:
[ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] tlb_flushall_shift: 6
Which leads to it to never use invlpg.
4. The assumptions about TLB refill costs are wrong:
http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com
(more on this in later patches)
5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59
I believe the sample times were too short. Running the
benchmark in a loop yields times that vary quite a bit.

Note that this leaves us with a static ceiling of 1 page. This
is a conservative, dumb setting, and will be revised in a later
patch.

This also removes the code which attempts to predict whether we
are flushing data or instructions. We expect instruction flushes
to be relatively rare and not worth tuning for explicitly.
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.comAcked-by: NRik van Riel <riel@redhat.com>
Acked-by: NMel Gorman <mgorman@suse.de>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

e9f4e0a9

23 7月, 2014 1 次提交

x86, cpu: Fix cache topology for early P4-SMT · 2a226155

由 Peter Zijlstra 提交于 7月 22, 2014

P4 systems with cpuid level < 4 can have SMT, but the cache topology
description available (cpuid2) does not include SMP information.

Now we know that SMT shares all cache levels, and therefore we can
mark all available cache levels as shared.

We do this by setting cpu_llc_id to ->phys_proc_id, since that's
the same for each SMT thread. We can do this unconditional since if
there's no SMT its still true, the one CPU shares cache with only
itself.

This fixes a problem where such CPUs report an incorrect LLC CPU mask.

This in turn fixes a crash in the scheduler where the topology was
build wrong, it assumes the LLC mask to include at least the SMT CPUs.

Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: NBruno Wolff III <bruno@wolff.to>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140722133514.GM12054@laptop.lanSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

2a226155

19 6月, 2014 1 次提交

x86, cpufeature: Convert more "features" to bugs · 9b13a93d

由 Borislav Petkov 提交于 6月 18, 2014

X86_FEATURE_FXSAVE_LEAK, X86_FEATURE_11AP and
X86_FEATURE_CLFLUSH_MONITOR are not really features but synthetic bits
we use for applying different bug workarounds. Call them what they
really are, and make sure they get the proper cross-CPU behavior (OR
rather than AND).
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1403042783-23278-1-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

9b13a93d

21 3月, 2014 1 次提交

x86, cpu: Add forcepae parameter for booting PAE kernels on PAE-disabled Pentium M · 69f2366c

由 Chris Bainbridge 提交于 3月 07, 2014

Many Pentium M systems disable PAE but may have a functionally usable PAE
implementation. This adds the "forcepae" parameter which bypasses the boot
check for PAE, and sets the CPU as being PAE capable. Using this parameter
will taint the kernel with TAINT_CPU_OUT_OF_SPEC.
Signed-off-by: NChris Bainbridge <chris.bainbridge@gmail.com>
Link: http://lkml.kernel.org/r/20140307114040.GA4997@localhostAcked-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

69f2366c

14 3月, 2014 2 次提交

x86, intel: Make MSR_IA32_MISC_ENABLE bit constants systematic · 0b131be8

由 H. Peter Anvin 提交于 3月 13, 2014

Replace somewhat arbitrary constants for bits in MSR_IA32_MISC_ENABLE
with verbose but systematic ones.  Add _BIT defines for all the rest
of them, too.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

0b131be8

x86, Intel: Convert to the new bit access MSR accessors · c0a639ad

由 Borislav Petkov 提交于 3月 09, 2014

... and save some lines of code.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394384725-10796-4-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

c0a639ad

28 2月, 2014 1 次提交

x86, platforms: Remove NUMAQ · b5660ba7

由 H. Peter Anvin 提交于 2月 25, 2014

The NUMAQ support seems to be unmaintained, remove it.

Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: David Rientjes <rientjes@google.com>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/n/530CFD6C.7040705@zytor.com

b5660ba7

25 1月, 2014 2 次提交

mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge · b9a3b4c9

由 Mel Gorman 提交于 1月 21, 2014

There was a large ebizzy performance regression that was
bisected to commit 611ae8e3 (x86/tlb: enable tlb flush range
support for x86).  The problem was related to the
tlb_flushall_shift tuning for IvyBridge which was altered.  The
problem is that it is not clear if the tuning values for each
CPU family is correct as the methodology used to tune the values
is unclear.

This patch uses a conservative tlb_flushall_shift value for all
CPU families except IvyBridge so the decision can be revisited
if any regression is found as a result of this change.
IvyBridge is an exception as testing with one methodology
determined that the value of 2 is acceptable.  Details are in
the changelog for the patch "x86: mm: Change tlb_flushall_shift
for IvyBridge".

One important aspect of this to watch out for is Xen.  The
original commit log mentioned large performance gains on Xen.
It's possible Xen is more sensitive to this value if it flushes
small ranges of pages more frequently than workloads on bare
metal typically do.
Signed-off-by: NMel Gorman <mgorman@suse.de>
Tested-by: NDavidlohr Bueso <davidlohr@hp.com>
Reviewed-by: NRik van Riel <riel@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-dyzMww3fqugnhbhgo6Gxmtkw@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

b9a3b4c9

x86: mm: change tlb_flushall_shift for IvyBridge · f98b7a77

由 Mel Gorman 提交于 1月 21, 2014

There was a large performance regression that was bisected to
commit 611ae8e3 ("x86/tlb: enable tlb flush range support for
x86").  This patch simply changes the default balance point
between a local and global flush for IvyBridge.

In the interest of allowing the tests to be reproduced, this
patch was tested using mmtests 0.15 with the following
configurations

	configs/config-global-dhp__tlbflush-performance
	configs/config-global-dhp__scheduler-performance
	configs/config-global-dhp__network-performance

Results are from two machines

Ivybridge   4 threads:  Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz
Ivybridge   8 threads:  Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

Page fault microbenchmark showed nothing interesting.

Ebizzy was configured to run multiple iterations and threads.
Thread counts ranged from 1 to NR_CPUS*2. For each thread count,
it ran 100 iterations and each iteration lasted 10 seconds.

Ivybridge 4 threads
                    3.13.0-rc7            3.13.0-rc7
                       vanilla           altshift-v3
Mean   1     6395.44 (  0.00%)     6789.09 (  6.16%)
Mean   2     7012.85 (  0.00%)     8052.16 ( 14.82%)
Mean   3     6403.04 (  0.00%)     6973.74 (  8.91%)
Mean   4     6135.32 (  0.00%)     6582.33 (  7.29%)
Mean   5     6095.69 (  0.00%)     6526.68 (  7.07%)
Mean   6     6114.33 (  0.00%)     6416.64 (  4.94%)
Mean   7     6085.10 (  0.00%)     6448.51 (  5.97%)
Mean   8     6120.62 (  0.00%)     6462.97 (  5.59%)

Ivybridge 8 threads
                     3.13.0-rc7            3.13.0-rc7
                        vanilla           altshift-v3
Mean   1      7336.65 (  0.00%)     7787.02 (  6.14%)
Mean   2      8218.41 (  0.00%)     9484.13 ( 15.40%)
Mean   3      7973.62 (  0.00%)     8922.01 ( 11.89%)
Mean   4      7798.33 (  0.00%)     8567.03 (  9.86%)
Mean   5      7158.72 (  0.00%)     8214.23 ( 14.74%)
Mean   6      6852.27 (  0.00%)     7952.45 ( 16.06%)
Mean   7      6774.65 (  0.00%)     7536.35 ( 11.24%)
Mean   8      6510.50 (  0.00%)     6894.05 (  5.89%)
Mean   12     6182.90 (  0.00%)     6661.29 (  7.74%)
Mean   16     6100.09 (  0.00%)     6608.69 (  8.34%)

Ebizzy hits the worst case scenario for TLB range flushing every
time and it shows for these Ivybridge CPUs at least that the
default choice is a poor on.  The patch addresses the problem.

Next was a tlbflush microbenchmark written by Alex Shi at
http://marc.info/?l=linux-kernel&m=133727348217113 .  It
measures access costs while the TLB is being flushed.  The
expectation is that if there are always full TLB flushes that
the benchmark would suffer and it benefits from range flushing

There are 320 iterations of the test per thread count.  The
number of entries is randomly selected with a min of 1 and max
of 512.  To ensure a reasonably even spread of entries, the full
range is broken up into 8 sections and a random number selected
within that section.

iteration 1, random number between 0-64
iteration 2, random number between 64-128 etc

This is still a very weak methodology.  When you do not know
what are typical ranges, random is a reasonable choice but it
can be easily argued that the opimisation was for smaller ranges
and an even spread is not representative of any workload that
matters.  To improve this, we'd need to know the probability
distribution of TLB flush range sizes for a set of workloads
that are considered "common", build a synthetic trace and feed
that into this benchmark.  Even that is not perfect because it
would not account for the time between flushes but there are
limits of what can be reasonably done and still be doing
something useful.  If a representative synthetic trace is
provided then this benchmark could be revisited and the shift values retuned.

Ivybridge 4 threads
                        3.13.0-rc7            3.13.0-rc7
                           vanilla           altshift-v3
Mean       1       10.50 (  0.00%)       10.50 (  0.03%)
Mean       2       17.59 (  0.00%)       17.18 (  2.34%)
Mean       3       22.98 (  0.00%)       21.74 (  5.41%)
Mean       5       47.13 (  0.00%)       46.23 (  1.92%)
Mean       8       43.30 (  0.00%)       42.56 (  1.72%)

Ivybridge 8 threads
                         3.13.0-rc7            3.13.0-rc7
                            vanilla           altshift-v3
Mean       1         9.45 (  0.00%)        9.36 (  0.93%)
Mean       2         9.37 (  0.00%)        9.70 ( -3.54%)
Mean       3         9.36 (  0.00%)        9.29 (  0.70%)
Mean       5        14.49 (  0.00%)       15.04 ( -3.75%)
Mean       8        41.08 (  0.00%)       38.73 (  5.71%)
Mean       13       32.04 (  0.00%)       31.24 (  2.49%)
Mean       16       40.05 (  0.00%)       39.04 (  2.51%)

For both CPUs, average access time is reduced which is good as
this is the benchmark that was used to tune the shift values in
the first place albeit it is now known *how* the benchmark was
used.

The scheduler benchmarks were somewhat inconclusive.  They
showed gains and losses and makes me reconsider how stable those
benchmarks really are or if something else might be interfering
with the test results recently.

Network benchmarks were inconclusive.  Almost all results were
flat except for netperf-udp tests on the 4 thread machine.
These results were unstable and showed large variations between
reboots.  It is unknown if this is a recent problems but I've
noticed before that netperf-udp results tend to vary.

Based on these results, changing the default for Ivybridge seems
like a logical choice.
Signed-off-by: NMel Gorman <mgorman@suse.de>
Tested-by: NDavidlohr Bueso <davidlohr@hp.com>
Reviewed-by: NAlex Shi <alex.shi@linaro.org>
Reviewed-by: NRik van Riel <riel@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-cqnadffh1tiqrshthRj3Esge@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

f98b7a77

13 1月, 2014 1 次提交

sched/clock, x86: Use a static_key for sched_clock_stable · 35af99e6

由 Peter Zijlstra 提交于 11月 28, 2013

In order to avoid the runtime condition and variable load turn
sched_clock_stable into a static_key.

Also provide a shorter implementation of local_clock() and
cpu_clock(int) when sched_clock_stable==1.

                        MAINLINE   PRE       POST

    sched_clock_stable: 1          1         1
    (cold) sched_clock: 329841     221876    215295
    (cold) local_clock: 301773     234692    220773
    (warm) sched_clock: 38375      25602     25659
    (warm) local_clock: 100371     33265     27242
    (warm) rdtsc:       27340      24214     24208
    sched_clock_stable: 0          0         0
    (cold) sched_clock: 382634     235941    237019
    (cold) local_clock: 396890     297017    294819
    (warm) sched_clock: 38194      25233     25609
    (warm) local_clock: 143452     71234     71232
    (warm) rdtsc:       27345      24245     24243
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

35af99e6

07 1月, 2014 1 次提交

x86: Delete non-required instances of include <linux/init.h> · 663b55b9

由 Paul Gortmaker 提交于 1月 06, 2014

None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

[ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ]
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

663b55b9

04 1月, 2014 1 次提交

x86, cpu: Detect more TLB configuration · dd360393

由 Kirill A. Shutemov 提交于 12月 23, 2013

The Intel Software Developer’s Manual covers few more TLB
configurations exposed as CPUID 2 descriptors:

61H Instruction TLB: 4 KByte pages, fully associative, 48 entries
63H Data TLB: 1 GByte pages, 4-way set associative, 4 entries
76H Instruction TLB: 2M/4M pages, fully associative, 8 entries
B5H Instruction TLB: 4KByte pages, 8-way set associative, 64 entries
B6H Instruction TLB: 4KByte pages, 8-way set associative, 128 entries
C1H Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries
C2H DTLB DTLB: 2 MByte/$MByte pages, 4-way associative, 16 entries

Let's detect them as well.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/1387801018-14499-1-git-send-email-kirill.shutemov@linux.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

dd360393

20 12月, 2013 1 次提交

x86 idle: Repair large-server 50-watt idle-power regression · 40e2d7f9

由 Len Brown 提交于 12月 18, 2013

Linux 3.10 changed the timing of how thread_info->flags is touched:

	x86: Use generic idle loop
	(7d1a9417)

This caused Intel NHM-EX and WSM-EX servers to experience a large number
of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote
from deep C-states to shallow C-states, which caused these platforms
to experience a significant increase in idle power.

Note that this issue was already present before the commit above,
however, it wasn't seen often enough to be noticed in power measurements.

Here we extend an errata workaround from the Core2 EX "Dunnington"
to extend to NHM-EX and WSM-EX, to prevent these immediate
returns from MWAIT, reducing idle power on these platforms.

While only acpi_idle ran on Dunnington, intel_idle
may also run on these two newer systems.
As of today, there are no other models that are known
to need this tweak.

Link: http://lkml.kernel.org/r/CAJvTdK=%2BaNN66mYpCGgbHGCHhYQAKx-vB0kJSWjVpsNb_hOAtQ@mail.gmail.comSigned-off-by: NLen Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com
Cc: <stable@vger.kernel.org> # 3.12.x, 3.11.x, 3.10.x
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

40e2d7f9

26 10月, 2013 1 次提交

x86/cpu: Track legacy CPU model data only on 32-bit kernels · 09dc68d9

由 Jan Beulich 提交于 10月 21, 2013

struct cpu_dev's c_models is only ever set inside CONFIG_X86_32
conditionals (or code that's being built for 32-bit only), so
there's no use of reserving the (empty) space for the model
names in a 64-bit kernel.

Similarly, c_size_cache is only used in the #else of a
CONFIG_X86_64 conditional, so reserving space for (and in one
case even initializing) that field is pointless for 64-bit
kernels too.

While moving both fields to the end of the structure, I also
noticed that:

 - the c_models array size was one too small, potentially causing
   table_lookup_model() to return garbage on Intel CPUs (intel.c's
   instance was lacking the sentinel with family being zero), so the
   patch bumps that by one,

 - c_models' vendor sub-field was unused (and anyway redundant
   with the base structure's c_x86_vendor field), so the patch deletes it.

Also rename the legacy fields so that their legacy nature stands out
and comment their declarations.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/5265036802000078000FC4DB@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

09dc68d9

15 7月, 2013 1 次提交

x86: delete __cpuinit usage from all x86 files · 148f9bb8

由 Paul Gortmaker 提交于 6月 18, 2013

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit  -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/x86 uses of the __cpuinit macros from
all C files.  x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>

148f9bb8

12 4月, 2013 1 次提交

x86: Use a read-only IDT alias on all CPUs · 4eefbe79

由 Kees Cook 提交于 4月 10, 2013

Make a copy of the IDT (as seen via the "sidt" instruction) read-only.
This primarily removes the IDT from being a target for arbitrary memory
write attacks, and has the added benefit of also not leaking the kernel
base offset, if it has been relocated.

We already did this on vendor == Intel and family == 5 because of the
F0 0F bug -- regardless of if a particular CPU had the F0 0F bug or
not.  Since the workaround was so cheap, there simply was no reason to
be very specific.  This patch extends the readonly alias to all CPUs,
but does not activate the #PF to #UD conversion code needed to deliver
the proper exception in the F0 0F case except on Intel family 5
processors.
Signed-off-by: NKees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130410192422.GA17344@www.outflux.net
Cc: Eric Northup <digitaleric@google.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

4eefbe79

03 4月, 2013 1 次提交

x86, cpu: Convert F00F bug detection · e2604b49

由 Borislav Petkov 提交于 3月 20, 2013

... to using the new facility and drop the cpuinfo_x86 member.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1363788448-31325-3-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

e2604b49

16 3月, 2013 1 次提交

x86: Add cpu capability flag X86_FEATURE_NONSTOP_TSC_S3 · c54fdbb2

由 Feng Tang 提交于 3月 12, 2013

On some new Intel Atom processors (Penwell and Cloverview), there is
a feature that the TSC won't stop in S3 state, say the TSC value
won't be reset to 0 after resume. This feature makes TSC a more reliable
clocksource and could benefit the timekeeping code during system
suspend/resume cycle, so add a flag for it.
Signed-off-by: NFeng Tang <feng.tang@intel.com>
[jstultz: Fix checkpatch warning]
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

c54fdbb2

30 11月, 2012 1 次提交

x86, 386 removal: Remove CONFIG_INVLPG · 094ab1db

由 H. Peter Anvin 提交于 11月 28, 2012

All 486+ CPUs support INVLPG, so remove the fallback 386 support
code.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-6-git-send-email-hpa@linux.intel.com

094ab1db

18 11月, 2012 1 次提交

x86, mm: kill numa_64.h · c074eaac

由 Yinghai Lu 提交于 11月 16, 2012

Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1353123563-3103-44-git-send-email-yinghai@kernel.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

c074eaac

17 11月, 2012 1 次提交

x86: Use __pa_symbol instead of __pa on C visible symbols · fc8d7826

由 Alexander Duyck 提交于 11月 16, 2012

When I made an attempt at separating __pa_symbol and __pa I found that there
were a number of cases where __pa was used on an obvious symbol.

I also caught one non-obvious case as _brk_start and _brk_end are based on the
address of __brk_base which is a C visible symbol.

In mark_rodata_ro I was able to reduce the overhead of kernel symbol to
virtual memory translation by using a combination of __va(__pa_symbol())
instead of page_address(virt_to_page()).
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Link: http://lkml.kernel.org/r/20121116215640.8521.80483.stgit@ahduyck-cp1.jf.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

fc8d7826

07 8月, 2012 1 次提交

x86, cpu: Push TLB detection CPUID check down · 5b556332

由 Borislav Petkov 提交于 8月 06, 2012

Push the max CPUID leaf check into the ->detect_tlb function and remove
general test case from the generic path.
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/1344272439-29080-3-git-send-email-bp@amd64.orgAcked-by: NAlex Shi <alex.shi@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

5b556332

28 6月, 2012 1 次提交

x86/tlb: add tlb_flushall_shift for specific CPU · c4211f42

由 Alex Shi 提交于 6月 28, 2012

Testing show different CPU type(micro architectures and NUMA mode) has
different balance points between the TLB flush all and multiple invlpg.
And there also has cases the tlb flush change has no any help.

This patch give a interface to let x86 vendor developers have a chance
to set different shift for different CPU type.

like some machine in my hands, balance points is 16 entries on
Romely-EP; while it is at 8 entries on Bloomfield NHM-EP; and is 256 on
IVB mobile CPU. but on model 15 core2 Xeon using invlpg has nothing
help.

For untested machine, do a conservative optimization, same as NHM CPU.
Signed-off-by: NAlex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-5-git-send-email-alex.shi@intel.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

c4211f42