Commits · 62a3207b8cf3de35368cdc3822b30b82d59eea95 · openeuler / raspberrypi-kernel

18 Aug, 2009 1 commit

x86, intel_txt: Handle ACPI_SLEEP without X86_TRAMPOLINE · 62a3207b

Authored by H. Peter Anvin on Aug 17, 2009

On 32 bits, we can have CONFIG_ACPI_SLEEP set without implying
CONFIG_X86_TRAMPOLINE.  In that case, we simply do not need to mark
the trampoline as a MAC region.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Shane Wang <shane.wang@intel.com>
Cc: Joseph Cihula <joseph.cihula@intel.com>

62a3207b

15 Aug, 2009 1 commit

x86, intel_txt: Factor out the code for S3 setup · 58c41d28

Authored by H. Peter Anvin on Aug 14, 2009

S3 sleep requires special setup in tboot.  However, the data
structures needed to do such setup are only available if
CONFIG_ACPI_SLEEP is enabled.  Abstract them out as much as possible,
so we can have a single tboot_setup_sleep() which either is a proper
implementation or a stub which simply calls BUG().
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Shane Wang <shane.wang@intel.com>
Cc: Joseph Cihula <joseph.cihula@intel.com>

58c41d28
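
The "proper implementation or a stub which simply calls BUG()" pattern described in 58c41d28 can be sketched in plain C as follows. This is a minimal userspace illustration, not the kernel code: `DEMO_ACPI_SLEEP`, `DEMO_BUG()`, `struct demo_sleep_info` and `demo_setup_sleep()` are hypothetical stand-ins for `CONFIG_ACPI_SLEEP`, `BUG()` and `tboot_setup_sleep()`.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for CONFIG_ACPI_SLEEP; remove this line to build the stub. */
#define DEMO_ACPI_SLEEP 1

/* Hypothetical stand-in for the kernel's BUG(): report the location and abort. */
#define DEMO_BUG() \
	do { fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); abort(); } while (0)

/* Hypothetical sleep-setup data; the real structures are not shown in the log. */
struct demo_sleep_info {
	unsigned long wakeup_vector;
	int valid;
};

#ifdef DEMO_ACPI_SLEEP
/* Proper implementation: the data needed for S3 setup exists in this configuration. */
static void demo_setup_sleep(struct demo_sleep_info *info, unsigned long vector)
{
	info->wakeup_vector = vector;
	info->valid = 1;
}
#else
/* Stub: callers compile unchanged, but reaching this path is a bug. */
static void demo_setup_sleep(struct demo_sleep_info *info, unsigned long vector)
{
	(void)info;
	(void)vector;
	DEMO_BUG();
}
#endif
```

Keeping a single function name for both variants means call sites need no `#ifdef` of their own.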

12 Aug, 2009 1 commit

x86, intel_txt: tboot.c needs <asm/fixmap.h> · 81e2d7b3

Authored by H. Peter Anvin on Aug 12, 2009

arch/x86/kernel/tboot.c needs <asm/fixmap.h>.  In most configurations
that ends up getting implicitly included, but not in all, causing
build failures in some configurations.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Joseph Cihula <joseph.cihula@intel.com>
Cc: Shane Wang <shane.wang@intel.com>

81e2d7b3

22 Jul, 2009 3 commits

x86, intel_txt: Intel TXT Sx shutdown support · 86886e55

Authored by Joseph Cihula on Jun 30, 2009

Support for graceful handling of sleep states (S3/S4/S5) after an Intel(R) TXT launch.

Without this patch, attempting to place the system in one of the ACPI sleep
states (S3/S4/S5) will cause the TXT hardware to treat this as an attack and
will cause a system reset, with memory locked. Not only may the subsequent
memory scrub take some time, but the platform will be unable to enter the
requested power state.

This patch calls back into the tboot so that it may properly and securely clean
up system state and clear the secrets-in-memory flag, after which it will place
the system into the requested sleep state using ACPI information passed by the kernel.

arch/x86/kernel/smpboot.c | 2 ++
drivers/acpi/acpica/hwsleep.c | 3 +++
kernel/cpu.c | 7 ++++++-
3 files changed, 11 insertions(+), 1 deletion(-)
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

86886e55

x86, intel_txt: Intel TXT reboot/halt shutdown support · 840c2baf

Authored by Joseph Cihula on Jun 30, 2009

Support for graceful handling of kernel reboots after an Intel(R) TXT launch.

Without this patch, attempting to reboot or halt the system will cause the
TXT hardware to lock memory upon system restart because the secrets-in-memory
flag that was set on launch was never cleared.  This will in turn cause BIOS
to execute a TXT Authenticated Code Module (ACM) that will scrub all of memory
and then unlock it.  Depending on the amount of memory in the system and its type,
this may take some time.

This patch creates a 1:1 address mapping to the tboot module and then calls back
into tboot so that it may properly and securely clean up system state and clear
the secrets-in-memory flag.  When it has completed these steps, the tboot module
will reboot or halt the system.

 arch/x86/kernel/reboot.c |    8 ++++++++
 init/main.c              |    3 +++
 2 files changed, 11 insertions(+)
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

840c2baf

x86, intel_txt: Intel TXT boot support · 31625340

Authored by Joseph Cihula on Jun 30, 2009

This patch adds kernel configuration and boot support for Intel Trusted
Execution Technology (Intel TXT).

Intel's technology for safer computing, Intel Trusted Execution
Technology (Intel TXT), defines platform-level enhancements that
provide the building blocks for creating trusted platforms.

Intel TXT was formerly known by the code name LaGrande Technology (LT).

Intel TXT in Brief:
o  Provides dynamic root of trust for measurement (DRTM)
o  Data protection in case of improper shutdown
o  Measurement and verification of launched environment

Intel TXT is part of the vPro(TM) brand and is also available on some
non-vPro systems.  It is currently available on desktop systems based on
the Q35, X38, Q45, and Q43 Express chipsets (e.g. Dell Optiplex 755, HP
dc7800, etc.) and mobile systems based on the GM45, PM45, and GS45
Express chipsets.

For more information, see http://www.intel.com/technology/security/.
This site also has a link to the Intel TXT MLE Developers Manual, which
has been updated for the new released platforms.

A much more complete description of how these patches support TXT, how to
configure a system for it, etc. is in the Documentation/intel_txt.txt file
in this patch.

This patch provides the TXT support routines for complete functionality,
documentation for TXT support and for the changes to the boot_params structure,
and boot detection of a TXT launch.  Attempts to shut down (reboot, Sx) the system
will result in platform resets; subsequent patches will support these shutdown modes
properly.

 Documentation/intel_txt.txt      |  210 +++++++++++++++++++++
 Documentation/x86/zero-page.txt  |    1
 arch/x86/include/asm/bootparam.h |    3
 arch/x86/include/asm/fixmap.h    |    3
 arch/x86/include/asm/tboot.h     |  197 ++++++++++++++++++++
 arch/x86/kernel/Makefile         |    1
 arch/x86/kernel/setup.c          |    4
 arch/x86/kernel/tboot.c          |  379 +++++++++++++++++++++++++++++++++++++++
 security/Kconfig                 |   30 +++
 9 files changed, 827 insertions(+), 1 deletion(-)
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

31625340

14 Jul, 2009 1 commit

x86: Fix warning in pvclock.c · 2ad76643

Authored by Dave Jones on Jul 13, 2009

When building 32-bit, I see this:
arch/x86/kernel/pvclock.c:63:7: warning: "__x86_64__" is not defined
Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <20090713201437.GA12165@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

2ad76643
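
A warning of the form `"X" is not defined` typically comes from evaluating a macro with `#if` when it has no definition (for example under GCC's `-Wundef`). The log does not show the actual diff, so the following is only a sketch of the usual remedy, testing for definition with `#ifdef`. `DEMO_ARCH_64` is a hypothetical macro playing the role of `__x86_64__`.

```c
/*
 * "#if DEMO_ARCH_64" evaluates the macro's value; with -Wundef the
 * preprocessor warns when the macro has no definition at all.
 * "#ifdef DEMO_ARCH_64" only asks whether it is defined, so a build
 * that never defines it takes the #else branch without a warning.
 */
#ifdef DEMO_ARCH_64
#define DEMO_WORD_BITS 64
#else
#define DEMO_WORD_BITS 32
#endif

/* Report which branch the preprocessor selected. */
static int demo_word_bits(void)
{
	return DEMO_WORD_BITS;
}
```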

13 Jul, 2009 2 commits

x86, apic: Fix false positive section mismatch in numaq_32.c · 7473727b

Authored by Rakib Mullick on Jul 12, 2009

The variable apic_numaq, placed in a noninit section, references the
function wakeup_secondary_cpu_via_nmi(), which is in the __cpuinit
section. This causes a section mismatch warning. To avoid the
mismatch, we mark apic_numaq as __refdata.

The warning was:

  WARNING: arch/x86/kernel/built-in.o(.data+0x932c): Section mismatch in
  reference from the variable apic_numaq to the function
  .cpuinit.text:wakeup_secondary_cpu_via_nmi()
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120407p6b4f67dtf4d563155488188a@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

7473727b

x86: Fix false positive section mismatch in es7000_32.c · 151586d0

Authored by Rakib Mullick on Jul 12, 2009

The variable apic_es7000_cluster references the __cpuinit function
wakeup_secondary_cpu_via_mip() from a noninit section, which triggers
a section mismatch warning. To avoid a possible collision between
init/noninit, it's best to mark the variable as __refdata.

The warning was:

  LD      arch/x86/kernel/apic/built-in.o
  WARNING: arch/x86/kernel/apic/built-in.o(.data+0x198c): Section
  mismatch in reference from the variable apic_es7000_cluster to the
  function .cpuinit.text:wakeup_secondary_cpu_via_mip()
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120404k6279a10ch5e9682432272706f@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

151586d0

11 Jul, 2009 1 commit

x86/pci: insert ioapic resource before assigning unassigned resources · 857fdc53

Authored by Yinghai Lu on Jul 10, 2009

Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?)

Dann bisected it down to:
  commit 30a18d6c
  Date:   Tue Feb 19 03:21:20 2008 -0800

      x86: multi pci root bus with different io resource range, on
      64-bit

It turns out that:
  1. AMD-based systems have two HT chains.
  2. The BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc.
  3. The multi-peer-root patch will try to split root resources to peer
     root resources according to the PCI conf of the NB.
  4. The PCI core assigns unassigned resources, but they overlap with BARs
     that are used by the ioapic addr of io4 and 8132.

The reason: at that point the ioapic addresses are not inserted yet.  The
solution is to insert ioapic resources into the tree a bit earlier.
Reported-by: Stephen Frost <sfrost@snowman.net>
Reported-and-Tested-by: dann frazier <dannf@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@jbarnes-g45.(none)>

857fdc53

09 Jul, 2009 1 commit

Remove multiple KERN_ prefixes from printk formats · ad361c98

Authored by Joe Perches on Jul 06, 2009

Commit 5fd29d6c ("printk: clean up
handling of log-levels and newlines") changed printk semantics.  printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.

<level> is now included in the output on each additional use.

Remove all uses of multiple KERN_<level>s in formats.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ad361c98
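
Why a second `KERN_<level>` shows up in the output can be illustrated with a simplified level parser that consumes exactly one leading prefix. This is a userspace sketch under stated assumptions, not the printk implementation: `demo_parse_level()` and the `DEMO_KERN_*` macros are hypothetical, with the `"<n>"` string form of the levels taken from the log output quoted elsewhere in this listing (e.g. `<7>printing local APIC contents`).

```c
/* Hypothetical copies of the level prefixes in their "<n>" string form. */
#define DEMO_KERN_ERR  "<3>"
#define DEMO_KERN_INFO "<6>"

/*
 * Consume at most one leading "<n>" prefix.
 * Returns the level (0-7), or -1 if the format has no prefix.
 * *msg is set to the text that would be emitted as the message.
 */
static int demo_parse_level(const char *fmt, const char **msg)
{
	if (fmt[0] == '<' && fmt[1] >= '0' && fmt[1] <= '7' && fmt[2] == '>') {
		*msg = fmt + 3;	/* skip exactly one prefix */
		return fmt[1] - '0';
	}
	*msg = fmt;
	return -1;
}
```

With a format such as `DEMO_KERN_ERR DEMO_KERN_INFO "x"`, the first prefix sets the level and the second one stays in the message text, which is why the commit removes the extra prefixes from the formats themselves.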

07 Jul, 2009 2 commits

[CPUFREQ] Powernow-k8: support family 0xf with 2 low p-states · a2e1b4c3

Authored by Mark Langsdorf on Jul 26, 2009

Provide support for family 0xf processors with 2 P-states
below the elevator voltage.  Remove the checks that prevent
this configuration from being supported and increase the
transition voltage to prevent errors during the transition.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>

a2e1b4c3

gcov: exclude code operating in userspace from profiling · f386c61f

Authored by Peter Oberparleiter on Jul 05, 2009

Fix for this issue on x86_64:

rostedt@goodmis.org wrote:
> On bootup of the latest kernel my init segfaults. Debugging it,
> I found  that vread_tsc (a vsyscall) increments some strange
> kernel memory:
>
> 0000000000000000 <vread_tsc>:
>    0:   55                      push   %rbp
>    1:   48 ff 05 00 00 00 00    incq   0(%rip)
>                         # 8 <vread_tsc+0x8>
>                         4: R_X86_64_PC32        .bss+0x3c
>    8:   48 89 e5                mov    %rsp,%rbp
>    b:   66 66 90                xchg   %ax,%ax
>    e:   48 ff 05 00 00 00 00    incq   0(%rip)
>                         # 15 <vread_tsc+0x15>
>                         11: R_X86_64_PC32       .bss+0x44
>   15:   66 66 90                xchg   %ax,%ax
>   18:   48 ff 05 00 00 00 00    incq   0(%rip)
>                         # 1f <vread_tsc+0x1f>
>                         1b: R_X86_64_PC32       .bss+0x4c
>   1f:   0f 31                   rdtsc
>
>
> Those "incq" is very bad to happen in vsyscall memory, since
> userspace can not modify it. You need to make something prevent
> profiling of vsyscall  memory (like I do with ftrace).
Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f386c61f

03 Jul, 2009 5 commits

x86: Remove unused function lapic_watchdog_ok() · c7210e1f

Authored by Jaswinder Singh Rajput on Jul 02, 2009

lapic_watchdog_ok() is a global function but no one is using it.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1246554335.2242.29.camel@jaswinder.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c7210e1f

x86: Remove unused variable disable_x2apic · 23d0cd8e

Authored by Jaswinder Singh Rajput on Jul 02, 2009

setup_nox2apic() is writing 1 to disable_x2apic but no one is reading it.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1246554239.2242.27.camel@jaswinder.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

23d0cd8e

x86, kvm: Fix section mismatches in kvm.c · d3ac8815

Authored by Rakib Mullick on Jul 02, 2009

The function paravirt_ops_setup() refers to the variable
no_timer_check, which is __initdata. This generates the
following warning. paravirt_ops_setup() is called from
kvm_guest_init(), which is an __init function. So to fix
this we mark paravirt_ops_setup() as __init.

The sections-check output that warned us about this was:

   LD      arch/x86/built-in.o
  WARNING: arch/x86/built-in.o(.text+0x166ce): Section mismatch in
  reference from the function paravirt_ops_setup() to the variable
  .init.data:no_timer_check
  The function paravirt_ops_setup() references
  the variable __initdata no_timer_check.
  This is often because paravirt_ops_setup lacks a __initdata
  annotation or the annotation of no_timer_check is wrong.
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Acked-by: Avi Kivity <avi@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <b9df5fa10907012240y356427b8ta4bd07f0efc6a049@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d3ac8815

x86: add boundary check for 32bit res before expand e820 resource to alignment · 7c5371c4

Authored by Yinghai Lu on Jul 01, 2009

Fix a hang with HIGHMEM_64G and 32-bit resources.  As suggested by hpa
and Linus, use (resource_size_t)-1 to fend off big ranges.

Analyzed by hpa
Reported-and-tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7c5371c4
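
The idea of using `(resource_size_t)-1` as an upper bound can be sketched as below. The log does not include the patch itself, so this is an illustration under assumptions: `demo_resource_size_t` is a hypothetical 32-bit stand-in for `resource_size_t`, and `demo_clamped_round_up()` is not a kernel function. It assumes `align` is a power of two and that `addr + align` does not overflow 64 bits.

```c
#include <stdint.h>

/* Hypothetical 32-bit resource type, as on a build without 64-bit resources. */
typedef uint32_t demo_resource_size_t;

/*
 * Round a 64-bit physical address up to `align`, then clamp the result
 * so that it still fits into the (possibly narrower) resource type.
 * Without the clamp, a value above 4 GiB would be silently truncated.
 */
static demo_resource_size_t demo_clamped_round_up(uint64_t addr, uint64_t align)
{
	uint64_t rounded = (addr + align - 1) & ~(align - 1);
	uint64_t max = (demo_resource_size_t)-1;	/* all bits set: 0xffffffff */

	return (demo_resource_size_t)(rounded > max ? max : rounded);
}
```

Casting `-1` to an unsigned type yields its maximum value, so the same expression works whether the resource type is 32 or 64 bits wide.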

amd-iommu: set evt_buf_size correctly · 1bc6f838

Authored by Joerg Roedel on Jul 02, 2009

The setting of this variable got lost during the suspend/resume
implementation.  But keeping this variable zero causes a divide-by-zero
error in the interrupt handler. This patch fixes this.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

1bc6f838
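
A zero buffer size leads to a divide-by-zero as soon as a ring-buffer offset is wrapped with a modulo. The sketch below shows that failure mode with a guard; it is a userspace illustration only. `demo_advance_head()` and the 16-byte `DEMO_EVENT_ENTRY_SIZE` are assumptions, not taken from the driver source, which the log does not show.

```c
#include <stdint.h>

/* Assumed size of one event-log entry in bytes (hypothetical value). */
#define DEMO_EVENT_ENTRY_SIZE 16u

/*
 * Advance the head offset of a ring buffer that is buf_size bytes long.
 * Returns 0 and stores the wrapped offset in *next on success, or -1
 * if buf_size was never initialized, since "% 0" is a division by zero.
 */
static int demo_advance_head(uint32_t head, uint32_t buf_size, uint32_t *next)
{
	if (buf_size == 0)
		return -1;
	*next = (head + DEMO_EVENT_ENTRY_SIZE) % buf_size;
	return 0;
}
```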

02 Jul, 2009 4 commits

amd-iommu: handle alias entries correctly in init code · 7a6a3a08

Authored by Joerg Roedel on Jul 02, 2009

An alias entry in the ACPI table means that the device can send requests to the
IOMMU with both device ids, its own and the alias. This is not handled properly
in the ACPI init code. This patch fixes the issue.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

7a6a3a08

x86: Fix printk call in print_local_apic() · 251e1e44

Authored by Ingo Molnar on Jul 02, 2009

Instead of this:

[   75.690022] <7>printing local APIC contents on CPU#0/0:
[   75.704406] ... APIC ID:      00000000 (0)
[   75.707905] ... APIC VERSION: 00060015
[   75.722551] ... APIC TASKPRI: 00000000 (00)
[   75.725473] ... APIC PROCPRI: 00000000
[   75.728592] ... APIC LDR: 00000001
[   75.742137] ... APIC SPIV: 000001ff
[   75.744101] ... APIC ISR field:
[   75.746648] 0123456789abcdef0123456789abcdef
[   75.746649] <7>00000000000000000000000000000000

Improve the code to be saner and simpler and just print out
the bitfield in a single line using hex values - not as a
(rather pointless) binary bitfield.

Partially reused Linus's initial fix for this.
Reported-and-Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A4C43BC.90506@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

251e1e44

perf_counter: Ignore the nmi call frames in the x86-64 backtraces · 0406ca6d

Authored by Frederic Weisbecker on Jul 01, 2009

Almost all callchains recorded with perf record are filled up
with the internal perfcounter nmi frames:

 perf_callchain
 perf_counter_overflow
 intel_pmu_handle_irq
 perf_counter_nmi_handler
 notifier_call_chain
 atomic_notifier_call_chain
 notify_die
 do_nmi
 nmi

We want to ignore these frames as they are not interesting for
instrumentation. To solve this, we simply ignore every frame
from nmi context.

New example of "perf report -s sym -c" after this patch:

9.59%  [k] search_by_key
             4.88%
                search_by_key
                reiserfs_read_locked_inode
                reiserfs_iget
                reiserfs_lookup
                do_lookup
                __link_path_walk
                path_walk
                do_path_lookup
                user_path_at
                vfs_fstatat
                vfs_lstat
                sys_newlstat
                system_call_fastpath
                __lxstat
                0x406fb1

             3.19%
                search_by_key
                search_by_entry_key
                reiserfs_find_entry
                reiserfs_lookup
                do_lookup
                __link_path_walk
                path_walk
                do_path_lookup
                user_path_at
                vfs_fstatat
                vfs_lstat
                sys_newlstat
                system_call_fastpath
                __lxstat
                0x406fb1
[...]

For now this patch only solves the problem in x86-64.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0406ca6d

intel-iommu: Make iommu=pt work on i386 too · 3238c0c4

Authored by David Woodhouse on Jul 01, 2009

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

3238c0c4

01 Jul, 2009 1 commit

x86: Mark device_nb as static and fix NULL noise · b25ae679

Authored by Jaswinder Singh Rajput on Jul 01, 2009

This sparse warning:

  arch/x86/kernel/amd_iommu.c:1195:23: warning: symbol 'device_nb' was not declared. Should it be static?

triggers because device_nb is global but is only used in a
single .c file. Change device_nb to static to fix that - this
also addresses the sparse warning.

This sparse warning:

  arch/x86/kernel/amd_iommu.c:1766:10: warning: Using plain integer as NULL pointer

triggers because plain integer 0 is used in place of a NULL
pointer. Change 0 to NULL to fix that - this also addresses the
sparse warning.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <1246458194.6940.20.camel@hpdv5.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b25ae679

29 Jun, 2009 1 commit

perf_counter, x86: Update x86_pmu after WARN() · 4078c444

Authored by Yinghai Lu on Jun 29, 2009

The print out should read the value before changing the value.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4A487017.4090007@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

4078c444

28 Jun, 2009 1 commit

Revert "x86: cap iomem_resource to addressable physical memory" · ff8a4bae

Authored by H. Peter Anvin on Jun 27, 2009

This reverts commit 95ee14e4.

Mikael Pettersson <mikpe@it.uu.se> reported that at least one of his
systems will not boot as a result.  We have ruled out the detection
algorithm malfunctioning, so it is not a matter of producing the
incorrect bitmasks; rather, something in the application of them
fails.

Revert the commit until we can root cause and correct this problem.

-stable team: this means the underlying commit should be rejected.
Reported-and-isolated-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <200906261559.n5QFxJH8027336@pilspetsen.it.uu.se>
Cc: stable@kernel.org
Cc: Grant Grundler <grundler@parisc-linux.org>

ff8a4bae

26 Jun, 2009 3 commits

x86, mce: percpu mcheck_timer should be pinned · 5be6066a

Authored by Hidetoshi Seto on Jun 24, 2009

With CONFIG_NO_HZ + CONFIG_SMP, a timer added via add_timer() might
be migrated to another CPU.  Use add_timer_on() instead.

Avoids the following failure:

Maciej Rutecki wrote:
> > After normal boot I try:
> >
> > echo 1 > /sys/devices/system/machinecheck/machinecheck0/check_interval
> >
> > I found this in dmesg:
> >
> > [  141.704025] ------------[ cut here ]------------
> > [  141.704039] WARNING: at arch/x86/kernel/cpu/mcheck/mce.c:1102
> > mcheck_timer+0xf5/0x100()
Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

5be6066a

x86: Add sysctl to allow panic on IOCK NMI error · 5211a242

Authored by Kurt Garloff on Jun 24, 2009

This patch introduces a new sysctl:

    /proc/sys/kernel/panic_on_io_nmi

which defaults to 0 (off).

When enabled, the kernel panics when the kernel receives an NMI
caused by an IO error.

The IO-error-triggered NMI indicates a serious system
condition, which could result in IO data corruption. Rather
than continuing, panicking and dumping might be a better choice,
so one can figure out what's causing the IO error.

This could be especially important to companies running IO
intensive applications where corruption must be avoided, e.g. a
bank's databases.

[ SuSE has been shipping it for a while, it was done at the
  request of a large database vendor, for their users. ]
Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: Roberto Angelino <robertangelino@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <20090624213211.GA11291@kroah.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5211a242

perf_counter, x86: Add mmap counter read support · 194002b2

Authored by Peter Zijlstra on Jun 22, 2009

Update the mmap control page with the needed information to
use the userspace RDPMC instruction for self monitoring.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

194002b2

24 Jun, 2009 3 commits

x86: Fix uv bau sending buffer initialization · 9c26f52b

Authored by Cliff Wickman on Jun 24, 2009

The initialization of the UV Broadcast Assist Unit's sending
buffers was making an invalid assumption about the
initialization of an MMR that defines its address.

The BIOS will not be providing that MMR.  So
uv_activation_descriptor_init() should unconditionally set it.

Tested on UV simulator.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org> # for v2.6.30.x
LKML-Reference: <E1MJTfj-0005i1-W8@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9c26f52b

perf_counter, x86: Set global control MSR correctly · c14dab5c

Authored by Yong Wang on Jun 24, 2009

Previous code made an assumption that the power on value of global
control MSR has enabled all fixed and general purpose counters properly.

However, this is not the case for certain Intel processors, such as
Atom - and it might also be firmware dependent.

Each enable bit in IA32_PERF_GLOBAL_CTRL is AND'ed with the
enable bits for all privilege levels in the respective IA32_PERFEVTSELx
or IA32_PERF_FIXED_CTR_CTRL MSRs to start/stop the counting of
respective counters. Counting is enabled if the AND'ed result is true;
counting is disabled when the result is false.

The end result is that all fixed counters are always disabled on Atom
processors because the assumption is just invalid.

Fix this by not initializing the ctrl-mask out of the global MSR,
but setting it to perf_counter_mask.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090624021324.GA2788@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c14dab5c
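
The AND relationship described in c14dab5c, and the idea of building the control mask from the known counter counts rather than reading it back from the MSR, can be sketched as follows. This is a userspace illustration: `demo_counter_mask()`, `demo_counting_enabled()` and the placement of fixed-counter bits at bit 32 (`DEMO_FIXED_SHIFT`) are assumptions made for the example, and both counts are assumed to be below 32.

```c
#include <stdint.h>

/* Assumption for this sketch: fixed-counter enable bits start at bit 32. */
#define DEMO_FIXED_SHIFT 32

/* One enable bit per general-purpose counter plus one per fixed counter. */
static uint64_t demo_counter_mask(unsigned int num_gp, unsigned int num_fixed)
{
	uint64_t mask = (1ULL << num_gp) - 1;

	mask |= ((1ULL << num_fixed) - 1) << DEMO_FIXED_SHIFT;
	return mask;
}

/*
 * A counter counts only if its bit in the global control value is set
 * AND its own per-counter enable is set.
 */
static int demo_counting_enabled(uint64_t global_ctrl, unsigned int bit,
				 int local_enable)
{
	return ((global_ctrl >> bit) & 1) && local_enable;
}
```

If the power-on value of the global control register only has the general-purpose bits set, a fixed counter stays disabled no matter how its own enable is programmed, which matches the behavior reported for Atom.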

Intel-IOMMU, intr-remap: source-id checking · f007e99c

Authored by Weidong Han on May 23, 2009

To support domain-isolation usages, the platform hardware must be
capable of uniquely identifying the requestor (source-id) for each
interrupt message. Without source-id checking for interrupt remapping,
a rogue guest/VM with assigned devices can launch interrupt attacks
to bring down another guest/VM or the VMM itself.

This patch adds source-id checking for interrupt remapping, and then
really isolates interrupts for guests/VMs with assigned devices.

Because the PCI subsystem is not yet initialized when the IOAPIC
entries are set up, use read_pci_config_byte to access PCI config space directly.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

f007e99c

23 Jun, 2009 1 commit

x86: Move init_gbpages() to setup_arch() · 854c879f

Authored by Pekka J Enberg on Jun 22, 2009

The init_gbpages() function is conditionally called from
init_memory_mapping() function. There are two call-sites where
this 'after_bootmem' condition can be true: setup_arch() and
mem_init() via pci_iommu_alloc().

Therefore, it's safe to move the call to init_gbpages() to
setup_arch() as it's always called before mem_init().

This removes an after_bootmem use - paving the way to remove
all uses of that state variable.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <Pine.LNX.4.64.0906221731210.19474@melkki.cs.Helsinki.FI>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

854c879f

22 Jun, 2009 6 commits

x86: ensure percpu lpage doesn't consume too much vmalloc space · 0017c869

Authored by Tejun Heo on Jun 22, 2009

On extreme configurations (e.g. a 32bit 32-way NUMA machine), the lpage
percpu first chunk allocator can consume too much vmalloc space.
Make it fall back to the 4k allocator if the consumption goes over 20%.

[ Impact: add sanity check for lpage percpu first chunk allocator ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jan Beulich <JBeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>

0017c869
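
The 20% fallback rule from 0017c869 can be expressed as a small decision function. This is a sketch under assumptions: `demo_choose_allocator()` and the enum names are hypothetical, and the real check may measure consumption differently than a simple per-CPU size times CPU count.

```c
#include <stddef.h>

enum demo_alloc { DEMO_ALLOC_LPAGE, DEMO_ALLOC_4K };

/*
 * Choose the first-chunk allocator: prefer lpage, but fall back to 4k
 * when the large-page mappings for all CPUs would take more than 20%
 * (one fifth) of the available vmalloc space.
 */
static enum demo_alloc demo_choose_allocator(size_t lpage_bytes_per_cpu,
					     unsigned int nr_cpus,
					     size_t vmalloc_bytes)
{
	size_t total = lpage_bytes_per_cpu * nr_cpus;

	if (total > vmalloc_bytes / 5)
		return DEMO_ALLOC_4K;
	return DEMO_ALLOC_LPAGE;
}
```

For example, 32 CPUs at 2 MiB each need 64 MiB, which exceeds a fifth of a 128 MiB vmalloc area, so the sketch selects the 4k allocator in that case.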

x86: implement percpu_alloc kernel parameter · fa8a7094

Authored by Tejun Heo on Jun 22, 2009

According to Andi, it isn't clear whether lpage allocator is worth the
trouble as there are many processors where PMD TLB is far scarcer than
PTE TLB.  The advantage or disadvantage probably depends on the actual
size of percpu area and specific processor.  As performance
degradation due to TLB pressure tends to be highly workload specific
and subtle, it is difficult to decide which way to go without more
data.

This patch implements percpu_alloc kernel parameter to allow selecting
which first chunk allocator to use to ease debugging and testing.

While at it, make sure all the failure paths report why something
failed to help determining why certain allocator isn't working.  Also,
kill the "Great future plan" comment which had already been realized
quite some time ago.

[ Impact: allow explicit percpu first chunk allocator selection ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jan Beulich <JBeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>

fa8a7094

x86: fix pageattr handling for lpage percpu allocator and re-enable it · e59a1bb2

Authored by Tejun Heo on Jun 22, 2009

lpage allocator aliases a PMD page for each cpu and returns whatever
is unused to the page allocator.  When the pageattr of the recycled
pages is changed, the two aliases point to overlapping regions
with different attributes, which isn't allowed and is known to
cause subtle data corruption in certain cases.

This can be handled in a similar manner to the x86_64 highmap alias.
pageattr code should detect if the target pages have PMD alias and
split the PMD alias and synchronize the attributes.

pcpur allocator is updated to keep the allocated PMD pages map sorted
in ascending address order and provide pcpu_lpage_remapped() function
which binary searches the array to determine whether the given address
is aliased and if so to which address.  pageattr is updated to use
pcpu_lpage_remapped() to detect the PMD alias and split it up as
necessary from cpa_process_alias().

Jan Beulich spotted the original problem and incorrect usage of vaddr
instead of laddr for lookup.

With this, lpage percpu allocator should work correctly.  Re-enable
it.

[ Impact: fix subtle lpage pageattr bug and re-enable lpage ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jan Beulich <JBeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>

e59a1bb2
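
The lookup described in e59a1bb2 (a map kept sorted in ascending address order, binary-searched to decide whether an address is aliased and to which address) can be sketched like this. It is a userspace illustration, not the kernel's pcpu_lpage_remapped(): `struct demo_ent`, `demo_lpage_remapped()` and the 2 MiB `DEMO_LPAGE_SIZE` are assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed large-page size for this sketch: 2 MiB. */
#define DEMO_LPAGE_SIZE 0x200000u

/* One remapped large page; the array is sorted by `orig`, ascending. */
struct demo_ent {
	uintptr_t orig;		/* start of the original large page */
	uintptr_t alias;	/* start of its remapped alias */
};

/*
 * Binary-search the sorted map. Returns the alias address that
 * corresponds to addr, or 0 if addr lies in no remapped large page.
 */
static uintptr_t demo_lpage_remapped(const struct demo_ent *map, size_t n,
				     uintptr_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < map[mid].orig)
			hi = mid;
		else if (addr >= map[mid].orig + DEMO_LPAGE_SIZE)
			lo = mid + 1;
		else
			return map[mid].alias + (addr - map[mid].orig);
	}
	return 0;
}
```

Keeping the array sorted at setup time is what makes the logarithmic lookup possible on every pageattr change.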

x86: prepare setup_pcpu_lpage() for pageattr fix · 0ff2587f

Authored by Tejun Heo on Jun 22, 2009

Make the following changes in preparation of coming pageattr updates.

* Define and use array of struct pcpul_ent instead of array of
  pointers.  The only difference is ->cpu field which is set but
  unused yet.

* Rename variables according to the above change.

* Rename local variable vm to pcpul_vm and move it out of the
  function.

[ Impact: no functional difference ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>

0ff2587f

x86: rename remap percpu first chunk allocator to lpage · 97c9bf06

Authored by Tejun Heo on Jun 22, 2009

The "remap" allocator remaps large pages to build the first chunk;
however, the name isn't very good because 4k allocator remaps too and
the whole point of the remap allocator is using large page mapping.
The allocator will be generalized and exported outside of x86, rename
it to lpage before that happens.

percpu_alloc kernel parameter is updated to accept both "remap" and
"lpage" for lpage allocator.

[ Impact: code cleanup, kernel parameter argument updated ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>

97c9bf06

x86: fix duplicate free in setup_pcpu_remap() failure path · c5806df9

Authored by Tejun Heo on Jun 22, 2009

In the failure path, setup_pcpu_remap() tries to free the area which
has already been freed to make holes in the large page.  Fix it.

[ Impact: fix duplicate free in failure path ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>

c5806df9

21 Jun, 2009 2 commits

perf_counter, x8: Fix L1-data-Cache-Store-Referencees for AMD · d9f2a5ec

Authored by Jaswinder Singh Rajput on Jun 20, 2009

Fix AMD's Data Cache Refills from System event.

After this patch :

./tools/perf/perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses ls /dev/ > /dev/null

Performance counter stats for 'ls /dev/':

2499484 L1-data-Cache-Load-Referencees (scaled from 3.97%)
70347 L1-data-Cache-Load-Misses (scaled from 7.30%)
9360 L1-data-Cache-Store-Referencees (scaled from 8.64%)
32804 L1-data-Cache-Prefetch-Referencees (scaled from 17.72%)
7693 L1-data-Cache-Prefetch-Misses (scaled from 22.97%)
2180945 L1-instruction-Cache-Load-Referencees (scaled from 28.48%)
14518 L1-instruction-Cache-Load-Misses (scaled from 35.00%)
2405 L1-instruction-Cache-Prefetch-Referencees (scaled from 34.89%)
71387 L2-Cache-Load-Referencees (scaled from 34.94%)
18732 L2-Cache-Load-Misses (scaled from 34.92%)
79918 L2-Cache-Store-Referencees (scaled from 36.02%)
1295294 Data-TLB-Cache-Load-Referencees (scaled from 35.99%)
30896 Data-TLB-Cache-Load-Misses (scaled from 33.36%)
1222030 Instruction-TLB-Cache-Load-Referencees (scaled from 29.46%)
357 Instruction-TLB-Cache-Load-Misses (scaled from 20.46%)
530888 Branch-Cache-Load-Referencees (scaled from 11.48%)
8638 Branch-Cache-Load-Misses (scaled from 5.09%)

0.011295149 seconds time elapsed.

Earlier it always showed the value 0.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
LKML-Reference: <1245484165.3102.6.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d9f2a5ec

x86: Set cpu_llc_id on AMD CPUs · 99bd0c0f

Authored by Andreas Herrmann on Jun 19, 2009

This counts when building sched domains in case NUMA information
is not available.

( See cpu_coregroup_mask() which uses llc_shared_map which in turn is
  created based on cpu_llc_id. )

Currently Linux builds domains as follows:
(example from a dual socket quad-core system)

 CPU0 attaching sched-domain:
  domain 0: span 0-7 level CPU
   groups: 0 1 2 3 4 5 6 7

  ...

 CPU7 attaching sched-domain:
  domain 0: span 0-7 level CPU
   groups: 7 0 1 2 3 4 5 6

Ever since that is borked for multi-core AMD CPU systems.
This patch fixes that and now we get a proper:

 CPU0 attaching sched-domain:
  domain 0: span 0-3 level MC
   groups: 0 1 2 3
   domain 1: span 0-7 level CPU
    groups: 0-3 4-7

  ...

 CPU7 attaching sched-domain:
  domain 0: span 4-7 level MC
   groups: 7 4 5 6
   domain 1: span 0-7 level CPU
    groups: 4-7 0-3

This allows scheduler to assign tasks to cores on different sockets
(i.e. that don't share last level cache) for performance reasons.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20090619085909.GJ5218@alberich.amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

99bd0c0f
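
How a per-CPU last-level-cache id turns into the "span 0-3" and "span 4-7" groups shown in 99bd0c0f can be sketched with a simple bitmask builder. This is a userspace illustration: `demo_llc_shared_mask()` is a hypothetical helper, not the kernel's llc_shared_map code, and it handles at most 32 CPUs.

```c
#include <stdint.h>

/*
 * Build a bitmask of the CPUs that share a last-level cache with `cpu`,
 * given one LLC id per CPU. CPUs with equal ids end up in one group.
 */
static uint32_t demo_llc_shared_mask(const int *llc_id, unsigned int nr_cpus,
				     unsigned int cpu)
{
	uint32_t mask = 0;

	for (unsigned int i = 0; i < nr_cpus && i < 32; i++)
		if (llc_id[i] == llc_id[cpu])
			mask |= 1u << i;
	return mask;
}
```

If every CPU carries the same (unset) id, the mask covers all eight CPUs, which corresponds to the single "span 0-7" domain in the "before" output; with ids assigned per socket, each CPU sees only its own four-core group.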