提交 · f4bcd8ccddb02833340652e9f46f5127828eb79d · openeuler / Kernel

16 1月, 2014 4 次提交

x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs · 7da7c156

由 Bin Gao 提交于 10月 21, 2013

On SoCs that have the calibration MSRs available, either there is no
PIT, HPET or PMTIMER to calibrate against, or the PIT/HPET/PMTIMER is
driven from the same clock as the TSC, so calibration is redundant and
just slows down the boot.

TSC rate is caculated by this formula:
<maximum core-clock to bus-clock ratio> * <maximum resolved frequency>
The ratio and the resolved frequency ID can be obtained from MSR.
See Intel 64 and IA-32 System Programming Guid section 16.12 and 30.11.5
for details.
Signed-off-by: NBin Gao <bin.gao@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org

7da7c156

x86: Add check for number of available vectors before CPU down · da6139e4

由 Prarit Bhargava 提交于 1月 13, 2014

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791

When a cpu is downed on a system, the irqs on the cpu are assigned to
other cpus.  It is possible, however, that when a cpu is downed there
aren't enough free vectors on the remaining cpus to account for the
vectors from the cpu that is being downed.

This results in an interesting "overflow" condition where irqs are
"assigned" to a CPU but are not handled.

For example, when downing cpus on a 1-64 logical processor system:

<snip>
[  232.021745] smpboot: CPU 61 is now offline
[  238.480275] smpboot: CPU 62 is now offline
[  245.991080] ------------[ cut here ]------------
[  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
[  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
[  246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
[  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
[  246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[  246.057371]  0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
[  246.065728]  ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
[  246.074073]  0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
[  246.082430] Call Trace:
[  246.085174]  <IRQ>  [<ffffffff8164fbf6>] dump_stack+0x46/0x58
[  246.091633]  [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
[  246.098352]  [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
[  246.104786]  [<ffffffff815710d6>] dev_watchdog+0x246/0x250
[  246.110923]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[  246.119097]  [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
[  246.125224]  [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
[  246.132137]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[  246.140308]  [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
[  246.146933]  [<ffffffff81059a80>] __do_softirq+0xe0/0x220
[  246.152976]  [<ffffffff8165fedc>] call_softirq+0x1c/0x30
[  246.158920]  [<ffffffff810045f5>] do_softirq+0x55/0x90
[  246.164670]  [<ffffffff81059d35>] irq_exit+0xa5/0xb0
[  246.170227]  [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
[  246.177324]  [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
[  246.184041]  <EOI>  [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
[  246.191559]  [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
[  246.198374]  [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
[  246.204900]  [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
[  246.210846]  [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
[  246.217371]  [<ffffffff81646b47>] rest_init+0x77/0x80
[  246.223028]  [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
[  246.229165]  [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
[  246.235787]  [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
[  246.242990]  [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
[  246.249610] ---[ end trace fb74fdef54d79039 ]---
[  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
[  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
Last login: Mon Nov 11 08:35:14 from 10.18.17.119
[root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX

(last lines keep repeating.  ixgbe driver is dead until module reload.)

If the downed cpu has more vectors than are free on the remaining cpus on the
system, it is possible that some vectors are "orphaned" even though they are
assigned to a cpu.  In this case, since the ixgbe driver had a watchdog, the
watchdog fired and notified that something was wrong.

This patch adds a function, check_vectors(), to compare the number of vectors
on the CPU going down and compares it to the number of vectors available on
the system.  If there aren't enough vectors for the CPU to go down, an
error is returned and propogated back to userspace.

v2: Do not need to look at percpu irqs
v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
    Priority Mode
v4: Additional changes suggested by Gong Chen.
v5/v6/v7/v8: Updated comment text
Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.comReviewed-by: NGong Chen <gong.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Janet Morgan <janet.morgan@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ruiv Wang <ruiv.wang@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>

da6139e4

x86, intel-mid: Add Merrifield platform support · bc20aa48

由 David Cohen 提交于 12月 16, 2013

This code was partially based on Mark Brown's previous work.
Signed-off-by: NDavid Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-4-git-send-email-david.a.cohen@linux.intel.comSigned-off-by: NFei Yang <fei.yang@intel.com>
Cc: Mark F. Brown <mark.f.brown@intel.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

bc20aa48

x86, intel-mid: Add Clovertrail platform support · 85611e3f

由 Kuppuswamy Sathyanarayanan 提交于 12月 16, 2013

This patch adds Clovertrail support on intel-mid and makes it more
flexible to support other SoCs.
Signed-off-by: NDavid Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-3-git-send-email-david.a.cohen@linux.intel.comSigned-off-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: NFei Yang <fei.yang@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

85611e3f

14 1月, 2014 4 次提交

x86, microcode, AMD: Fix early ucode loading · 5335ba5c

由 Borislav Petkov 提交于 11月 29, 2013

The original idea to use the microcode cache for the APs doesn't pan out
because we do memory allocation there very early and with IRQs disabled
and we don't want to involve GFP_ATOMIC allocations. Not if it can be
helped.

Thus, extend the caching of the BSP patch approach to the APs and
iterate over the ucode in the initrd instead of using the cache. We
still save the relevant patches to it but later, right before we
jettison the initrd.

While at it, fix early ucode loading on 32-bit too.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>

5335ba5c

x86, microcode: Share native MSR accessing variants · e1b43e3f

由 Borislav Petkov 提交于 12月 04, 2013

We want to use those in AMD's early loading path too. Also, add a
native_wrmsrl variant.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>

e1b43e3f

x86, ramdisk: Export relocated ramdisk VA · 5aa3d718

由 Borislav Petkov 提交于 12月 04, 2013

The ramdisk can possibly get relocated if the whole image is not mapped.
And since we're going over it in the microcode loader and fishing out
the relevant microcode patches, we want access it at its new location.
Thus, export it.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>

5aa3d718

sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding · 8cb75e0c

由 Peter Zijlstra 提交于 11月 20, 2013

With various drivers wanting to inject idle time; we get people
calling idle routines outside of the idle loop proper.

Therefore we need to be extra careful about not missing
TIF_NEED_RESCHED -> PREEMPT_NEED_RESCHED propagations.

While looking at this, I also realized there's a small window in the
existing idle loop where we can miss TIF_NEED_RESCHED; when it hits
right after the tif_need_resched() test at the end of the loop but
right before the need_resched() test at the start of the loop.

So move preempt_fold_need_resched() out of the loop where we're
guaranteed to have TIF_NEED_RESCHED set.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-x9jgh45oeayzajz2mjt0y7d6@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

8cb75e0c

13 1月, 2014 3 次提交

sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs · 20d1c86a

由 Peter Zijlstra 提交于 11月 29, 2013

Use a ring-buffer like multi-version object structure which allows
always having a coherent object; we use this to avoid having to
disable IRQs while reading sched_clock() and avoids a problem when
getting an NMI while changing the cyc2ns data.

                        MAINLINE   PRE        POST

    sched_clock_stable: 1          1          1
    (cold) sched_clock: 329841     331312     257223
    (cold) local_clock: 301773     310296     309889
    (warm) sched_clock: 38375      38247      25280
    (warm) local_clock: 100371     102713     85268
    (warm) rdtsc:       27340      27289      24247
    sched_clock_stable: 0          0          0
    (cold) sched_clock: 382634     372706     301224
    (cold) local_clock: 396890     399275     399870
    (warm) sched_clock: 38194      38124      25630
    (warm) local_clock: 143452     148698     129629
    (warm) rdtsc:       27345      27365      24307
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-s567in1e5ekq2nlyhn8f987r@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

20d1c86a

sched/clock, x86: Move some cyc2ns() code around · 57c67da2

由 Peter Zijlstra 提交于 11月 29, 2013

There are no __cycles_2_ns() users outside of arch/x86/kernel/tsc.c,
so move it there.

There are no cycles_2_ns() users.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-01lslnavfgo3kmbo4532zlcj@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

57c67da2

sched/clock, x86: Use mul_u64_u32_shr() for native_sched_clock() · 5dd12c21

由 Peter Zijlstra 提交于 11月 29, 2013

Use mul_u64_u32_shr() so that x86_64 can use a single 64x64->128 mul.

Before:

0000000000000560 <native_sched_clock>:
 560:   44 8b 1d 00 00 00 00    mov    0x0(%rip),%r11d        # 567 <native_sched_clock+0x7>
 567:   55                      push   %rbp
 568:   48 89 e5                mov    %rsp,%rbp
 56b:   45 85 db                test   %r11d,%r11d
 56e:   75 4f                   jne    5bf <native_sched_clock+0x5f>
 570:   0f 31                   rdtsc
 572:   89 c0                   mov    %eax,%eax
 574:   48 c1 e2 20             shl    $0x20,%rdx
 578:   48 c7 c1 00 00 00 00    mov    $0x0,%rcx
 57f:   48 09 c2                or     %rax,%rdx
 582:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
 589:   65 8b 04 25 00 00 00    mov    %gs:0x0,%eax
 590:   00
 591:   48 98                   cltq
 593:   48 8b 34 c5 00 00 00    mov    0x0(,%rax,8),%rsi
 59a:   00
 59b:   48 89 d0                mov    %rdx,%rax
 59e:   81 e2 ff 03 00 00       and    $0x3ff,%edx
 5a4:   48 c1 e8 0a             shr    $0xa,%rax
 5a8:   48 0f af 14 0e          imul   (%rsi,%rcx,1),%rdx
 5ad:   48 0f af 04 0e          imul   (%rsi,%rcx,1),%rax
 5b2:   5d                      pop    %rbp
 5b3:   48 03 04 3e             add    (%rsi,%rdi,1),%rax
 5b7:   48 c1 ea 0a             shr    $0xa,%rdx
 5bb:   48 01 d0                add    %rdx,%rax
 5be:   c3                      retq

After:

0000000000000550 <native_sched_clock>:
 550:   8b 3d 00 00 00 00       mov    0x0(%rip),%edi        # 556 <native_sched_clock+0x6>
 556:   55                      push   %rbp
 557:   48 89 e5                mov    %rsp,%rbp
 55a:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
 55e:   85 ff                   test   %edi,%edi
 560:   75 2c                   jne    58e <native_sched_clock+0x3e>
 562:   0f 31                   rdtsc
 564:   89 c0                   mov    %eax,%eax
 566:   48 c1 e2 20             shl    $0x20,%rdx
 56a:   48 09 c2                or     %rax,%rdx
 56d:   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
 574:   00 00
 576:   89 c0                   mov    %eax,%eax
 578:   48 f7 e2                mul    %rdx
 57b:   65 48 8b 0c 25 00 00    mov    %gs:0x0,%rcx
 582:   00 00
 584:   c9                      leaveq
 585:   48 0f ac d0 0a          shrd   $0xa,%rdx,%rax
 58a:   48 01 c8                add    %rcx,%rax
 58d:   c3                      retq

                        MAINLINE   POST

    sched_clock_stable: 1	   1
    (cold) sched_clock: 329841     331312
    (cold) local_clock: 301773     310296
    (warm) sched_clock: 38375      38247
    (warm) local_clock: 100371     102713
    (warm) rdtsc:       27340      27289
    sched_clock_stable: 0          0
    (cold) sched_clock: 382634     372706
    (cold) local_clock: 396890     399275
    (warm) sched_clock: 38194      38124
    (warm) local_clock: 143452     148698
    (warm) rdtsc:       27345      27365
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-piu203ses5y1g36bnyw2n16x@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

5dd12c21

12 1月, 2014 3 次提交

x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs · 9345005f

由 Prarit Bhargava 提交于 1月 05, 2014

During heavy CPU-hotplug operations the following spurious kernel warnings
can trigger:

  do_IRQ: No ... irq handler for vector (irq -1)

  [ See: https://bugzilla.kernel.org/show_bug.cgi?id=64831 ]

When downing a cpu it is possible that there are unhandled irqs
left in the APIC IRR register.  The following code path shows
how the problem can occur:

 1. CPU 5 is to go down.

 2. cpu_disable() on CPU 5 executes with interrupt flag cleared
    by local_irq_save() via stop_machine().

 3. IRQ 12 asserts on CPU 5, setting IRR but not ISR because
    interrupt flag is cleared (CPU unabled to handle the irq)

 4. IRQs are migrated off of CPU 5, and the vectors' irqs are set
    to -1. 5. stop_machine() finishes cpu_disable()

 6. cpu_die() for CPU 5 executes in normal context.

 7. CPU 5 attempts to handle IRQ 12 because the IRR is set for
    IRQ 12.  The code attempts to find the vector's IRQ and cannot
    because it has been set to -1. 8. do_IRQ() warning displays
    warning about CPU 5 IRQ 12.

I added a debug printk to output which CPU & vector was
retriggered and discovered that that we are getting bogus
events.  I see a 100% correlation between this debug printk in
fixup_irqs() and the do_IRQ() warning.

This patchset resolves this by adding definitions for
VECTOR_UNDEFINED(-1) and VECTOR_RETRIGGERED(-2) and modifying
the code to use them.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=64831Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
Reviewed-by: NRui Wang <rui.y.wang@intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: janet.morgan@Intel.com
Cc: tony.luck@Intel.com
Cc: ruiv.wang@gmail.com
Link: http://lkml.kernel.org/r/1388938252-16627-1-git-send-email-prarit@redhat.com
[ Cleaned up the code a bit. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

9345005f

arch: Introduce smp_load_acquire(), smp_store_release() · 47933ad4

由 Peter Zijlstra 提交于 11月 06, 2013

A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads.  Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.

This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation.  The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specifed store against any prior reads or
writes.  These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.

In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability.  It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits.  It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.

[Changelog by PaulMck]
Reviewed-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Acked-by: NWill Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

47933ad4

x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround · 26bef131

由 Linus Torvalds 提交于 1月 11, 2014

Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.
Reported-by: Nhalfdog <me@halfdog.net>
Tested-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

26bef131

09 1月, 2014 1 次提交

arch: x86: New MailBox support driver for Intel SOC's · 46184415

由 David E. Box 提交于 1月 08, 2014

Current Intel SOC cores use a MailBox Interface (MBI) to provide access to
configuration registers on devices (called units) connected to the system
fabric. This is a support driver that implements access to this interface on
those platforms that can enumerate the device using PCI. Initial support is for
BayTrail, for which port definitons are provided. This is a requirement for
implementing platform specific features (e.g. RAPL driver requires this to
perform platform specific power management using the registers in PUNIT).
Dependant modules should select IOSF_MBI in their respective Kconfig
configuraiton. Serialized access is handled by all exported routines with
spinlocks.

The API includes 3 functions for access to unit registers:

int iosf_mbi_read(u8 port, u8 opcode, u32 offset, u32 *mdr)
int iosf_mbi_write(u8 port, u8 opcode, u32 offset, u32 mdr)
int iosf_mbi_modify(u8 port, u8 opcode, u32 offset, u32 mdr, u32 mask)

port:	indicating the unit being accessed
opcode:	the read or write port specific opcode
offset:	the register offset within the port
mdr:	the register data to be read, written, or modified
mask:	bit locations in mdr to change

Returns nonzero on error

Note: GPU code handles access to the GFX unit. Therefore access to that unit
with this driver is disallowed to avoid conflicts.
Signed-off-by: NDavid E. Box <david.e.box@linux.intel.com>
Link: http://lkml.kernel.org/r/1389216471-734-1-git-send-email-david.e.box@linux.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>

46184415

07 1月, 2014 1 次提交

x86: Delete non-required instances of include <linux/init.h> · 663b55b9

由 Paul Gortmaker 提交于 1月 06, 2014

None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

[ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ]
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

663b55b9

05 1月, 2014 1 次提交

x86, sparse: Do not force removal of __user when calling copy_to/from_user_nocheck() · df90ca96

由 Steven Rostedt 提交于 1月 03, 2014

Commit ff47ab4f "x86: Add 1/2/4/8 byte optimization to 64bit
__copy_{from,to}_user_inatomic" added a "_nocheck" call in between
the copy_to/from_user() and copy_user_generic(). As both the
normal and nocheck versions of theses calls use the proper __user
annotation, a typecast to remove it should not be added.
This causes sparse to spin out the following warnings:

arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47: expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47: got void const *<noident>
arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47: expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47: got void const *<noident>
arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47: expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47: got void const *<noident>
arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47: expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47: got void const *<noident>

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20140103164500.5f6478f5@gandalf.local.homeSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

df90ca96

04 1月, 2014 1 次提交

x86, cpu: Detect more TLB configuration · dd360393

由 Kirill A. Shutemov 提交于 12月 23, 2013

The Intel Software Developer’s Manual covers few more TLB
configurations exposed as CPUID 2 descriptors:

61H Instruction TLB: 4 KByte pages, fully associative, 48 entries
63H Data TLB: 1 GByte pages, 4-way set associative, 4 entries
76H Instruction TLB: 2M/4M pages, fully associative, 8 entries
B5H Instruction TLB: 4KByte pages, 8-way set associative, 64 entries
B6H Instruction TLB: 4KByte pages, 8-way set associative, 128 entries
C1H Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries
C2H DTLB DTLB: 2 MByte/$MByte pages, 4-way associative, 16 entries

Let's detect them as well.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/1387801018-14499-1-git-send-email-kirill.shutemov@linux.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

dd360393

03 1月, 2014 1 次提交

x86/efi: parse_efi_setup() build fix · 5c12af0c

由 Dave Young 提交于 1月 03, 2014

In case without CONFIG_EFI, there will be below build error:

   arch/x86/built-in.o: In function `setup_arch':
  (.init.text+0x9dc): undefined reference to `parse_efi_setup'

Thus fix it by adding blank inline function in asm/efi.h
Also remove an unused declaration for variable efi_data_len.
Signed-off-by: NDave Young <dyoung@redhat.com>
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>

5c12af0c

29 12月, 2013 1 次提交

x86/efi: Pass necessary EFI data for kexec via setup_data · 1fec0533

由 Dave Young 提交于 12月 20, 2013

Add a new setup_data type SETUP_EFI for kexec use.  Passing the saved
fw_vendor, runtime, config tables and EFI runtime mappings.

When entering virtual mode, directly mapping the EFI runtime regions
which we passed in previously. And skip the step to call
SetVirtualAddressMap().

Specially for HP z420 workstation we need save the smbios physical
address.  The kernel boot sequence proceeds in the following order.
Step 2 requires efi.smbios to be the physical address.  However, I found
that on HP z420 EFI system table has a virtual address of SMBIOS in step
1.  Hence, we need set it back to the physical address with the smbios
in efi_setup_data.  (When it is still the physical address, it simply
sets the same value.)

1. efi_init() - Set efi.smbios from EFI system table
2. dmi_scan_machine() - Temporary map efi.smbios to access SMBIOS table
3. efi_enter_virtual_mode() - Map EFI ranges

Tested on ovmf+qemu, lenovo thinkpad, a dell laptop and an
HP z420 workstation.
Signed-off-by: NDave Young <dyoung@redhat.com>
Tested-by: NToshi Kani <toshi.kani@hp.com>
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>

1fec0533

28 12月, 2013 2 次提交

x86: Slightly tweak the access_ok() C variant for better code · a740576a

由 H. Peter Anvin 提交于 12月 27, 2013

gcc can under very specific circumstances realize that the code
sequence:

	foo += bar;
	if (foo < bar) ...

... is equivalent to a carry out from the addition.  Tweak the
implementation of access_ok() (specifically __chk_range_not_ok()) to
make it more likely that gcc will make that connection.  It isn't
fool-proof (sometimes gcc seems to think it can make better code with
lea, and ends up with a second comparison), still, but it seems to be
able to connect the two more frequently this way.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/CA%2B55aFzPBdbfKovMT8Edr4SmE2_=%2BOKJFac9XW2awegogTkVTA@mail.gmail.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

a740576a

x86: Replace assembly access_ok() with a C variant · c5fe5d80

由 Linus Torvalds 提交于 12月 27, 2013

It turns out that the assembly variant doesn't actually produce that
good code, presumably partly because it creates a long dependency
chain with no scheduling, and partly because we cannot get a flags
result out of gcc (which could be fixed with asm goto, but it turns
out not to be worth it.)

The C code allows gcc to schedule and generate multiple (easily
predictable) branches, and as a side benefit we can really optimize
the case where the size is constant.

Link: http://lkml.kernel.org/r/CA%2B55aFzPBdbfKovMT8Edr4SmE2_=%2BOKJFac9XW2awegogTkVTA@mail.gmail.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

c5fe5d80

21 12月, 2013 1 次提交

x86/efi: Add a wrapper function efi_map_region_fixed() · 3b266496

由 Dave Young 提交于 12月 20, 2013

Kexec kernel will use saved runtime virtual mapping, so add a new
function efi_map_region_fixed() for directly mapping a md to md->virt.

The md is passed in from 1st kernel, the virtual addr is saved in
md->virt_addr.
Signed-off-by: NDave Young <dyoung@redhat.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Tested-by: NToshi Kani <toshi.kani@hp.com>
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>

3b266496

20 12月, 2013 2 次提交

x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers · 7e98b719

由 H. Peter Anvin 提交于 12月 19, 2013

Use static_cpu_has() to conditionalize the CLFLUSH workaround, and add
memory barriers around it since the documentation is explicit that
CLFLUSH is only ordered with respect to MFENCE.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/CA%2B55aFzGxcML7j8CEvQPYzh0W81uVoAAVmGctMOUZ7CZ1yYd2A@mail.gmail.com

7e98b719

x86, acpi, idle: Restructure the mwait idle routines · 16824255

由 Peter Zijlstra 提交于 12月 12, 2013

People seem to delight in writing wrong and broken mwait idle routines;
collapse the lot.

This leaves mwait_play_dead() the sole remaining user of __mwait() and
new __mwait() users are probably doing it wrong.

Also remove __sti_mwait() as its unused.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jacob Jun Pan <jacob.jun.pan@linux.intel.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Len Brown <lenb@kernel.org>
Cc: Rui Zhang <rui.zhang@intel.com>
Acked-by: NRafael Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131212141654.616820819@infradead.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

16824255

19 12月, 2013 1 次提交

mm: fix TLB flush race between migration, and change_protection_range · 20841405

由 Rik van Riel 提交于 12月 18, 2013

There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.

The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.

During that time, a CPU may continue writing to the page.

This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.

When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.

This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).

The basic race looks like this:

CPU A			CPU B			CPU C

						load TLB entry
make entry PTE/PMD_NUMA
			fault on entry
						read/write old page
			start migrating page
			change PTE/PMD to new page
						read/write old page [*]
flush TLB
						reload TLB from new entry
						read/write new page
						lose data

[*] the old page may belong to a new user at this point!

The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.

This should fix both NUMA migration and compaction.

[mgorman@suse.de: fix build]
Signed-off-by: NRik van Riel <riel@redhat.com>
Signed-off-by: NMel Gorman <mgorman@suse.de>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

20841405

11 12月, 2013 1 次提交

sched: Remove PREEMPT_NEED_RESCHED from generic code · ba1f14fb

由 Peter Zijlstra 提交于 11月 28, 2013

While hunting a preemption issue with Alexander, Ben noticed that the
currently generic PREEMPT_NEED_RESCHED stuff is horribly broken for
load-store architectures.

We currently rely on the IPI to fold TIF_NEED_RESCHED into
PREEMPT_NEED_RESCHED, but when this IPI lands while we already have
a load for the preempt-count but before the store, the store will erase
the PREEMPT_NEED_RESCHED change.

The current preempt-count only works on load-store archs because
interrupts are assumed to be completely balanced wrt their preempt_count
fiddling; the previous preempt_count load will match the preempt_count
state after the interrupt and therefore nothing gets lost.

This patch removes the PREEMPT_NEED_RESCHED usage from generic code and
pushes it into x86 arch code; the generic code goes back to relying on
TIF_NEED_RESCHED.

Boot tested on x86_64 and compile tested on ppc64.
Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-and-Tested-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131128132641.GP10022@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

ba1f14fb

05 12月, 2013 1 次提交

x86, bitops: Correct the assembly constraints to testing bitops · e0f6dec3

由 H. Peter Anvin 提交于 12月 04, 2013

In checkin:

0c44c2d0 x86: Use asm goto to implement better modify_and_test() functions

the various functions which do modify and test were unified and
optimized using "asm goto".  However, this change missed the detail
that the bitops require an "Ir" constraint rather than an "er"
constraint ("I" = integer constant from 0-31, "e" = signed 32-bit
integer constant).  This would cause code to miscompile if these
functions were used on constant bit positions 32-255 and the build to
fail if used on constant bit positions above 255.

Add the constraints as a parameter to the GEN_BINARY_RMWcc() macro to
avoid this problem.
Reported-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/529E8719.4070202@zytor.com

e0f6dec3

19 11月, 2013 3 次提交

ftrace, perf: Avoid infinite event generation loop · d5b5f391

由 Peter Zijlstra 提交于 11月 14, 2013

Vince's perf-trinity fuzzer found yet another 'interesting' problem.

When we sample the irq_work_exit tracepoint with period==1 (or
PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an
infinite event generation loop:

  ,-> <IPI>
  |     irq_work_exit() ->
  |       trace_irq_work_exit() ->
  |         ...
  |           __perf_event_overflow() -> (due to fasync)
  |             irq_work_queue() -> (irq_work_list must be empty)
  '---------      arch_irq_work_raise()

Similar things can happen due to regular poll() wakeups if we exceed
the ring-buffer wakeup watermark, or have an event_limit.

To avoid this, dis-allow sampling this particular tracepoint.

In order to achieve this, create a special perf_perm function pointer
for each event and call this (when set) on trying to create a
tracepoint perf event.

[ roasted: use expr... to allow for ',' in your expression ]
Reported-by: NVince Weaver <vincent.weaver@maine.edu>
Tested-by: NVince Weaver <vincent.weaver@maine.edu>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

d5b5f391

x86/mm: Implement ASLR for hugetlb mappings · fd8526ad

由 Kirill A. Shutemov 提交于 11月 19, 2013

Matthew noticed that hugetlb mappings don't participate in ASLR on x86-64:

  %  for i in `seq 3`; do
  > tools/testing/selftests/vm/map_hugetlb | grep address
  > done
  Returned address is 0x2aaaaac00000
  Returned address is 0x2aaaaac00000
  Returned address is 0x2aaaaac00000

/proc/PID/maps entries for the mapping are always the same
(except inode number):

  2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 8200              /anon_hugepage (deleted)
  2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 256               /anon_hugepage (deleted)
  2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 7180              /anon_hugepage (deleted)

The reason is the generic hugetlb_get_unmapped_area() function
which is used on x86-64.  It doesn't support randomization and
use bottom-up unmapped area lookup, instead of usual top-down
on x86-64.

x86 has arch-specific hugetlb_get_unmapped_area(), but it's used
only on x86-32.

Let's use arch-specific hugetlb_get_unmapped_area() on x86-64
too. That adds ASLR and switches hugetlb mappings to use top-down
unmapped area lookup:

  %  for i in `seq 3`; do
  > tools/testing/selftests/vm/map_hugetlb | grep address
  > done
  Returned address is 0x7f4f08a00000
  Returned address is 0x7fdda4200000
  Returned address is 0x7febe0000000

/proc/PID/maps entries:

  7f4f08a00000-7f4f18a00000 rw-p 00000000 00:0c 1168              /anon_hugepage (deleted)
  7fdda4200000-7fddb4200000 rw-p 00000000 00:0c 7092              /anon_hugepage (deleted)
  7febe0000000-7febf0000000 rw-p 00000000 00:0c 7183              /anon_hugepage (deleted)

Unmapped area lookup policy for hugetlb mappings is consistent
with normal mappings now -- the only difference is alignment
requirements for huge pages.

libhugetlbfs test-suite didn't detect any regressions with the
patch applied (although it shows few failures on my machine
regardless the patch).
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mel Gorman <mgorman@suse.de>
Link: http://lkml.kernel.org/r/20131119131750.EA45CE0090@blue.fi.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

fd8526ad

x86/mm: Unify pte_to_pgoff() and pgoff_to_pte() helpers · 5305ca10

由 Cyrill Gorcunov 提交于 11月 15, 2013

Use unified pte_bitop() helper to manipulate bits in pte/pgoff
bitfields and convert pte_to_pgoff()/pgoff_to_pte() to inlines.
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

5305ca10

15 11月, 2013 2 次提交

x86, mm: enable split page table lock for PMD level · 9491846f

由 Kirill A. Shutemov 提交于 11月 14, 2013

Enable PMD split page table lock for X86_64 and PAE.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: NAlex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9491846f

ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node · 7b199811

由 Rafael J. Wysocki 提交于 11月 11, 2013

Modify struct acpi_dev_node to contain a pointer to struct acpi_device
associated with the given device object (that is, its ACPI companion
device) instead of an ACPI handle corresponding to it.  Introduce two
new macros for manipulating that pointer in a CONFIG_ACPI-safe way,
ACPI_COMPANION() and ACPI_COMPANION_SET(), and rework the
ACPI_HANDLE() macro to take the above changes into account.
Drop the ACPI_HANDLE_SET() macro entirely and rework its users to
use ACPI_COMPANION_SET() instead.  For some of them who used to
pass the result of acpi_get_child() directly to ACPI_HANDLE_SET()
introduce a helper routine acpi_preset_companion() doing an
equivalent thing.

The main motivation for doing this is that there are things
represented by struct acpi_device objects that don't have valid
ACPI handles (so called fixed ACPI hardware features, such as
power and sleep buttons) and we would like to create platform
device objects for them and "glue" them to their ACPI companions
in the usual way (which currently is impossible due to the
lack of valid ACPI handles).  However, there are more reasons
why it may be useful.

First, struct acpi_device pointers allow of much better type checking
than void pointers which are ACPI handles, so it should be more
difficult to write buggy code using modified struct acpi_dev_node
and the new macros.  Second, the change should help to reduce (over
time) the number of places in which the result of ACPI_HANDLE() is
passed to acpi_bus_get_device() in order to obtain a pointer to the
struct acpi_device associated with the given "physical" device,
because now that pointer is returned by ACPI_COMPANION() directly.
Finally, the change should make it easier to write generic code that
will build both for CONFIG_ACPI set and unset without adding explicit
compiler directives to it.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> # on Haswell
Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com> # for ATA and SDIO part

7b199811

14 11月, 2013 1 次提交

preempt: Make PREEMPT_ACTIVE generic · 00d1a39e

由 Thomas Gleixner 提交于 9月 17, 2013

No point in having this bit defined by architecture.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130917183629.090698799@linutronix.de

00d1a39e

13 11月, 2013 2 次提交

x86: move fpu_counter into ARCH specific thread_struct · c375f15a

由 Vineet Gupta 提交于 11月 12, 2013

Only a couple of arches (sh/x86) use fpu_counter in task_struct so it can
be moved out into ARCH specific thread_struct, reducing the size of
task_struct for other arches.

Compile tested i386_defconfig + gcc 4.7.3
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Cc: Paul Mundt <paul.mundt@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c375f15a

x86/dumpstack: Fix printk_address for direct addresses · 5f01c988

由 Jiri Slaby 提交于 10月 25, 2013

Consider a kernel crash in a module, simulated the following way:

 static int my_init(void)
 {
         char *map = (void *)0x5;
         *map = 3;
         return 0;
 }
 module_init(my_init);

When we turn off FRAME_POINTERs, the very first instruction in
that function causes a BUG. The problem is that we print IP in
the BUG report using %pB (from printk_address). And %pB
decrements the pointer by one to fix printing addresses of
functions with tail calls.

This was added in commit 71f9e598 ("x86, dumpstack: Use
%pB format specifier for stack trace") to fix the call stack
printouts.

So instead of correct output:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000005
  IP: [<ffffffffa01ac000>] my_init+0x0/0x10 [pb173]

We get:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000005
  IP: [<ffffffffa0152000>] 0xffffffffa0151fff

To fix that, we use %pS only for stack addresses printouts (via
newly added printk_stack_address) and %pB for regs->ip (via
printk_address). I.e. we revert to the old behaviour for all
except call stacks. And since from all those reliable is 1, we
remove that parameter from printk_address.
Signed-off-by: NJiri Slaby <jslaby@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: joe@perches.com
Cc: jirislaby@gmail.com
Link: http://lkml.kernel.org/r/1382706418-8435-1-git-send-email-jslaby@suse.czSigned-off-by: NIngo Molnar <mingo@kernel.org>

5f01c988

12 11月, 2013 2 次提交

Revert "x86/UV: Add uvtrace support" · b5dfcb09

由 Ingo Molnar 提交于 11月 11, 2013

This reverts commit 8eba1842.

uv_trace() is not used by anything, nor is uv_trace_nmi_func, nor
uv_trace_func.

That's not how we do instrumentation code in the kernel: we add
tracepoints, printk()s, etc. so that everyone not just those with
magic kernel modules can debug a system.

So remove this unused (and misguied) piece of code.
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Cc: Mike Travis <travis@sgi.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/n/tip-tumfBffmr4jmnt8Gyxanoblg@git.kernel.org

b5dfcb09

x86, trace: Change user|kernel_page_fault to page_fault_user|kernel · a4f61dec

由 H. Peter Anvin 提交于 11月 11, 2013

Tracepoints are named hierachially, and it makes more sense to keep a
general flow of information level from general to specific from left
to right, i.e.

	x86_exceptions.page_fault_user|kernel

rather than

	x86_exceptions.user|kernel_page_fault
Suggested-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NSeiji Aguchi <seiji.aguchi@hds.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/20131111082955.GB12405@gmail.com

a4f61dec

09 11月, 2013 2 次提交

x86, trace: Add page fault tracepoints · d34603b0

由 Seiji Aguchi 提交于 10月 30, 2013

This patch introduces page fault tracepoints to x86 architecture
by switching IDT.

  Two events, for user and kernel spaces, are introduced at the beginning
  of page fault handler for tracing.

  - User space event
    There is a request of page fault event for user space as below.

    https://lkml.kernel.org/r/1368079520-11015-2-git-send-email-fdeslaur+()+gmail+!+com
    https://lkml.kernel.org/r/1368079520-11015-1-git-send-email-fdeslaur+()+gmail+!+com

  - Kernel space event:
    When we measure an overhead in kernel space for investigating performance
    issues, we can check if it comes from the page fault events.
Signed-off-by: NSeiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/52716E67.6090705@hds.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

d34603b0

x86, trace: Delete __trace_alloc_intr_gate() · ac7956e2

由 Seiji Aguchi 提交于 10月 30, 2013

Currently irq vector handlers for tracing are registered in both set_intr_gate()
and __trace_alloc_intr_gate() in alloc_intr_gate().
But, we don't need to do that twice.
So, let's delete __trace_alloc_intr_gate().
Signed-off-by: NSeiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/52716E1B.7090205@hds.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

ac7956e2

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功