Commits · 5034086b72e4e2d42f0db4b4ebb0fe0129ebdeae · openeuler / raspberrypi-kernel

24 Oct, 2011 1 commit

Committed by Takashi Iwai on Oct 23, 2011

Commit 4b239f45 ("x86-64, mm: Put early page table high") causes a S4
regression since 2.6.39, namely the machine reboots occasionally at S4
resume.  It does not always happen; the overall rate is about 1/20.  But,
like other bugs, once it happens, it continues to happen.

This patch fixes the problem by essentially reverting the memory
assignment to the older way.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yinghai.lu@oracle.com>
[ We'll hopefully find the real fix, but that's too late for 3.1 now ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8548c84d

14 Oct, 2011 1 commit

x86, mrst: use a temporary variable for SFI irq · 153b19a3

Committed by Mika Westerberg on Oct 13, 2011

SFI tables reside in RAM and should not be modified once they are
written.  The current code sets pentry->irq to zero, which causes
subsequent reads to fail with an invalid SFI table checksum.  This
breaks kexec, as the second kernel fails to validate the SFI tables.

To fix this, use a temporary variable for the irq number.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

153b19a3
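
Note: a minimal userspace sketch of the idea in 153b19a3, not the kernel code itself. The struct layout, the helper names and the zero-sum checksum convention are assumptions made for illustration; only the 0xFF "no interrupt" value and the use of a temporary variable come from the commit messages in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SFI_IRQ_NONE 0xFF	/* "no interrupt" marker from the SFI spec */

/* Hypothetical, simplified table entry; the real SFI layout differs. */
struct sfi_entry {
	uint8_t irq;
	uint8_t addr;
};

/* Byte-wise sum over the table; assumed here to be zero for a valid table. */
uint8_t table_checksum(const void *table, size_t len)
{
	const uint8_t *p = table;
	uint8_t sum = 0;

	while (len--)
		sum += *p++;
	return sum;
}

/* Derive the irq handed to a driver without writing to the table. */
int entry_irq(const struct sfi_entry *pentry)
{
	int irq = pentry->irq;	/* temporary copy, the table stays untouched */

	if (irq == SFI_IRQ_NONE)
		irq = 0;	/* report "no interrupt" to the driver */
	return irq;
}
```

Because entry_irq() only reads the table, a checksum computed before the call still matches afterwards, which is the property a second (kexec) kernel relies on.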

11 Oct, 2011 1 commit

x86: Default to vsyscall=native for now · 2b666859

Committed by Adrian Bunk on Oct 06, 2011

This UML breakage:

linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790

Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
vsyscall= parameter") - the vsyscall emulation code is not fully cooked
yet as UML relies on some rather fragile SIGSEGV semantics.

Linus suggested in https://lkml.org/lkml/2011/8/9/376 defaulting
to vsyscall=native for now; this patch implements that.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi
Signed-off-by: Ingo Molnar <mingo@elte.hu>

2b666859

07 Oct, 2011 1 commit

x86/PCI: use host bridge _CRS info on ASUS M2V-MX SE · 29cf7a30

Committed by Paul Menzel on Aug 31, 2011

In summary, this DMI quirk uses the _CRS info by default for the ASUS
M2V-MX SE by turning on `pci=use_crs` and is similar to the quirk
added by commit 2491762c ("x86/PCI: use host bridge _CRS info on
ASRock ALiveSATA2-GLAN") whose commit message should be read for further
information.

Since commit 3e3da00c ("x86/pci: AMD one chain system to use pci
read out res") Linux gives the following oops:

    parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
    HDA Intel 0000:20:01.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
    HDA Intel 0000:20:01.0: setting latency timer to 64
    BUG: unable to handle kernel paging request at ffffc90011c08000
    IP: [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
    PGD 13781a067 PUD 13781b067 PMD 1300ba067 PTE 800000fd00000173
    Oops: 0009 [#1] SMP
    last sysfs file: /sys/module/snd_pcm/initstate
    CPU 0
    Modules linked in: snd_hda_intel(+) snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event tpm_tis tpm snd_seq tpm_bios psmouse parport_pc snd_timer snd_seq_device parport processor evdev snd i2c_viapro thermal_sys amd64_edac_mod k8temp i2c_core soundcore shpchp pcspkr serio_raw asus_atk0110 pci_hotplug edac_core button snd_page_alloc edac_mce_amd ext3 jbd mbcache sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif sr_mod cdrom ata_generic uhci_hcd sata_via pata_via libata ehci_hcd usbcore scsi_mod via_rhine mii nls_base [last unloaded: scsi_wait_scan]
    Pid: 1153, comm: work_for_cpu Not tainted 2.6.37-1-amd64 #1 M2V-MX SE/System Product Name
    RIP: 0010:[<ffffffffa0578402>]  [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
    RSP: 0018:ffff88013153fe50  EFLAGS: 00010286
    RAX: ffffc90011c08000 RBX: ffff88013029ec00 RCX: 0000000000000006
    RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ffff88013341d000 R08: 0000000000000000 R09: 0000000000000040
    R10: 0000000000000286 R11: 0000000000003731 R12: ffff88013029c400
    R13: 0000000000000000 R14: 0000000000000000 R15: ffff88013341d090
    FS:  0000000000000000(0000) GS:ffff8800bfc00000(0000) knlGS:00000000f7610ab0
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: ffffc90011c08000 CR3: 0000000132f57000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process work_for_cpu (pid: 1153, threadinfo ffff88013153e000, task ffff8801303c86c0)
    Stack:
     0000000000000005 ffffffff8123ad65 00000000000136c0 ffff88013029c400
     ffff8801303c8998 ffff88013341d000 ffff88013341d090 ffff8801322d9dc8
     ffff88013341d208 0000000000000000 0000000000000000 ffffffff811ad232
    Call Trace:
     [<ffffffff8123ad65>] ? __pm_runtime_set_status+0x162/0x186
     [<ffffffff811ad232>] ? local_pci_probe+0x49/0x92
     [<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
     [<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
     [<ffffffff8105afd0>] ? do_work_for_cpu+0xb/0x1b
     [<ffffffff8105fd3f>] ? kthread+0x7a/0x82
     [<ffffffff8100a824>] ? kernel_thread_helper+0x4/0x10
     [<ffffffff8105fcc5>] ? kthread+0x0/0x82
     [<ffffffff8100a820>] ? kernel_thread_helper+0x0/0x10
    Code: f4 01 00 00 ef 31 f6 48 89 df e8 29 dd ff ff 85 c0 0f 88 2b 03 00 00 48 89 ef e8 b4 39 c3 e0 8b 7b 40 e8 fc 9d b1 e0 48 8b 43 38 <66> 8b 10 66 89 14 24 8b 43 14 83 e8 03 83 f8 01 77 32 31 d2 be
    RIP  [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
     RSP <ffff88013153fe50>
    CR2: ffffc90011c08000
    ---[ end trace 8d1f3ebc136437fd ]---

Trusting the ACPI _CRS information (`pci=use_crs`) fixes this problem.

    $ dmesg | grep -i crs # with the quirk
    PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug

The match has to be against the DMI board entries though since the vendor entries are not populated.

    DMI: System manufacturer System Product Name/M2V-MX SE, BIOS 0304    10/30/2007

This quirk should be removed when `pci=use_crs` is enabled for machines
from 2006 or earlier or some other solution is implemented.

With coreboot [1] on this board the problem does not exist, and this
quirk does not affect it either. To be safe, though, the check is
tightened to only take effect when the BIOS from American Megatrends is
used.

        15:13 < ruik> but coreboot does not need that
        15:13 < ruik> because i have there only one root bus
        15:13 < ruik> the audio is behind a bridge

        $ sudo dmidecode
        BIOS Information
                Vendor: American Megatrends Inc.
                Version: 0304
                Release Date: 10/30/2007

[1] http://www.coreboot.org/

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=30552

Cc: stable@kernel.org (2.6.34)
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

29cf7a30

26 Sep, 2011 2 commits

KVM: x86 emulator: fix Src2CL decode · 9be3be1f

Committed by Avi Kivity on Sep 13, 2011

Src2CL decode (used for double width shifts) erroneously decodes only bit 3
of %rcx, instead of bits 7:0.

Fix by decoding %cl in its entirety.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9be3be1f
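
Note: the difference described in 9be3be1f can be sketched with two masks. This is an illustration in plain C; the function names are hypothetical, and the exact expressions used in the emulator are assumed from the description (bit 3 versus bits 7:0), not quoted from the source.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy decode as described: keeps only bit 3 of %rcx. */
uint64_t decode_src2_cl_buggy(uint64_t rcx)
{
	return rcx & 0x8;
}

/* Fixed decode: %cl is the low byte of %rcx, i.e. bits 7:0. */
uint64_t decode_src2_cl(uint64_t rcx)
{
	return rcx & 0xff;
}
```

For a double-width shift with %rcx = 0x1234, the fixed decode yields a shift count of 0x34, while the buggy one yields 0 because bit 3 of 0x34 is clear.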

KVM: MMU: fix incorrect return of spte · 41bc3186

Committed by Zhao Jin on Sep 19, 2011

__update_clear_spte_slow should return the original spte, while the
current code returns the low half of the original spte combined with the
high half of the new spte.

Signed-off-by: Zhao Jin <cronozhj@gmail.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

41bc3186
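
Note: a simplified model of the behaviour 41bc3186 asks for, assuming a 64-bit spte stored as two 32-bit halves. The type and function names are hypothetical, and the sketch is deliberately non-atomic; the kernel version has ordering requirements that are not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit spte kept as two 32-bit halves. */
struct split_spte {
	uint32_t low;
	uint32_t high;
};

uint64_t join_spte(struct split_spte s)
{
	return ((uint64_t)s.high << 32) | s.low;
}

/* Store new_spte and return the complete old value (not atomic). */
uint64_t update_clear_spte(struct split_spte *sptep, uint64_t new_spte)
{
	struct split_spte orig;

	orig.low = sptep->low;
	orig.high = sptep->high;	/* read the old high half before overwriting */
	sptep->low = (uint32_t)new_spte;
	sptep->high = (uint32_t)(new_spte >> 32);
	return join_spte(orig);
}
```

The point is the order of operations on the high half: it has to be read into the return value before the new value is stored, otherwise the result mixes old and new halves.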

21 Sep, 2011 1 commit

x86/rtc: Don't recursively acquire rtc_lock · 47997d75

Committed by Matt Fleming on Sep 21, 2011

A deadlock was introduced on x86 in commit ef68c8f8 ("x86:
Serialize EFI time accesses on rtc_lock") because efi_get_time()
and friends can be called with rtc_lock already held by
read_persistent_clock(), e.g.:

 timekeeping_init()
    read_persistent_clock()     <-- acquire rtc_lock
        efi_get_time()
            phys_efi_get_time() <-- acquire rtc_lock <DEADLOCK>

To fix this let's push the locking down into the get_wallclock()
and set_wallclock() implementations.  Only the clock
implementations that access the x86 RTC directly need to acquire
rtc_lock, so it makes sense to push the locking down into the
rtc, vrtc and efi code.

The virtualization implementations don't require rtc_lock to be
held because they provide their own serialization.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Avi Kivity <avi@redhat.com> [for the virtualization aspect]
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

47997d75
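
Note: a toy model of the locking change in 47997d75. Everything here is hypothetical (a flag stands in for the spinlock, and a second acquire is recorded instead of spinning forever); it only illustrates why pushing the lock down into the clock implementation removes the recursive acquire.

```c
#include <assert.h>
#include <stdbool.h>

bool rtc_lock_held;	/* stand-in for the non-recursive rtc_lock */
bool deadlocked;	/* set when the lock is taken while already held */

void toy_lock(void)
{
	if (rtc_lock_held)
		deadlocked = true;	/* a real spinlock would spin forever */
	rtc_lock_held = true;
}

void toy_unlock(void)
{
	rtc_lock_held = false;
}

/* Clock implementation that takes the lock itself. */
unsigned long toy_get_wallclock(void)
{
	unsigned long now;

	toy_lock();
	now = 1316563200UL;	/* placeholder for a hardware read */
	toy_unlock();
	return now;
}

/* Old layout: the caller also holds the lock across the call. */
unsigned long read_clock_old(void)
{
	unsigned long now;

	toy_lock();
	now = toy_get_wallclock();	/* second acquire while held */
	toy_unlock();
	return now;
}

/* New layout: only the implementation locks. */
unsigned long read_clock_new(void)
{
	return toy_get_wallclock();
}
```

In this model read_clock_old() trips the deadlock flag and read_clock_new() does not, which mirrors the call chain shown in the commit message.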

16 Sep, 2011 1 commit

asm alternatives: remove incorrect alignment notes · a7f934d4

Committed by Linus Torvalds on Sep 15, 2011

On x86-64, they were just wasteful: with the explicitly added (now
unnecessary) padding, the size of the alternatives structure was 16
bytes, and an alignment of 8 bytes didn't hurt much.

However, it was still silly, since the natural size and alignment for
the structure is actually just 12 bytes, 4-byte aligned since commit
59e97e4d ("x86: Make alternative instruction pointers relative").
So removing the padding, and removing the extra alignment is just a good
idea.

On x86-32, the alignment of 4 bytes was correct, but was incorrectly
hardcoded as 8 bytes in <asm/alternative-asm.h>.  That header file
used to be an x86-64 only header file, but various unification efforts
have made it be used for x86-32 too (i.e. the unification of rwlock and
rwsem).

That in turn caused x86-32 boot failures, because the extra alignment
would result in random zero-filled words in the altinstructions section,
causing oopses early at boot when doing alternative instruction
replacement.

So just remove all the alignment noise entirely.  It's wrong, and it's
unnecessary.  The section itself is already properly aligned by the
linker scripts, and all additions to the section had better be of the
proper 12-byte format, keeping it aligned.  So if the align directive
were to ever make a difference, that would be an indication of a serious
bug to begin with.

Reported-by: Werner Landgraf <w.landgraf@ru.r>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a7f934d4
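
Note: the "12 bytes, 4-byte aligned" claim in a7f934d4 can be checked with a small layout sketch. The field names below are an assumption about how such an entry might look (two relative 32-bit offsets plus small fields); they are not quoted from the kernel headers, and the size checks assume a typical ABI with no extra padding.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed entry layout: two relative offsets plus small fields. */
struct alt_instr {
	int32_t  instr_offset;	/* original instruction, relative to this field */
	int32_t  repl_offset;	/* replacement, relative to this field */
	uint16_t cpuid;		/* feature bit that selects the replacement */
	uint8_t  instrlen;
	uint8_t  replacementlen;
};

_Static_assert(sizeof(struct alt_instr) == 12, "entry is 12 bytes");
_Static_assert(_Alignof(struct alt_instr) == 4, "entry is 4-byte aligned");

/* Resolve a relative pointer: address of the field plus its value. */
const uint8_t *alt_instr_target(const struct alt_instr *a)
{
	return (const uint8_t *)&a->instr_offset + a->instr_offset;
}
```

With this layout, consecutive entries pack back to back at 12-byte strides. An 8-byte `.align` before each entry would insert 4 bytes of zero padding after every other entry, which matches the "random zero-filled words" the message describes.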

15 Sep, 2011 1 commit

xen/i386: follow-up to "replace order-based range checking of M2P table by linear one" · 61cca2fa

Committed by Jan Beulich on Sep 15, 2011

The numbers obtained from the hypervisor really can't ever lead to an
overflow here, only the original calculation going through the order
of the range could have. This avoids the (as Jeremy points out)
somewhat ugly NULL-based calculation here.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

61cca2fa

13 Sep, 2011 1 commit

xen/e820: if there is no dom0_mem=, don't tweak extra_pages. · e3b73c4a

Committed by David Vrabel on Sep 13, 2011

The patch "xen: use maximum reservation to limit amount of usable RAM"
(d312ae87) breaks machines that
do not use the 'dom0_mem=' argument, with:

reserve RAM buffer: 000000133f2e2000 - 000000133fffffff
(XEN) mm.c:4976:d0 Global bit is set to kernel page fffff8117e
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...

The reason being that the last E820 entry is created using the
'extra_pages' (which is based on how many pages have been freed).
The mentioned git commit sets the initial value of 'extra_pages'
using a hypercall which returns the number of pages (if dom0_mem
has been used) or -1 otherwise. In the latter case we return
MAX_DOMAIN_PAGES as the basis for the calculation:

    return min(max_pages, MAX_DOMAIN_PAGES);

and use it:

     extra_limit = xen_get_max_pages();
     if (extra_limit >= max_pfn)
             extra_pages = extra_limit - max_pfn;
     else
             extra_pages = 0;

which means we end up with extra_pages = 128GB in PFNs (33554432)
- 8GB in PFNs (2097152, on this specific box, can be larger or smaller),
and then we add that value to the E820 making it:

  Xen: 00000000ff000000 - 0000000100000000 (reserved)
  Xen: 0000000100000000 - 000000133f2e2000 (usable)

which is clearly wrong. It should look like this:

  Xen: 00000000ff000000 - 0000000100000000 (reserved)
  Xen: 0000000100000000 - 000000027fbda000 (usable)

Naturally this problem does not present itself if dom0_mem=max:X
is used.

CC: stable@kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

e3b73c4a
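
Note: the PFN arithmetic quoted in e3b73c4a can be reproduced in a few lines. This is a sketch under stated assumptions: 4 KiB pages, MAX_DOMAIN_PAGES equal to 128 GiB worth of pages (as the message implies), and hypothetical helper names that model the hypercall result as a plain parameter.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4 KiB pages (assumption) */
#define GIB_TO_PFN(g) ((uint64_t)(g) << (30 - PAGE_SHIFT))
#define MAX_DOMAIN_PAGES GIB_TO_PFN(128)	/* value implied by the message */

/* max_pages < 0 models the "no limit" (-1) result of the hypercall. */
uint64_t get_max_pages(int64_t max_pages)
{
	if (max_pages < 0 || (uint64_t)max_pages > MAX_DOMAIN_PAGES)
		return MAX_DOMAIN_PAGES;
	return (uint64_t)max_pages;
}

/* Same shape as the extra_limit / max_pfn snippet in the message. */
uint64_t extra_pages(uint64_t extra_limit, uint64_t max_pfn)
{
	return extra_limit >= max_pfn ? extra_limit - max_pfn : 0;
}
```

With no dom0_mem= limit on the 8 GiB box from the message, this gives 33554432 - 2097152 = 31457280 extra pages, i.e. 120 GiB of "usable" RAM that does not exist, which is why the E820 entry ends far beyond the real end of memory.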

10 Sep, 2011 1 commit

Fix pointer dereference before call to pcie_bus_configure_settings · 5307f6d5

Committed by Shyam Iyer on Sep 08, 2011

Commit b03e7495 ("PCI: Set PCI-E Max Payload Size on fabric")
introduced a potential NULL pointer dereference in calls to
pcie_bus_configure_settings due to attempts to access pci_bus self
variables when the self pointer is NULL.

To correct this, verify that the self pointer in pci_bus is non-NULL
before dereferencing it.

Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5307f6d5
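
Note: the guard described in 5307f6d5 amounts to checking the bus's self pointer before using it. The structures and the function below are hypothetical stand-ins, not the kernel's definitions, and the field read from the bridge is chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the kernel structures. */
struct pci_dev { int pcie_mpss; };
struct pci_bus { struct pci_dev *self; };

/* Returns 1 if settings were read from the bridge, 0 if there is none. */
int configure_bus(const struct pci_bus *bus, int *mpss_out)
{
	if (!bus->self)		/* no bridge device behind this bus pointer */
		return 0;
	*mpss_out = bus->self->pcie_mpss;
	return 1;
}
```

A caller that passes a bus without a bridge device simply gets 0 back and the output argument is left untouched.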

09 Sep, 2011 1 commit

xen: disable PV spinlocks on HVM · f10cd522

Committed by Stefano Stabellini on Sep 06, 2011

PV spinlocks cannot possibly work with the current code because they are
enabled after pvops patching has already been done, and because PV
spinlocks use a different data structure than native spinlocks so we
cannot switch between them dynamically. A spinlock that has been taken
once by the native code (__ticket_spin_lock) cannot be taken by
__xen_spin_lock even after it has been released.

Reported-and-Tested-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

f10cd522

02 Sep, 2011 2 commits

xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead. · ed467e69

Committed by Konrad Rzeszutek Wilk on Sep 01, 2011

We have hit a couple of customer bugs where they would like to
use those parameters to run a UP kernel - but both of those
options turn off important sources of interrupt information so
we end up not being able to boot. The correct way is to
pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and
the kernel will patch itself to be a UP kernel.

Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308

CC: stable@kernel.org
Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ed467e69

xen: x86_32: do not enable interrupts when returning from exception in interrupt context · d198d499

Committed by Igor Mammedov on Sep 01, 2011

If a vmalloc page_fault happens inside an interrupt handler with interrupts
disabled, then on the exit path from the exception handler, when there are no
pending interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):

	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
	sete XEN_vcpu_info_mask(%eax)

will enable interrupts even if they have been previously disabled according to
eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99)

	testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
	setz XEN_vcpu_info_mask(%eax)

The solution is to set XEN_vcpu_info_mask only when it should be set
according to
	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
but not to clear it if there are no pending events.

Reproducer for bug is attached to RHBZ 707552

CC: stable@kernel.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d198d499
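
Note: a C model of the two instruction sequences discussed in d198d499. It assumes that the pending byte and the mask byte are adjacent, so that the 16-bit compare against 0x0001 is true only when pending is 1 and mask is 0. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the two adjacent bytes read by the 16-bit cmpw. */
struct vcpu_flags {
	uint8_t pending;	/* events waiting to be delivered */
	uint8_t mask;		/* 1 = event delivery (interrupts) disabled */
};

/* cmpw $0x0001: true only when pending == 1 and mask == 0. */
int word_is_0001(const struct vcpu_flags *v)
{
	return v->pending == 1 && v->mask == 0;
}

/* Old exit path: sete always writes its result, so it can clear mask. */
void exit_path_old(struct vcpu_flags *v)
{
	v->mask = (uint8_t)word_is_0001(v);
}

/* Fixed exit path: set mask when needed, never clear it. */
void exit_path_fixed(struct vcpu_flags *v)
{
	if (word_is_0001(v))
		v->mask = 1;
}
```

Starting from "masked, nothing pending" (mask = 1, pending = 0), the old path leaves mask at 0, i.e. interrupts enabled, while the fixed path keeps mask at 1.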

01 Sep, 2011 1 commit

xen: use maximum reservation to limit amount of usable RAM · d312ae87

Committed by David Vrabel on Aug 19, 2011

Use the domain's maximum reservation to limit the amount of extra RAM
for the memory balloon. This reduces the size of the page tables and
the amount of reserved low memory (which defaults to about 1/32 of the
total RAM).

On a system with 8 GiB of RAM with the domain limited to 1 GiB the
kernel reports:

Before:

Memory: 627792k/4472000k available

After:

Memory: 549740k/11132224k available

An increase of about 76 MiB (~1.5% of the unused 7 GiB).  The reserved
low memory is also reduced from 253 MiB to 32 MiB.  The total
additional usable RAM is 329 MiB.

For dom0, this requires a patch to Xen ('x86: use 'dom0_mem' to limit
the number of pages for dom0') (c/s 23790)

CC: stable@kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d312ae87

31 Aug, 2011 1 commit

x86, perf: Check that current->mm is alive before getting user callchain · 20afc60f

Committed by Andrey Vagin on Aug 30, 2011

An event may occur when an mm is already released.

I added an event in dequeue_entity() and caught a panic with
the following backtrace:

[  434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[  434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120
...
[  434.421258] Call Trace:
[  434.421258]  [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0
[  434.421258]  [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90
[  434.421258]  [<ffffffff8101b048>] perf_callchain_user+0x128/0x170
[  434.421258]  [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100
[  434.421258]  [<ffffffff81116690>] perf_prepare_sample+0x200/0x280
[  434.421258]  [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290
[  434.421258]  [<ffffffff81065240>] ? tg_shares_up+0x0/0x670
[  434.421258]  [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0
[  434.421258]  [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0
[  434.421258]  [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250
[  434.421258]  [<ffffffff81119204>] perf_tp_event+0x44/0x70
[  434.421258]  [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110
[  434.421258]  [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0
[  434.421258]  [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60
[  434.421258]  [<ffffffff8105818a>] dequeue_task+0x9a/0xb0
[  434.421258]  [<ffffffff810581e2>] deactivate_task+0x42/0xe0
[  434.421258]  [<ffffffff814bc019>] thread_return+0x191/0x808
[  434.421258]  [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60
[  434.421258]  [<ffffffff8106f4c4>] do_exit+0x464/0x910
[  434.421258]  [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0
[  434.421258]  [<ffffffff8106fa57>] sys_exit_group+0x17/0x20
[  434.421258]  [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

20afc60f

30 Aug, 2011 1 commit

KVM: Fix instruction size issue in pvclock scaling · 3b217116

Committed by Duncan Sands on Aug 30, 2011

Commit de2d1a52 ("KVM: Fix register corruption in pvclock_scale_delta")
introduced a mul instruction that may have only a memory operand; the
assembler therefore cannot select the correct size:

   pvclock.s:229: Error: no instruction mnemonic suffix given and no register
operands; can't size instruction

In this example the assembler is:

         #APP
         mul -48(%rbp) ; shrd $32, %rdx, %rax
         #NO_APP

A simple solution is to use mulq.

Signed-off-by: Duncan Sands <baldrick@free.fr>
Signed-off-by: Avi Kivity <avi@redhat.com>

3b217116
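
Note: the `mul ; shrd $32, %rdx, %rax` pair in 3b217116 computes a 64x64 to 128-bit product and then takes bits 95:32 of it. A portable-ish C sketch of that computation follows; it assumes a compiler with `unsigned __int128` (GCC or Clang on a 64-bit target), and the function name and parameter semantics are assumptions based on the message, not the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Scale delta by 2^shift, then multiply by mul_frac / 2^32. */
uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	unsigned __int128 product;

	assert(shift > -64 && shift < 64);
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* 64-bit operand size is explicit here; in asm that is what mulq states. */
	product = (unsigned __int128)delta * mul_frac;
	return (uint64_t)(product >> 32);	/* the shrd $32 step */
}
```

In C the operand width follows from the types. In the inline assembly the operand is a memory reference, so the width has to come from the mnemonic suffix, which is why `mulq` resolves the assembler error.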

27 Aug, 2011 2 commits

All Arch: remove linkage for sys_nfsservctl system call · f5b94099

Committed by NeilBrown on Aug 26, 2011

The nfsservctl system call is now gone, so we should remove all
linkage for it.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f5b94099

sfi: table irq 0xFF means 'no interrupt' · a94cc4e6

Committed by Kirill A. Shutemov on Aug 26, 2011

According to the SFI specification, irq number 0xFF means the device has no
interrupt or the interrupt is attached via GPIO.

Currently, we don't handle this special case and set the irq field in
*_board_info structs to 255.  This leads to confusion in some drivers.
The accelerometer driver tries to register interrupt 255, fails and prints
"Cannot get IRQ" to dmesg.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a94cc4e6

26 Aug, 2011 2 commits

x86-32: Fix boot with CONFIG_X86_INVD_BUG · b4ca46e4

Committed by Andy Lutomirski on Aug 25, 2011

entry_32.S contained a hardcoded alternative instruction entry, and the
format changed in commit 59e97e4d ("x86: Make alternative
instruction pointers relative").

Replace the hardcoded entry with the altinstruction_entry macro.  This
fixes the 32-bit boot with CONFIG_X86_INVD_BUG=y.

Reported-and-tested-by: Arnaud Lacombe <lacombar@gmail.com>
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b4ca46e4

mtrr: fix UP breakage caused during switch to stop_machine · cbbfa38f

Committed by Tejun Heo on Aug 25, 2011

While removing custom rendezvous code and switching to stop_machine,
commit 192d8857 ("x86, mtrr: use stop_machine APIs for doing MTRR
rendezvous") completely dropped mtrr setting code on !CONFIG_SMP
breaking MTRR setting on UP.

Fix it by removing the incorrect CONFIG_SMP.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Anders Eriksson <aeriksson@fastmail.fm>
Tested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cbbfa38f

25 Aug, 2011 1 commit

xen: use non-tracing preempt in xen_clocksource_read() · f1c39625

Committed by Jeremy Fitzhardinge on Aug 24, 2011

The tracing code used sched_clock() to get tracing timestamps, which
ends up calling xen_clocksource_read().  xen_clocksource_read() must
disable preemption, but if preemption tracing is enabled, this results
in infinite recursion.

I've only noticed this when boot-time tracing tests are enabled, but it
seems like a generic bug.  It looks like it would also affect
kvm_clocksource_read().

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>

f1c39625

24 Aug, 2011 1 commit

x86-32, vdso: On system call restart after SYSENTER, use int $0x80 · 7ca0758c

Committed by H. Peter Anvin on Aug 22, 2011

When we enter a 32-bit system call via SYSENTER or SYSCALL, we shuffle
the arguments to match the int $0x80 calling convention.  This was
probably a design mistake, but it's what it is now.  This causes
errors if the system call has to be restarted.

For SYSENTER, we have to invoke the instruction from the vdso as the
return address is hardcoded.  Accordingly, we can simply replace the
jump in the vdso with an int $0x80 instruction and use the slower
entry point for a post-restart.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/CA%2B55aFztZ=r5wa0x26KJQxvZOaQq8s2v3u50wCyJcA-Sc4g8gQ@mail.gmail.com
Cc: <stable@kernel.org>

7ca0758c

22 Aug, 2011 2 commits

xen/tracing: Fix tracing config option properly · 60c5f08e

Committed by Jeremy Fitzhardinge on Aug 11, 2011

Steven Rostedt says we should use CONFIG_EVENT_TRACING.

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

60c5f08e

xen: Do not enable PV IPIs when vector callback not present · 3c05c4be

Committed by Stefano Stabellini on Aug 17, 2011

Fix regression for HVM case on older (<4.1.1) hypervisors caused by

  commit 99bbb3a8
  Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  Date:   Thu Dec 2 17:55:10 2010 +0000

    xen: PV on HVM: support PV spinlocks and IPIs

This change replaced the SMP operations with event based handlers without
taking into account that this only works when the hypervisor supports
callback vectors. This causes unexplainable hangs early on boot for
HVM guests with more than one CPU.

BugLink: http://bugs.launchpad.net/bugs/791850

CC: stable@kernel.org
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-and-Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

3c05c4be

17 Aug, 2011 2 commits

xen/x86: replace order-based range checking of M2P table by linear one · ccbcdf7c

Committed by Jan Beulich on Aug 16, 2011

The order-based approach is not only less efficient (requiring a shift
and a compare, typical generated code looking like this

	mov	eax, [machine_to_phys_order]
	mov	ecx, eax
	shr	ebx, cl
	test	ebx, ebx
	jnz	...

whereas a direct check requires just a compare, like in

	cmp	ebx, [machine_to_phys_nr]
	jae	...

), but also slightly dangerous in the 32-on-64 case - the element
address calculation can wrap if the next power of two boundary is
sufficiently far away from the actual upper limit of the table, and
hence can result in user space addresses being accessed (with it being
unknown what may actually be mapped there).

Additionally, the elimination of the mistaken use of fls() here (should
have been __fls()) fixes a latent issue on x86-64 that would trigger
if the code was run on a system with memory extending beyond the 44-bit
boundary.

CC: stable@kernel.org
Signed-off-by: Jan Beulich <jbeulich@novell.com>
[v1: Based on Jeremy's feedback]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ccbcdf7c
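
Note: the two range checks compared in ccbcdf7c can be sketched in C as follows. The function names are hypothetical, and the table size used in the discussion after the code is an arbitrary example rather than a value from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest order such that (1 << order) >= nr, for 1 <= nr <= 2^63. */
unsigned int order_for(uint64_t nr)
{
	unsigned int order = 0;

	assert(nr >= 1 && nr <= ((uint64_t)1 << 63));
	while (((uint64_t)1 << order) < nr)
		order++;
	return order;
}

/* Order-based check: accepts everything below the next power of two. */
int in_range_by_order(uint64_t mfn, unsigned int order)
{
	return (mfn >> order) == 0;
}

/* Linear check: accepts exactly the entries the table contains. */
int in_range_linear(uint64_t mfn, uint64_t nr)
{
	return mfn < nr;
}
```

For a table of 0x30000 entries the order is 18, so the order-based check accepts any index below 0x40000. An index such as 0x38000 passes it although the table has no such entry, while the linear check rejects it.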

KVM: uses TASKSTATS, depends on NET · df3d8ae1

Committed by Randy Dunlap on Aug 02, 2011

CONFIG_TASKSTATS just had a change to use netlink, including
a change to "depends on NET". Since "select" does not follow
dependencies, KVM also needs to depend on NET to prevent build
errors when CONFIG_NET is not enabled.

Sample of the reported "undefined reference" build errors:

taskstats.c:(.text+0x8f686): undefined reference to `nla_put'
taskstats.c:(.text+0x8f721): undefined reference to `nla_reserve'
taskstats.c:(.text+0x8f8fb): undefined reference to `init_net'
taskstats.c:(.text+0x8f905): undefined reference to `netlink_unicast'
taskstats.c:(.text+0x8f934): undefined reference to `kfree_skb'
taskstats.c:(.text+0x8f9e9): undefined reference to `skb_clone'
taskstats.c:(.text+0x90060): undefined reference to `__alloc_skb'
taskstats.c:(.text+0x901e9): undefined reference to `skb_put'
taskstats.c:(.init.text+0x4665): undefined reference to `genl_register_family'
taskstats.c:(.init.text+0x4699): undefined reference to `genl_register_ops'
taskstats.c:(.init.text+0x4710): undefined reference to `genl_unregister_ops'
taskstats.c:(.init.text+0x471c): undefined reference to `genl_unregister_family'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

df3d8ae1

16 Aug, 2011 1 commit

x86: fix mm/fault.c build · cedf03bd

Committed by Randy Dunlap on Aug 15, 2011

arch/x86/mm/fault.c needs to include asm/vsyscall.h to fix a
build error:

  arch/x86/mm/fault.c: In function '__bad_area_nosemaphore':
  arch/x86/mm/fault.c:728: error: 'VSYSCALL_START' undeclared (first use in this function)

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cedf03bd

11 Aug, 2011 3 commits

x86-64: Rework vsyscall emulation and add vsyscall= parameter · 3ae36655

Committed by Andy Lutomirski on Aug 10, 2011

There are three choices:

vsyscall=native: Vsyscalls are native code that issues the
corresponding syscalls.

vsyscall=emulate (default): Vsyscalls are emulated by instruction
fault traps, tested in the bad_area path.  The actual contents of
the vsyscall page is the same as the vsyscall=native case except
that it's marked NX.  This way programs that make assumptions about
what the code in the page does will not be confused when they read
that code.

vsyscall=none: Trying to execute a vsyscall will segfault.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/8449fb3abf89851fd6b2260972666a6f82542284.1312988155.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

3ae36655
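
Note: the three vsyscall= choices listed in 3ae36655 map naturally onto a small string-to-enum parser. The sketch below is a userspace illustration; the enum, the function name and the error convention are assumptions, and only the three option strings come from the commit message.

```c
#include <assert.h>
#include <string.h>

enum vsyscall_mode { NATIVE, EMULATE, NONE };

/* Parse the value of vsyscall=; returns 0 on success, -1 if unrecognized. */
int parse_vsyscall_mode(const char *str, enum vsyscall_mode *mode)
{
	if (!str)
		return -1;
	if (!strcmp(str, "native"))
		*mode = NATIVE;
	else if (!strcmp(str, "emulate"))
		*mode = EMULATE;
	else if (!strcmp(str, "none"))
		*mode = NONE;
	else
		return -1;
	return 0;
}
```

An unrecognized value leaves the caller's mode untouched, so whatever default the caller initialised (emulate in this commit) stays in effect.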

x86-64: Wire up getcpu syscall · fce8dc06

Committed by Andy Lutomirski on Aug 10, 2011

getcpu is available as a vdso entry and an emulated vsyscall.
Programs that for some reason don't want to use the vdso should
still be able to call getcpu without relying on the slow emulated
vsyscall. It costs almost nothing to expose it as a real syscall.

We also need this for the following patch in vsyscall=native mode.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/6b19f55bdb06a0c32c2fa6dba9b6f222e1fde999.1312988155.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

fce8dc06

x86: Remove unnecessary compile flag tweaks for vsyscall code · f3fb5b7b

Committed by Andy Lutomirski on Aug 10, 2011

As of commit 98d0ac38
Author: Andy Lutomirski <luto@mit.edu>
Date:   Thu Jul 14 06:47:22 2011 -0400

    x86-64: Move vread_tsc and vread_hpet into the vDSO

user code no longer directly calls into code in arch/x86/kernel/, so
we don't need compile flag hacks to make it safe.  All vdso code is
in the vdso directory now.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/835cd05a4c7740544d09723d6ba48f4406f9826c.1312988155.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

f3fb5b7b

09 Aug, 2011 1 commit

perf, x86: Add model 45 SandyBridge support · a34668f6

Committed by Youquan Song on Aug 02, 2011

Add support for Romley-EP SandyBridge.

Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Anhua Xu <anhua.xu@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1312264895-2010-1-git-send-email-youquan.song@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a34668f6

06 Aug, 2011 2 commits

x86, UV: Remove UV delay in starting slave cpus · 05e33fc2

Committed by Jack Steiner on Aug 05, 2011

Delete the 10 msec delay between the INIT and SIPI when starting
slave cpus. I can find no requirement for this delay. BIOS also
has similar code sequences without the delay.

Removing the delay reduces boot time by 40 sec. Every bit helps.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/20110805140900.GA6774@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

05e33fc2

x86, olpc: Wait for last byte of EC command to be accepted · a3ea14df

Committed by Paul Fox on Jul 26, 2011

When executing EC commands, only waiting when there are still
more bytes to write is usually fine. However, if the system
suspends very quickly after a call to olpc_ec_cmd(), the last
data byte may not yet be transferred to the EC, and the command
will not complete.

This solves a bug where the SCI wakeup mask was not correctly
written when going into suspend.

It means that sometimes, on XO-1.5 (but not XO-1), the
devices that were marked as wakeup sources can't wake up
the system. e.g. you ask for wifi wakeups, suspend, but then
incoming wifi frames don't wake up the system as they should.
Signed-off-by: Paul Fox <pgf@laptop.org>
Signed-off-by: Daniel Drake <dsd@laptop.org>
Acked-by: Andres Salomon <dilinger@queued.net>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a3ea14df
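
The ordering problem described in a3ea14df can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct fake_ec`, `fake_ec_write()`, `wait_ibf_clear()` and both send functions are hypothetical names invented here, not the kernel's olpc_ec_cmd() code. It contrasts waiting only before each byte (so nothing waits for the last one) with waiting after every byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the EC: a one-byte input buffer and a counter. */
struct fake_ec {
    bool ibf;        /* input buffer full: EC has not consumed the byte yet */
    size_t accepted; /* number of bytes the EC has consumed so far */
};

/* Hand one byte to the (simulated) EC. */
static void fake_ec_write(struct fake_ec *ec, uint8_t byte)
{
    (void)byte;
    ec->ibf = true;
}

/* Poll until the input buffer is empty; 0 on success, -1 on timeout. */
static int wait_ibf_clear(struct fake_ec *ec, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (!ec->ibf)
            return 0;
        /* Simulate the EC consuming the pending byte while we poll. */
        ec->ibf = false;
        ec->accepted++;
    }
    return ec->ibf ? -1 : 0;
}

/* Waits only when another byte is about to be written. */
static int send_wait_before(struct fake_ec *ec, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (wait_ibf_clear(ec, 100))
            return -1;
        fake_ec_write(ec, buf[i]);
    }
    return 0; /* the last byte may still be pending here */
}

/* Waits after every byte, including the last one. */
static int send_wait_after(struct fake_ec *ec, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        fake_ec_write(ec, buf[i]);
        if (wait_ibf_clear(ec, 100))
            return -1;
    }
    return 0; /* every byte has been accepted */
}
```

In this model, send_wait_before() returns while the final byte is still sitting in the input buffer, which is the window in which a fast suspend could lose it; send_wait_after() returns only once the buffer is empty.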

05 Aug, 2011 6 commits

xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set · c00c8aa2

Committed by Konrad Rzeszutek Wilk on Aug 04, 2011

With CONFIG_XEN and CONFIG_FTRACE set we get this:

arch/x86/xen/trace.c:22: error: ‘__HYPERVISOR_console_io’ undeclared here (not in a function)
arch/x86/xen/trace.c:22: error: array index in initializer not of integer type
arch/x86/xen/trace.c:22: error: (near initialization for ‘xen_hypercall_names’)
arch/x86/xen/trace.c:23: error: ‘__HYPERVISOR_physdev_op_compat’ undeclared here (not in a function)

The issue was that the __HYPERVISOR_* definitions were not pulled
in if CONFIG_XEN_PRIVILEGED_GUEST was not set.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

c00c8aa2

x86-64: Add vsyscall:emulate_vsyscall trace event · c149a665

Committed by Andy Lutomirski on Aug 03, 2011

Vsyscall emulation is slow, so make it easy to track down.
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/cdaad7da946a80b200df16647c1700db3e1171e9.1312378163.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

c149a665

x86-64: Add user_64bit_mode paravirt op · 318f5a2a

Committed by Andy Lutomirski on Aug 03, 2011

Three places in the kernel assume that the only long mode CPL 3
selector is __USER_CS.  This is not true on Xen -- Xen's sysretq
changes cs to the magic value 0xe033.

Two of the places are corner cases, but as of "x86-64: Improve
vsyscall emulation CS and RIP handling"
(c9712944), vsyscalls will segfault
if called with Xen's extra CS selector.  This causes a panic when
older init builds die.

It seems impossible to make Xen use __USER_CS reliably without
taking a performance hit on every system call, so this fixes the
tests instead with a new paravirt op.  It's a little ugly because
ptrace.h can't include paravirt.h.
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/f4fcb3947340d9e96ce1054a432f183f9da9db83.1312378163.git.luto@mit.edu
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

318f5a2a
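
The selector test that 318f5a2a describes can be sketched in user space as follows. This is a simplified model, not the kernel's implementation: `struct cpu_env`, its `extra_user_64bit_cs` field and the two constants are names chosen here for illustration, and the value 0x33 for the native 64-bit user code selector is an assumption; 0xe033 is the Xen value quoted in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the native 64-bit user code selector (__USER_CS). */
#define NATIVE_USER_CS 0x33u
/* Selector that Xen's sysretq leaves in cs, per the commit message. */
#define XEN_USER_CS64  0xe033u

/* Hypothetical per-platform description; 0 means "no extra selector". */
struct cpu_env {
    uint16_t extra_user_64bit_cs;
};

/*
 * Return true if cs denotes 64-bit user mode: either the native
 * selector, or the platform's extra selector when one is configured.
 */
static bool user_64bit_mode(uint16_t cs, const struct cpu_env *env)
{
    if (cs == NATIVE_USER_CS)
        return true;
    return env->extra_user_64bit_cs != 0 &&
           cs == env->extra_user_64bit_cs;
}
```

With an environment whose extra selector is 0, only the native selector passes; with the extra selector set to XEN_USER_CS64, both are accepted, which is the behaviour the vsyscall check needs under Xen.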

x86-64, xen: Enable the vvar mapping · 5d5791af

Committed by Andy Lutomirski on Aug 03, 2011

Xen needs to handle VVAR_PAGE, introduced in git commit:
9fd67b4e
x86-64: Give vvars their own page

Otherwise we die during bootup with a message like:

(XEN) mm.c:940:d10 Error getting mfn 1888 (pfn 1e3e48) from L1 entry
      8000000001888465 for l1e_owner=10, pg_owner=10
(XEN) mm.c:5049:d10 ptwr_emulate: could not get_page_from_l1e()
[    0.000000] BUG: unable to handle kernel NULL pointer dereference at (null)
[    0.000000] IP: [<ffffffff8103a930>] xen_set_pte+0x20/0xe0
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/4659478ed2f3480938f96491c2ecbe2b2e113a23.1312378163.git.luto@mit.edu
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

5d5791af

x86-64: Work around gold bug 13023 · f670bb76

Committed by Andy Lutomirski on Aug 03, 2011

Gold has trouble assigning numbers to the location counter inside of
an output section description.  The bug was triggered by
9fd67b4e, which consolidated all of
the vsyscall sections into a single section.  The workaround is IMO
still nicer than the old way of doing it.

This produces an apparently valid kernel image and passes my vdso
tests on both GNU ld version 2.21.51.0.6-2.fc15 20110118 and GNU
gold (version 2.21.51.0.6-2.fc15 20110118) 1.10 as distributed by
Fedora 15.
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/0b260cb806f1f9a25c00ce8377a5f035d57f557a.1312378163.git.luto@mit.edu
Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

f670bb76

x86-64: Move the "user" vsyscall segment out of the data segment. · 9c40818d

Committed by Andy Lutomirski on Aug 03, 2011

The kernel's loader doesn't seem to care, but gold complains.
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/f0716870c297242a841b949953d80c0d87bf3d3f.1312378163.git.luto@mit.edu
Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

9c40818d