提交 · 09ec54429c6d10f87d1f084de53ae2c1c3a81108 · openanolis / cloud-kernel

24 7月, 2014 4 次提交

clocksource: Move cycle_last validation to core code · 09ec5442

由 Thomas Gleixner 提交于 7月 16, 2014

The only user of the cycle_last validation is the x86 TSC. In order to
provide NMI safe accessor functions for clock monotonic and
monotonic_raw we need to do that in the core.

We can't do the TSC specific

    if (now < cycle_last)
       	    now = cycle_last;

for the other wrapping around clocksources, but TSC has
CLOCKSOURCE_MASK(64) which actually does not mask out anything so if
now is less than cycle_last the subtraction will give a negative
result. So we can check for that in clocksource_delta() and return 0
for that case.

Implement and enable it for x86
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

09ec5442

x86: kvm: Make kvm_get_time_and_clockread() nanoseconds based · cbcf2dd3

由 Thomas Gleixner 提交于 7月 16, 2014

Convert the relevant base data right away to nanoseconds instead of
doing the conversion on every readout. Reduces text size by 160 bytes.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

cbcf2dd3

x86: kvm: Use ktime_get_boot_ns() · bb0b5812

由 Thomas Gleixner 提交于 7月 16, 2014

Use the new nanoseconds based interface and get rid of the timespec
conversion dance.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

bb0b5812

ktime: Kill non-scalar ktime_t implementation for 2038 · 24e4a8c3

由 John Stultz 提交于 7月 16, 2014

The non-scalar ktime_t implementation is basically a timespec
which has to be changed to support dates past 2038 on 32bit
systems.

This patch removes the non-scalar ktime_t implementation, forcing
the scalar s64 nanosecond version on all architectures.

This may have additional performance overhead on some 32bit
systems when converting between ktime_t and timespec structures,
however the majority of 32bit systems (arm and i386) were already
using scalar ktime_t, so no performance regressions will be seen
on those platforms.

On affected platforms, I'm open to finding optimizations, including
avoiding converting to timespecs where possible.

[ tglx: We can now cleanup the ktime_t.tv64 mess, but thats a
  different issue and we can throw a coccinelle script at it ]
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

24e4a8c3

11 7月, 2014 2 次提交

x86-32, vdso: Fix vDSO build error due to missing align_vdso_addr() · d093601b

由 Jan Beulich 提交于 7月 03, 2014

Relying on static functions used just once to get inlined (and
subsequently have dead code paths eliminated) is wrong: Compilers are
free to decide whether they do this, regardless of optimization level.
With this not happening for vdso_addr() (observed with gcc 4.1.x), an
unresolved reference to align_vdso_addr() causes the build to fail.

[ hpa: vdso_addr() is never actually used on x86-32, as calculate_addr
  in map_vdso() is always false.  It ought to be possible to clean
  this up further, but this fixes the immediate problem. ]
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/53B5863B02000078000204D5@mail.emea.novell.comAcked-by: NAndy Lutomirski <luto@amacapital.net>
Tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

d093601b

x86-64, vdso: Fix vDSO build breakage due to empty .rela.dyn · 9f88b906

由 Jan Beulich 提交于 7月 03, 2014

Certain ld versions (observed with 2.20.0) put an empty .rela.dyn
section into shared object files, breaking the assumption on the number
of sections to be copied to the final output. Simply discard any empty
SHT_REL and SHT_RELA sections to address this.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/53B5861E02000078000204D1@mail.emea.novell.comAcked-by: NAndy Lutomirski <luto@amacapital.net>
Tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

9f88b906

04 7月, 2014 1 次提交

ptrace,x86: force IRET path after a ptrace_stop() · b9cd18de

由 Tejun Heo 提交于 7月 03, 2014

The 'sysret' fastpath does not correctly restore even all regular
registers, much less any segment registers or reflags values.  That is
very much part of why it's faster than 'iret'.

Normally that isn't a problem, because the normal ptrace() interface
catches the process using the signal handler infrastructure, which
always returns with an iret.

However, some paths can get caught using ptrace_event() instead of the
signal path, and for those we need to make sure that we aren't going to
return to user space using 'sysret'.  Otherwise the modifications that
may have been done to the register set by the tracer wouldn't
necessarily take effect.

Fix it by forcing IRET path by setting TIF_NOTIFY_RESUME from
arch_ptrace_stop_needed() which is invoked from ptrace_stop().
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NAndy Lutomirski <luto@amacapital.net>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b9cd18de

30 6月, 2014 1 次提交

KVM: SVM: Fix CPL export via SS.DPL · 33b458d2

由 Jan Kiszka 提交于 6月 29, 2014

We import the CPL via SS.DPL since ae9fedc7. However, we fail to
export it this way so far. This caused spurious guest crashes, e.g. of
Linux when accessing the vmport from guest user space which triggered
register saving/restoring to/from host user space.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

33b458d2

25 6月, 2014 3 次提交

crypto: sha512_ssse3 - fix byte count to bit count conversion · cfe82d4f

由 Jussi Kivilinna 提交于 6月 23, 2014

Byte-to-bit-count computation is only partly converted to big-endian and is
mixing in CPU-endian values. Problem was noticed by sparce with warning:

CHECK arch/x86/crypto/sha512_ssse3_glue.c
arch/x86/crypto/sha512_ssse3_glue.c:144:19: warning: restricted __be64 degrades to integer
arch/x86/crypto/sha512_ssse3_glue.c:144:17: warning: incorrect type in assignment (different base types)
arch/x86/crypto/sha512_ssse3_glue.c:144:17: expected restricted __be64 <noident>
arch/x86/crypto/sha512_ssse3_glue.c:144:17: got unsigned long long

Cc: <stable@vger.kernel.org>
Signed-off-by: NJussi Kivilinna <jussi.kivilinna@iki.fi>
Acked-by: NTim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

cfe82d4f

x86/vdso: Error out in vdso2c if DT_RELA is present · 6a89d710

由 Andy Lutomirski 提交于 6月 24, 2014

vdso2c was checking for various types of relocations to detect when
the vdso had undefined symbols or was otherwise dependent on
relocation at load time.  Undefined symbols in the vdso would fail if
accessed at runtime, and certain implementation errors (e.g. branch
profiling or incorrect symbol visibilities) could result in data
access through the GOT that requires relocations.  This could be
as simple as:

    extern char foo;
    return foo;

Without some kind of visibility control, the compiler would assume
that foo could be interposed at load time and would generate a
relocation.

x86-64 and x32 (as opposed to i386) use explicit-addent (RELA) instead
of implicit-addent (REL) relocations for data access, and vdso2c
forgot to detect those.

Whether these bad relocations would actually fail at runtime depends
on what the linker sticks in the unrelocated references.  Nonetheless,
these relocations have no business existing in the vDSO and should be
fixed rather than silently ignored.

This error could trigger on some configurations due to branch
profiling.  The previous patch fixed that.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/74ef0c00b4d2a3b573e00a4113874e62f772e348.1403642755.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

6a89d710

x86/vdso: Move DISABLE_BRANCH_PROFILING into the vdso makefile · 46b57a76

由 Andy Lutomirski 提交于 6月 24, 2014

DISABLE_BRANCH_PROFILING turns off branch profiling (i.e. a
redefinition of 'if'). Branch profiling depends on a bunch of
kernel-internal symbols and generates extra output sections, none of
which are useful or functional in the vDSO.

It's currently turned off for vclock_gettime.c, but vgetcpu.c also
triggers branch profiling, so just turn it off in the makefile.

This fixes the build on some configurations: the vdso could contain
undefined symbols, and the fake section table overflowed due to
ftrace's added sections.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/bf1ec29e03b2bbc081f6dcaefa64db1c3a83fb21.1403642755.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

46b57a76

24 6月, 2014 3 次提交

nmi: provide the option to issue an NMI back trace to every cpu but current · f3aca3d0

由 Aaron Tomlin 提交于 6月 23, 2014

Sometimes it is preferred not to use the trigger_all_cpu_backtrace()
routine when one wants to avoid capturing a back trace for current.  For
instance if one was previously captured recently.

This patch provides a new routine namely
trigger_allbutself_cpu_backtrace() which offers the flexibility to issue
an NMI to every cpu but current and capture a back trace accordingly.

Patch x86 and sparc to support new routine.

[dzickus@redhat.com: add stub in #else clause]
[dzickus@redhat.com: don't print message in single processor case, wrap with get/put_cpu based on Oleg's suggestion]
[sfr@canb.auug.org.au: undo C99ism]
Signed-off-by: NAaron Tomlin <atomlin@redhat.com>
Signed-off-by: NDon Zickus <dzickus@redhat.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f3aca3d0

x86_32, signal: Fix vdso rt_sigreturn · 6ba19a67

由 Andy Lutomirski 提交于 6月 21, 2014

This commit:

    commit 6f121e54
    Author: Andy Lutomirski <luto@amacapital.net>
    Date:   Mon May 5 12:19:34 2014 -0700

        x86, vdso: Reimplement vdso.so preparation in build-time C

Contained this obvious typo:

-               restorer = VDSO32_SYMBOL(current->mm->context.vdso, rt_sigreturn);
+               restorer = current->mm->context.vdso +
+                       selected_vdso32->sym___kernel_sigreturn;

Note the missing 'rt_' in the new code.  Fix it.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/1eb40ad923acde2e18357ef2832867432e70ac42.1403361010.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

6ba19a67

x86_32, entry: Do syscall exit work on badsys (CVE-2014-4508) · 554086d8

由 Andy Lutomirski 提交于 6月 23, 2014

The bad syscall nr paths are their own incomprehensible route
through the entry control flow.  Rearrange them to work just like
syscalls that return -ENOSYS.

This fixes an OOPS in the audit code when fast-path auditing is
enabled and sysenter gets a bad syscall nr (CVE-2014-4508).

This has probably been broken since Linux 2.6.27:
af0575bb i386 syscall audit fast-path

Cc: stable@vger.kernel.org
Cc: Roland McGrath <roland@redhat.com>
Reported-by: NToralf Förster <toralf.foerster@gmx.de>
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/e09c499eade6fc321266dd6b54da7beb28d6991c.1403558229.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

554086d8

21 6月, 2014 1 次提交

x86/vdso: Create .build-id links for unstripped vdso files · dda1e95c

由 Andy Lutomirski 提交于 6月 20, 2014

With this change, doing 'make vdso_install' and telling gdb:

set debug-file-directory /lib/modules/KVER/vdso

will enable vdso debugging with symbols.  This is useful for
testing, but kernel RPM builds will probably want to manually delete
these symlinks or otherwise do something sensible when they strip
the vdso/*.so files.

If ld does not support --build-id, then the symlinks will not be
created.

Note that kernel packagers that use vdso_install may need to adjust
their packaging scripts to accomdate this change.  For example,
Fedora's scripts create build-id symlinks themselves in a different
location, so the spec should probably be updated to remove the
symlinks created by make vdso_install.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/a424b189ce3ced85fe1e82d032a20e765e0fe0d3.1403291930.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

dda1e95c

20 6月, 2014 4 次提交

x86/vdso: Remove some redundant in-memory section headers · 0e3727a8

由 Andy Lutomirski 提交于 6月 18, 2014

.data doesn't need to be separate from .rodata: they're both readonly.

.altinstructions and .altinstr_replacement aren't needed by anything
except vdso2c; strip them from the final image.

While we're at it, rather than aligning the actual executable text,
just shove some unused-at-runtime data in between real data and
text.

My vdso image is still above 4k, but I'm disinclined to try to
trim it harder for 3.16.  For future trimming, I suspect that these
sections could be moved to later in the file and dropped from
the in-memory image:

.gnu.version and .gnu.version_d   (this may lose versions in gdb)
.eh_frame                         (should be harmless)
.eh_frame_hdr                     (I'm not really sure)
.hash                             (AFAIK nothing needs this section header)
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/2e96d0c49016ea6d026a614ae645e93edd325961.1403129369.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

0e3727a8

x86/vdso: Improve the fake section headers · bfad381c

由 Andy Lutomirski 提交于 6月 18, 2014

Fully stripping the vDSO has other unfortunate side effects:

 - binutils is unable to find ELF notes without a SHT_NOTE section.

 - Even elfutils has trouble: it can find ELF notes without a section
   table at all, but if a section table is present, it won't look for
   PT_NOTE.

 - gdb wants section names to match between stripped DSOs and their
   symbols; otherwise it will corrupt symbol addresses.

We're also breaking the rules: section 0 is supposed to be SHT_NULL.

Fix these problems by building a better fake section table.  While
we're at it, we might as well let buggy Go versions keep working well
by giving the SHT_DYNSYM entry the correct size.

This is a bit unfortunate: it adds quite a bit of size to the vdso
image.

If/when binutils improves and the improved versions become widespread,
it would be worth considering dropping most of this.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/0e546a5eeaafdf1840e6ee654a55c1e727c26663.1403129369.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

bfad381c

x86/vdso2c: Use better macros for ELF bitness · c1979c37

由 Andy Lutomirski 提交于 6月 18, 2014

Rather than using a separate macro for each replacement, use generic
macros.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/d953cd2e70ceee1400985d091188cdd65fba2f05.1403129369.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

c1979c37

x86/vdso: Discard the __bug_table section · 5f56e716

由 Andy Lutomirski 提交于 6月 18, 2014

It serves no purpose in user code.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/2a5bebff42defd8a5e81d96f7dc00f21143c80e8.1403129369.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

5f56e716

19 6月, 2014 3 次提交

KVM: x86: preserve the high 32-bits of the PAT register · 7cb060a9

由 Paolo Bonzini 提交于 6月 19, 2014

KVM does not really do much with the PAT, so this went unnoticed for a
long time.  It is exposed however if you try to do rdmsr on the PAT
register.
Reported-by: NValentine Sinitsyn <valentine.sinitsyn@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7cb060a9

kvm: fix wrong address when writing Hyper-V tsc page · e1fa108d

由 Xiaoming Gao 提交于 6月 19, 2014

When kvm_write_guest writes the tsc_ref structure to the guest, or it will lead
the low HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT bits of the TSC page address
must be cleared, or the guest can see a non-zero sequence number.

Otherwise Windows guests would not be able to get a correct clocksource
(QueryPerformanceCounter will always return 0) which causes serious chaos.
Signed-off-by: NXiaoming Gao <newtongao@tencnet.com>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e1fa108d

KVM: x86: Increase the number of fixed MTRR regs to 10 · 682367c4

由 Nadav Amit 提交于 6月 18, 2014

Recent Intel CPUs have 10 variable range MTRRs. Since operating systems
sometime make assumptions on CPUs while they ignore capability MSRs, it is
better for KVM to be consistent with recent CPUs. Reporting more MTRRs than
actually supported has no functional implications.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

682367c4

18 6月, 2014 1 次提交

x86/xen: no need to explicitly register an NMI callback · ea9f9274

由 David Vrabel 提交于 6月 16, 2014

Remove xen_enable_nmi() to fix a 64-bit guest crash when registering
the NMI callback on Xen 3.1 and earlier.

It's not needed since the NMI callback is set by a set_trap_table
hypercall (in xen_load_idt() or xen_write_idt_entry()).

It's also broken since it only set the current VCPU's callback.
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Reported-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: NVitaly Kuznetsov <vkuznets@redhat.com>

ea9f9274

17 6月, 2014 1 次提交

x86, kaslr: boot-time selectable with hibernation · 24f2e027

由 Kees Cook 提交于 6月 13, 2014

Changes kASLR from being compile-time selectable (blocked by
CONFIG_HIBERNATION), to being boot-time selectable (with hibernation
available by default) via the "kaslr" kernel command line.
Signed-off-by: NKees Cook <keescook@chromium.org>
Acked-by: NPavel Machek <pavel@ucw.cz>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

24f2e027

14 6月, 2014 2 次提交

x86/kprobes: Fix build errors and blacklist context_track_user · 4cdf77a8

由 Masami Hiramatsu 提交于 6月 14, 2014

This essentially reverts commit:

  ecd50f71 ("kprobes, x86: Call exception_enter after kprobes handled")

since it causes build errors with CONFIG_CONTEXT_TRACKING and
that has been made from misunderstandings;
context_track_user_*() don't involve much in interrupt context,
it just returns if in_interrupt() is true.

Instead of changing the do_debug/int3(), this just adds
context_track_user_*() to kprobes blacklist, since those are
still can be called right before kprobes handles int3 and debug
exceptions, and probing those will cause an infinite loop.
Reported-by: NFrederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20140614064711.7865.45957.stgit@kbuild-fedora.novalocalSigned-off-by: NIngo Molnar <mingo@kernel.org>

4cdf77a8

x86/vdso: Fix vdso_install · a934fb5b

由 Andy Lutomirski 提交于 6月 12, 2014

"make vdso_install" installs unstripped versions of the vdso objects
for the benefit of the debugger. This was broken by checkin:

6f121e54 x86, vdso: Reimplement vdso.so preparation in build-time C

The filenames are different now, so update the Makefile to cope.

This still installs the 64-bit vdso as vdso64.so. We believe this
will be okay, as the only known user is a patched gdb which is known
to use build-ids, but if it turns out to be a problem we may have to
add a link.

Inspired by a patch from Sam Ravnborg.
Acked-by: NSam Ravnborg <sam@ravnborg.org>
Reported-by: NJosh Boyer <jwboyer@fedoraproject.org>
Tested-by: NJosh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/b10299edd8ba98d17e07dafcd895b8ecf4d99eff.1402586707.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

a934fb5b

13 6月, 2014 2 次提交

x86/vdso: Hack to keep 64-bit Go programs working · e0bf7b86

由 Andy Lutomirski 提交于 6月 12, 2014

The Go runtime has a buggy vDSO parser that currently segfaults.
This writes an empty SHT_DYNSYM entry that causes Go's runtime to
malfunction by thinking that the vDSO is empty rather than
malfunctioning by running off the end and segfaulting.

This affects x86-64 only as far as we know, so we do not need this for
the i386 and x32 vdsos.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/d10618176c4bd39b457a5e85c497295c90cab1bc.1402620737.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

e0bf7b86

x86/vdso: Add PUT_LE to store little-endian values · b4b31f61

由 Andy Lutomirski 提交于 6月 12, 2014

Add PUT_LE() by analogy with GET_LE() to write littleendian values in
addition to reading them.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/3d9b27e92745b27b6fda1b9a98f70dc9c1246c7a.1402620737.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

b4b31f61

11 6月, 2014 3 次提交

net: filter: cleanup A/X name usage · e430f34e

由 Alexei Starovoitov 提交于 6月 06, 2014

The macro 'A' used in internal BPF interpreter:
 #define A regs[insn->a_reg]
was easily confused with the name of classic BPF register 'A', since
'A' would mean two different things depending on context.

This patch is trying to clean up the naming and clarify its usage in the
following way:

- A and X are names of two classic BPF registers

- BPF_REG_A denotes internal BPF register R0 used to map classic register A
  in internal BPF programs generated from classic

- BPF_REG_X denotes internal BPF register R7 used to map classic register X
  in internal BPF programs generated from classic

- internal BPF instruction format:
struct sock_filter_int {
        __u8    code;           /* opcode */
        __u8    dst_reg:4;      /* dest register */
        __u8    src_reg:4;      /* source register */
        __s16   off;            /* signed offset */
        __s32   imm;            /* signed immediate constant */
};

- BPF_X/BPF_K is 1 bit used to encode source operand of instruction
In classic:
  BPF_X - means use register X as source operand
  BPF_K - means use 32-bit immediate as source operand
In internal:
  BPF_X - means use 'src_reg' register as source operand
  BPF_K - means use 32-bit immediate as source operand
Suggested-by: NChema Gonzalez <chema@google.com>
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Acked-by: NDaniel Borkmann <dborkman@redhat.com>
Acked-by: NChema Gonzalez <chema@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e430f34e

x86, vdso: Remove one final use of htole16() · 4d048b02

由 H. Peter Anvin 提交于 6月 10, 2014

One final use of the macros from <endian.h> which are not available on
older system.  In this case we had one sole case of *writing* a
littleendian number, but the number is SHN_UNDEF which is the constant
zero, so rather than dealing with the general case of littleendian
puts here, just document that the constant is zero and be done with
it.
Reported-and-Tested-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/20140610135051.c3c34165f73d67d218b62bd9@linux-foundation.org

4d048b02

x86: intel-mid: add watchdog platform code for Merrifield · 78a3bb9e

由 David Cohen 提交于 4月 15, 2014

This patch adds platform code for Intel Merrifield.
Since the watchdog is not part of SFI table, we have no other option but
to manually register watchdog's platform device (argh!).
Signed-off-by: NDavid Cohen <david.a.cohen@linux.intel.com>
Reviewed-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NWim Van Sebroeck <wim@iguana.be>

78a3bb9e

09 6月, 2014 1 次提交

Revert "x86/smpboot: Initialize secondary CPU only if master CPU will wait for it" · bb077d60

由 Linus Torvalds 提交于 6月 08, 2014

This reverts commit 3e1a878b.

It came in very late, and already has one reported failure: Sitsofe
reports that the current tree fails to boot on his EeePC, and bisected
it down to this.  Rather than waste time trying to figure out what's
wrong, just revert it.
Reported-by: NSitsofe Wheeler <sitsofe@gmail.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bb077d60

08 6月, 2014 1 次提交

x86/boot: EFI_MIXED should not prohibit loading above 4G · 745c5167

由 Matt Fleming 提交于 6月 07, 2014

commit 7d453eee ("x86/efi: Wire up CONFIG_EFI_MIXED") introduced a
regression for the functionality to load kernels above 4G. The relevant
(incorrect) reasoning behind this change can be seen in the commit
message,

  "The xloadflags field in the bzImage header is also updated to reflect
  that the kernel supports both entry points by setting both of
  XLF_EFI_HANDOVER_32 and XLF_EFI_HANDOVER_64 when CONFIG_EFI_MIXED=y.
  XLF_CAN_BE_LOADED_ABOVE_4G is disabled so that the kernel text is
  guaranteed to be addressable with 32-bits."

This is obviously bogus since 32-bit EFI loaders will never place the
kernel above the 4G mark. So this restriction is entirely unnecessary.

But things are worse than that - since we want to encourage people to
always compile with CONFIG_EFI_MIXED=y so that their kernels work out of
the box for both 32-bit and 64-bit firmware, commit 7d453eee
effectively disables XLF_CAN_BE_LOADED_ABOVE_4G completely.

Remove the overzealous and superfluous restriction and restore the
XLF_CAN_BE_LOADED_ABOVE_4G functionality.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1402140380-15377-1-git-send-email-matt@console-pimps.orgSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

745c5167

07 6月, 2014 2 次提交

signals: kill sigfindinword() · 36fac0a2

由 Oleg Nesterov 提交于 6月 06, 2014

It has no users and it doesn't look useful.  I do not know why/when it was
introduced, I can't even find any user in the git history.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

36fac0a2

x86, vdso: Use <tools/le_byteshift.h> for littleendian access · bdfb9bcc

由 H. Peter Anvin 提交于 6月 06, 2014

There are no standard functions for littleendian data (unlike
bigendian data.)  Thus, use <tools/le_byteshift.h> to access
littleendian data members.  Those are fairly inefficient, but it
doesn't matter for this purpose (and can be optimized later.)  This
avoids portability problems.
Reported-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Tested-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/20140606140017.afb7f91142f66cb3dd13c186@linux-foundation.org

bdfb9bcc

06 6月, 2014 1 次提交

x86, locking/rwlocks: Enable qrwlocks on x86 · bd01ec1a

由 Waiman Long 提交于 2月 03, 2014

Make x86 use the fair rwlock_t.

Implement the custom queue_write_unlock() for best performance.
Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
[peterz: near complete rewrite]
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-r1xuzmdysvuhl3h86n5fbxi7@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

bd01ec1a

05 6月, 2014 4 次提交

x86/smpboot: Initialize secondary CPU only if master CPU will wait for it · 3e1a878b

由 Igor Mammedov 提交于 6月 05, 2014

Hang is observed on virtual machines during CPU hotplug,
especially in big guests with many CPUs. (It reproducible
more often if host is over-committed).

It happens because master CPU gives up waiting on
secondary CPU and allows it to run wild. As result
AP causes locking or crashing system. For example
as described here:

   https://lkml.org/lkml/2014/3/6/257

If master CPU have sent STARTUP IPI successfully,
and AP signalled to master CPU that it's ready
to start initialization, make master CPU wait
indefinitely till AP is onlined.
To ensure that AP won't ever run wild, make it
wait at early startup till master CPU confirms its
intention to wait for AP. If AP doesn't respond in 10
seconds, the master CPU will timeout and cancel
AP onlining.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Acked-by: NToshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-4-git-send-email-imammedo@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

3e1a878b

x86/smpboot: Log error on secondary CPU wakeup failure at ERR level · feef1e8e

由 Igor Mammedov 提交于 6月 05, 2014

If system is running without debug level logging,
it will not log error if do_boot_cpu() failed to
wakeup AP. It may lead to silent AP bringup
failures at boot time.
Change message level to KERN_ERR to make error
visible to user as it's done on other architectures.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Acked-by: NToshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-3-git-send-email-imammedo@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

feef1e8e

x86: Fix list/memory corruption on CPU hotplug · 89f898c1

由 Igor Mammedov 提交于 6月 05, 2014

currently if AP wake up is failed, master CPU marks AP as not
present in do_boot_cpu() by calling set_cpu_present(cpu, false).
That leads to following list corruption on the next physical CPU
hotplug:

[  418.107336] WARNING: CPU: 1 PID: 45 at lib/list_debug.c:33 __list_add+0xbe/0xd0()
[  418.115268] list_add corruption. prev->next should be next (ffff88003dc57600), but was ffff88003e20c3a0. (prev=ffff88003e20c3a0).
[  418.123693] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipt_REJECT cfg80211 xt_conntrack rfkill ee
[  418.138979] CPU: 1 PID: 45 Comm: kworker/u10:1 Not tainted 3.14.0-rc6+ #387
[  418.149989] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  418.165750] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  418.166433]  0000000000000021 ffff880038ca7988 ffffffff8159b22d 0000000000000021
[  418.176460]  ffff880038ca79d8 ffff880038ca79c8 ffffffff8106942c ffff880038ca79e8
[  418.177453]  ffff88003e20c3a0 ffff88003dc57600 ffff88003e20c3a0 00000000ffffffea
[  418.178445] Call Trace:
[  418.185811]  [<ffffffff8159b22d>] dump_stack+0x49/0x5c
[  418.186440]  [<ffffffff8106942c>] warn_slowpath_common+0x8c/0xc0
[  418.187192]  [<ffffffff81069516>] warn_slowpath_fmt+0x46/0x50
[  418.191231]  [<ffffffff8136ef51>] ? acpi_ns_get_node+0xb7/0xc7
[  418.193889]  [<ffffffff812f796e>] __list_add+0xbe/0xd0
[  418.196649]  [<ffffffff812e2aa9>] kobject_add_internal+0x79/0x200
[  418.208610]  [<ffffffff812e2e18>] kobject_add_varg+0x38/0x60
[  418.213831]  [<ffffffff812e2ef4>] kobject_add+0x44/0x70
[  418.229961]  [<ffffffff813e2c60>] device_add+0xd0/0x550
[  418.234991]  [<ffffffff813f0e95>] ? pm_runtime_init+0xe5/0xf0
[  418.250226]  [<ffffffff813e32be>] device_register+0x1e/0x30
[  418.255296]  [<ffffffff813e82a3>] register_cpu+0xe3/0x130
[  418.266539]  [<ffffffff81592be5>] arch_register_cpu+0x65/0x150
[  418.285845]  [<ffffffff81355c0d>] acpi_processor_hotadd_init+0x5a/0x9b
...
Which is caused by the fact that generic_processor_info() allocates
logical CPU id by calling:

 cpu = cpumask_next_zero(-1, cpu_present_mask);

which returns id of previously failed to wake up CPU, since its
bit is cleared by do_boot_cpu() and as result register_cpu()
tries to register another CPU with the same id as already
present but failed to be onlined CPU.

Taking in account that AP will not do anything if master CPU
failed to wake it up, there is no reason to mark that AP as not
present and break next cpu hotplug attempts. As a side effect of
not marking AP as not present, user would be allowed to online
it again later.

Also fix memory corruption in acpi_unmap_lsapic()

if during CPU hotplug master CPU failed to wake up AP
it set percpu x86_cpu_to_apicid to BAD_APICID=0xFFFF for AP.

However following attempt to unplug that CPU will lead to
out of bound write access to __apicid_to_node[] which is
32768 items long on x86_64 kernel.

So with above fix of cpu_present_mask make sure that a present
CPU has a valid APIC ID by not setting x86_cpu_to_apicid
to BAD_APICID in do_boot_cpu() on failure and allow
acpi_processor_remove()->acpi_unmap_lsapic() cleanly remove CPU.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Acked-by: NToshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-2-git-send-email-imammedo@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

89f898c1

uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates · 5cdb76d6

由 Oleg Nesterov 提交于 6月 01, 2014

Purely cosmetic, no changes in .o,

1. As Jim pointed out arch_uprobe->def looks ambiguous, rename it to
   ->defparam.

2. Add the comment into default_post_xol_op() to explain "regs->sp +=".

3. Remove the stale part of the comment in arch_uprobe_analyze_insn().
Suggested-by: NJim Keniston <jkenisto@us.ibm.com>
Reviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: NOleg Nesterov <oleg@redhat.com>

5cdb76d6

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功