提交 · bd7dc5a6afac719d8ce4092391eef2c7e83c2a75 · openanolis / cloud-kernel

02 11月, 2017 17 次提交

x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0() · bd7dc5a6

由 Andy Lutomirski 提交于 11月 02, 2017

This causes the MSR_IA32_SYSENTER_CS write to move out of the
paravirt callback.  This shouldn't affect Xen PV: Xen already ignores
MSR_IA32_SYSENTER_ESP writes.  In any event, Xen doesn't support
vm86() in a useful way.

Note to any potential backporters: This patch won't break lguest, as
lguest didn't have any SYSENTER support at all.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/75cf09fe03ae778532d0ca6c65aa58e66bc2f90c.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

bd7dc5a6

x86/entry/64: De-Xen-ify our NMI code · 929bacec

由 Andy Lutomirski 提交于 11月 02, 2017

Xen PV is fundamentally incompatible with our fancy NMI code: it
doesn't use IST at all, and Xen entries clobber two stack slots
below the hardware frame.

Drop Xen PV support from our NMI code entirely.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Acked-by: NJuergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bfbe711b5ae03f672f8848999a8eb2711efc7f98.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

929bacec

xen, x86/entry/64: Add xen NMI trap entry · 43e41110

由 Juergen Gross 提交于 11月 02, 2017

Instead of trying to execute any NMI via the bare metal's NMI trap
handler use a Xen specific one for PV domains, like we do for e.g.
debug traps. As in a PV domain the NMI is handled via the normal
kernel stack this is the correct thing to do.

This will enable us to get rid of the very fragile and questionable
dependencies between the bare metal NMI handler and Xen assumptions
believed to be broken anyway.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5baf5c0528d58402441550c5770b98e7961e7680.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

43e41110

x86/entry/64: Remove the RESTORE_..._REGS infrastructure · c39858de

由 Andy Lutomirski 提交于 11月 02, 2017

All users of RESTORE_EXTRA_REGS, RESTORE_C_REGS and such, and
REMOVE_PT_GPREGS_FROM_STACK are gone.  Delete the macros.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c32672f6e47c561893316d48e06c7656b1039a36.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

c39858de

x86/entry/64: Use POP instead of MOV to restore regs on NMI return · 471ee483

由 Andy Lutomirski 提交于 11月 02, 2017

This gets rid of the last user of the old RESTORE_..._REGS infrastructure.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/652a260f17a160789bc6a41d997f98249b73e2ab.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

471ee483

x86/entry/64: Merge the fast and slow SYSRET paths · a5122106

由 Andy Lutomirski 提交于 11月 02, 2017

They did almost the same thing.  Remove a bunch of pointless
instructions (mostly hidden in macros) and reduce cognitive load by
merging them.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1204e20233fcab9130a1ba80b3b1879b5db3fc1f.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

a5122106

x86/entry/64: Use pop instead of movq in syscall_return_via_sysret · 4fbb3910

由 Andy Lutomirski 提交于 11月 02, 2017

Saves 64 bytes.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6609b7f74ab31c36604ad746e019ea8495aec76c.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

4fbb3910

x86/entry/64: Shrink paranoid_exit_restore and make labels local · e5317832

由 Andy Lutomirski 提交于 11月 02, 2017

paranoid_exit_restore was a copy of restore_regs_and_return_to_kernel.
Merge them and make the paranoid_exit internal labels local.

Keeping .Lparanoid_exit makes the code a bit shorter because it
allows a 2-byte jnz instead of a 5-byte jnz.

Saves 96 bytes of text.

( This is still a bit suboptimal in a non-CONFIG_TRACE_IRQFLAGS
  kernel, but fixing that would make the code rather messy. )
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/510d66a1895cda9473c84b1086f0bb974f22de6a.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

e5317832

x86/entry/64: Simplify reg restore code in the standard IRET paths · e872045b

由 Andy Lutomirski 提交于 11月 02, 2017

The old code restored all the registers with movq instead of pop.

In theory, this was done because some CPUs have higher movq
throughput, but any gain there would be tiny and is almost certainly
outweighed by the higher text size.

This saves 96 bytes of text.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ad82520a207ccd851b04ba613f4f752b33ac05f7.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

e872045b

x86/entry/64: Move SWAPGS into the common IRET-to-usermode path · 8a055d7f

由 Andy Lutomirski 提交于 11月 02, 2017

All of the code paths that ended up doing IRET to usermode did
SWAPGS immediately beforehand.  Move the SWAPGS into the common
code.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/27fd6f45b7cd640de38fb9066fd0349bcd11f8e1.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

8a055d7f

x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths · 26c4ef9c

由 Andy Lutomirski 提交于 11月 02, 2017

These code paths will diverge soon.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/dccf8c7b3750199b4b30383c812d4e2931811509.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

26c4ef9c

x86/entry/64: Remove the restore_c_regs_and_iret label · 9da78ba6

由 Andy Lutomirski 提交于 11月 02, 2017

The only user was the 64-bit opportunistic SYSRET failure path, and
that path didn't really need it.  This change makes the
opportunistic SYSRET code a bit more straightforward and gets rid of
the label.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/be3006a7ad3326e3458cf1cc55d416252cbe1986.1509609304.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

9da78ba6

Merge branch 'x86/fpu' into x86/asm · 50da9d43

由 Ingo Molnar 提交于 11月 02, 2017

We are about to commit complex rework of various x86 entry code details - create
a unified base tree (with FPU commits included) before doing that.
Signed-off-by: NIngo Molnar <mingo@kernel.org>

50da9d43

Merge branch 'x86/mpx/prep' into x86/asm · 3357b0d3

由 Ingo Molnar 提交于 11月 02, 2017

Pick up some of the MPX commits that modify the syscall entry code,
to have a common base and to reduce conflicts.
Signed-off-by: NIngo Molnar <mingo@kernel.org>

3357b0d3

ptrace,x86: Make user_64bit_mode() available to 32-bit builds · e27c310a

由 Ricardo Neri 提交于 10月 27, 2017

In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
is selected. This implies that code built with CONFIG_X86_64=n cannot use
it. If a piece of code needs to be built for both CONFIG_X86_64=y and
CONFIG_X86_64=n and wants to use this function, it needs to wrap it in
an #ifdef/#endif; potentially, in multiple places.

This can be easily avoided with a single #ifdef/#endif pair within
user_64bit_mode() itself.
Suggested-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-4-git-send-email-ricardo.neri-calderon@linux.intel.com

e27c310a

x86/boot: Relocate definition of the initial state of CR0 · b0ce5b8c

由 Ricardo Neri 提交于 10月 27, 2017

Both head_32.S and head_64.S utilize the same value to initialize the
control register CR0. Also, other parts of the kernel might want to access
this initial definition (e.g., emulation code for User-Mode Instruction
Prevention uses this state to provide a sane dummy value for CR0 when
emulating the smsw instruction). Thus, relocate this definition to a
header file from which it can be conveniently accessed.
Suggested-by: NBorislav Petkov <bp@alien8.de>
Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NAndy Lutomirski <luto@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: linux-mm@kvack.org
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-arch@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-3-git-send-email-ricardo.neri-calderon@linux.intel.com

b0ce5b8c

x86/mm: Relocate page fault error codes to traps.h · 1067f030

由 Ricardo Neri 提交于 10月 27, 2017

Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within such file. Other portions of
code might be interested in those definitions too. For instance, the User-
Mode Instruction Prevention emulation code will use such definitions to
emulate a page fault when it is unable to successfully copy the results
of the emulated instructions to user space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NAndy Lutomirski <luto@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/1509135945-13762-2-git-send-email-ricardo.neri-calderon@linux.intel.com

1067f030

01 11月, 2017 1 次提交

RDMA/nldev: Enforce device index check for port callback · 287683d0

由 Leon Romanovsky 提交于 10月 31, 2017

IB device index is nldev's handler and it should be checked always.

Fixes: c3f66f7b ("RDMA/netlink: Implement nldev port doit callback")
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Acked-by: NDoug Ledford <dledford@redhat.com>
[ Applying directly, since Doug fried his SSD's and is rebuilding  - Linus ]
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

287683d0

31 10月, 2017 5 次提交

x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features · c128dbfa

由 Gayatri Kammela 提交于 10月 30, 2017

Add a few new SSE/AVX/AVX512 instruction groups/features for enumeration
in /proc/cpuinfo: AVX512_VBMI2, GFNI, VAES, VPCLMULQDQ, AVX512_VNNI,
AVX512_BITALG.

 CPUID.(EAX=7,ECX=0):ECX[bit 6]  AVX512_VBMI2
 CPUID.(EAX=7,ECX=0):ECX[bit 8]  GFNI
 CPUID.(EAX=7,ECX=0):ECX[bit 9]  VAES
 CPUID.(EAX=7,ECX=0):ECX[bit 10] VPCLMULQDQ
 CPUID.(EAX=7,ECX=0):ECX[bit 11] AVX512_VNNI
 CPUID.(EAX=7,ECX=0):ECX[bit 12] AVX512_BITALG

Detailed information of CPUID bits for these features can be found
in the Intel Architecture Instruction Set Extensions and Future Features
Programming Interface document (refer to Table 1-1. and Table 1-2.).
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=197239Signed-off-by: NGayatri Kammela <gayatri.kammela@intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Yang Zhong <yang.zhong@intel.com>
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1509412829-23380-1-git-send-email-gayatri.kammela@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c128dbfa

Merge tag 'pm-urgent-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 5f479447

由 Linus Torvalds 提交于 10月 30, 2017

Pull power management fix from Rafael Wysocki:
 "This fixes new breakage introduced by the most recent PM QoS fix in
  which, embarrassingly enough, I forgot to update
  dev_pm_qos_raw_read_value() to return the right default for devices
  with no PM QoS constraints at all which prevents runtime PM from
  suspending those devices (fix from Tero Kristo)"

* tag 'pm-urgent-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / QoS: Fix default runtime_pm device resume latency

5f479447

Mark 'ioremap_page_range()' as possibly sleeping · b39ab98e

由 Linus Torvalds 提交于 10月 30, 2017

It turns out that some drivers seem to think it's ok to remap page
ranges from within interrupts and even NMI's.  That is definitely not
the case, since the page table build-up is simply not interrupt-safe.

This showed up in the zero-day robot that reported it for the ACPI APEI
GHES ("Generic Hardware Error Source") driver.  Normally it had been
hidden by the fact that no page table operations had been needed because
the vmalloc area had been set up by other things.

Apparently due to a recent change to the GHEI driver: commit
77b246b3 ("acpi: apei: check for pending errors when probing GHES
entries") 0day actually caught a case during bootup whenthe ioremap
called down to page allocation.  But that recent change only showed the
symptom, it wasn't the root cause of the problem.

Hopefully it is limited to just that one driver.

If you need to access random physical memory, you either need to ioremap
in process context, or you need to use the FIXMAP facility to set one
particular fixmap entry to the required mapping - that can be done safely.

Cc: Borislav Petkov <bp@suse.de>
Cc: Len Brown <lenb@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b39ab98e

Merge tag 'mmc-v4.14-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · daea3daa

由 Linus Torvalds 提交于 10月 30, 2017

Pull MMC fixes from Ulf Hansson:
 "A couple of MMC host fixes intended for v4.14-rc8:

   - renesas_sdhi: fix kernel panic
   - tmio: fix swiotlb buffer is full"

* tag 'mmc-v4.14-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: renesas_sdhi: fix kernel panic in _internal_dmac.c
  mmc: tmio: fix swiotlb buffer is full

daea3daa

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 1960e8ea

由 Linus Torvalds 提交于 10月 30, 2017

Pull crypto fix from Herbert Xu:
 "This fixes an objtool regression"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/chacha20 - satisfy stack validation 2.0

1960e8ea

30 10月, 2017 2 次提交

PM / QoS: Fix default runtime_pm device resume latency · 2a9a86d5

由 Tero Kristo 提交于 10月 30, 2017

The recent change to the PM QoS framework to introduce a proper
no constraint value overlooked to handle the devices which don't
implement PM QoS OPS.  Runtime PM is one of the more severely
impacted subsystems, failing every attempt to runtime suspend
a device.  This leads into some nasty second level issues like
probe failures and increased power consumption among other
things.

Fix this by adding a proper return value for devices that don't
implement PM QoS.

Fixes: 0cc2b4e5 (PM / QoS: Fix device resume latency PM QoS)
Signed-off-by: NTero Kristo <t-kristo@ti.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

2a9a86d5

L

Linux 4.14-rc7 · 0b07194b
由 Linus Torvalds 提交于 10月 29, 2017

0b07194b

29 10月, 2017 15 次提交

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 19e12196

由 Linus Torvalds 提交于 10月 29, 2017

Pull networking fixes from David Miller:

 1) Fix route leak in xfrm_bundle_create().

 2) In mac80211, validate user rate mask before configuring it. From
    Johannes Berg.

 3) Properly enforce memory limits in fair queueing code, from Toke
    Hoiland-Jorgensen.

 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.

 5) Fix TSO header allocation and management in mvpp2 driver, from Yan
    Markman.

 6) Don't take socket lock in BH handler in strparser code, from Tom
    Herbert.

 7) Don't show sockets from other namespaces in AF_UNIX code, from
    Andrei Vagin.

 8) Fix double free in error path of tap_open(), from Girish Moodalbail.

 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
    and Alexander Duyck.

10) Fix DCB mode programming in stmmac driver, from Jose Abreu.

11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
    Long.

12) Properly align SKB head before building SKB in tuntap, from Jason
    Wang.

13) Avoid matching qdiscs with a zero handle during lookups, from Cong
    Wang.

14) Fix various endianness bugs in sctp, from Xin Long.

15) Fix tc filter callback races and add selftests which trigger the
    problem, from Cong Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  selftests: Introduce a new test case to tc testsuite
  selftests: Introduce a new script to generate tc batch file
  net_sched: fix call_rcu() race on act_sample module removal
  net_sched: add rtnl assertion to tcf_exts_destroy()
  net_sched: use tcf_queue_work() in tcindex filter
  net_sched: use tcf_queue_work() in rsvp filter
  net_sched: use tcf_queue_work() in route filter
  net_sched: use tcf_queue_work() in u32 filter
  net_sched: use tcf_queue_work() in matchall filter
  net_sched: use tcf_queue_work() in fw filter
  net_sched: use tcf_queue_work() in flower filter
  net_sched: use tcf_queue_work() in flow filter
  net_sched: use tcf_queue_work() in cgroup filter
  net_sched: use tcf_queue_work() in bpf filter
  net_sched: use tcf_queue_work() in basic filter
  net_sched: introduce a workqueue for RCU callbacks of tc filter
  sctp: fix some type cast warnings introduced since very beginning
  sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
  sctp: fix some type cast warnings introduced by transport rhashtable
  sctp: fix some type cast warnings introduced by stream reconf
  ...

19e12196

Merge branch 'net_sched-fix-races-with-RCU-callbacks' · 6c325f4e

由 David S. Miller 提交于 10月 29, 2017

Cong Wang says:

====================
net_sched: fix races with RCU callbacks

Recently, the RCU callbacks used in TC filters and TC actions keep
drawing my attention, they introduce at least 4 race condition bugs:

1. A simple one fixed by Daniel:

commit c78e1746
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Wed May 20 17:13:33 2015 +0200

    net: sched: fix call_rcu() race on classifier module unloads

2. A very nasty one fixed by me:

commit 1697c4bb
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Mon Sep 11 16:33:32 2017 -0700

    net_sched: carefully handle tcf_block_put()

3. Two more bugs found by Chris:
https://patchwork.ozlabs.org/patch/826696/
https://patchwork.ozlabs.org/patch/826695/

Usually RCU callbacks are simple, however for TC filters and actions,
they are complex because at least TC actions could be destroyed
together with the TC filter in one callback. And RCU callbacks are
invoked in BH context, without locking they are parallel too. All of
these contribute to the cause of these nasty bugs.

Alternatively, we could also:

a) Introduce a spinlock to serialize these RCU callbacks. But as I
said in commit 1697c4bb ("net_sched: carefully handle
tcf_block_put()"), it is very hard to do because of tcf_chain_dump().
Potentially we need to do a lot of work to make it possible (if not
impossible).

b) Just get rid of these RCU callbacks, because they are not
necessary at all, callers of these call_rcu() are all on slow paths
and holding RTNL lock, so blocking is allowed in their contexts.
However, David and Eric dislike adding synchronize_rcu() here.

As suggested by Paul, we could defer the work to a workqueue and
gain the permission of holding RTNL again without any performance
impact, however, in tcf_block_put() we could have a deadlock when
flushing workqueue while hodling RTNL lock, the trick here is to
defer the work itself in workqueue and make it queued after all
other works so that we keep the same ordering to avoid any
use-after-free. Please see the first patch for details.

Patch 1 introduces the infrastructure, patch 2~12 move each
tc filter to the new tc filter workqueue, patch 13 adds
an assertion to catch potential bugs like this, patch 14
closes another rcu callback race, patch 15 and patch 16 add
new test cases.
====================
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6c325f4e

selftests: Introduce a new test case to tc testsuite · 31c2611b

由 Chris Mi 提交于 10月 26, 2017

In this patchset, we fixed a tc bug. This patch adds the test case
that reproduces the bug. To run this test case, user should specify
an existing NIC device:
  # sudo ./tdc.py -d enp4s0f0

This test case belongs to category "flower". If user doesn't specify
a NIC device, the test cases belong to "flower" will not be run.

In this test case, we create 1M filters and all filters share the same
action. When destroying all filters, kernel should not panic. It takes
about 18s to run it.
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Acked-by: NLucas Bates <lucasb@mojatatu.com>
Signed-off-by: NChris Mi <chrism@mellanox.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

31c2611b

selftests: Introduce a new script to generate tc batch file · 7f071998

由 Chris Mi 提交于 10月 26, 2017

  # ./tdc_batch.py -h
  usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file

  TC batch file generator

  positional arguments:
    device                device name
    file                  batch file name

  optional arguments:
    -h, --help            show this help message and exit
    -n NUMBER, --number NUMBER
                          how many lines in batch file
    -o, --skip_sw         skip_sw (offload), by default skip_hw
    -s, --share_action    all filters share the same action
    -p, --prio            all filters have different prio
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Acked-by: NLucas Bates <lucasb@mojatatu.com>
Signed-off-by: NChris Mi <chrism@mellanox.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7f071998

net_sched: fix call_rcu() race on act_sample module removal · 46e235c1

由 Cong Wang 提交于 10月 26, 2017

Similar to commit c78e1746
("net: sched: fix call_rcu() race on classifier module unloads"),
we need to wait for flying RCU callback tcf_sample_cleanup_rcu().

Cc: Yotam Gigi <yotamg@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

46e235c1

net_sched: add rtnl assertion to tcf_exts_destroy() · 2d132eba

由 Cong Wang 提交于 10月 26, 2017

After previous patches, it is now safe to claim that
tcf_exts_destroy() is always called with RTNL lock.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d132eba

net_sched: use tcf_queue_work() in tcindex filter · 27ce4f05

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

27ce4f05

net_sched: use tcf_queue_work() in rsvp filter · d4f84a41

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4f84a41

net_sched: use tcf_queue_work() in route filter · c2f3f31d

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2f3f31d

net_sched: use tcf_queue_work() in u32 filter · c0d378ef

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c0d378ef

net_sched: use tcf_queue_work() in matchall filter · df2735ee

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

df2735ee

net_sched: use tcf_queue_work() in fw filter · e071dff2

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e071dff2

net_sched: use tcf_queue_work() in flower filter · 0552c8af

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0552c8af

net_sched: use tcf_queue_work() in flow filter · 94cdb475

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

94cdb475

net_sched: use tcf_queue_work() in cgroup filter · b1b5b04f

由 Cong Wang 提交于 10月 26, 2017

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: NChris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1b5b04f

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功