Commits · 4983e5d74c821780d518232eea4acdc4a8f0b44d · openeuler / Kernel

11 Jun, 2020 (11 commits)

x86/entry: Move irq flags tracing to prepare_exit_to_usermode() · 4983e5d7

Authored by Thomas Gleixner on Mar 04, 2020

This is another step towards more C-code and less convoluted ASM.

Similar to the entry path, invoke the tracer before context tracking which
might turn off RCU and invoke lockdep as the last step before going back to
user space. Annotate the code sections in exit_to_user_mode() accordingly
so objtool won't complain about the tracer invocation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505134340.703783926@linutronix.de

4983e5d7

x86/entry: Move irq tracing on syscall entry to C-code · dd8e2d9a

Authored by Thomas Gleixner on Feb 25, 2020

Now that the C entry points are safe, move the irq flags tracing code into
the entry helper:

    - Invoke lockdep before calling into context tracking

    - Use the safe trace_hardirqs_on_prepare() trace function after context
      tracking established state and RCU is watching.

enter_from_user_mode() is also still invoked from the exception/interrupt
entry code which still contains the ASM irq flags tracing. So this is just
a redundant and harmless invocation of tracing / lockdep until these are
removed as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.611961721@linutronix.de

dd8e2d9a
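
The ordering described in the commit above (lockdep first, then context tracking, then the tracer) can be sketched in plain userspace C. This is only an illustration: the `stub_*` functions, the `sketch_step` enum and the `call_log` array are hypothetical stand-ins rather than kernel APIs, and the sketch merely records the order in which the steps run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step identifiers; these are not kernel symbols. */
enum sketch_step {
	STEP_LOCKDEP,
	STEP_CONTEXT_TRACKING,
	STEP_TRACE,
};

/* Small log so the call order can be inspected afterwards. */
#define LOG_MAX 8
enum sketch_step call_log[LOG_MAX];
size_t call_log_len;

static void log_step(enum sketch_step step)
{
	assert(call_log_len < LOG_MAX);
	call_log[call_log_len++] = step;
}

/* Stand-ins for the three steps named in the commit message. */
static void stub_lockdep(void)          { log_step(STEP_LOCKDEP); }
static void stub_context_tracking(void) { log_step(STEP_CONTEXT_TRACKING); }
static void stub_trace(void)            { log_step(STEP_TRACE); }

/*
 * Sketch of the entry helper ordering: lockdep runs before context
 * tracking, and the tracer runs only after context tracking has
 * established state (the point at which RCU is watching).
 */
void sketch_enter_from_user_mode(void)
{
	call_log_len = 0;
	stub_lockdep();
	stub_context_tracking();
	stub_trace();
}
```

Calling `sketch_enter_from_user_mode()` and then reading `call_log` shows the three steps in the order the commit message lists them.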

x86/entry/common: Protect against instrumentation · 8f159f1d

Authored by Thomas Gleixner on Mar 10, 2020

Mark the various syscall entries with noinstr to protect them against
instrumentation and add the instrumentation_begin()/end() annotations to mark the
parts of the functions which are safe to call out into instrumentable code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.520277507@linutronix.de

8f159f1d

x86/entry: Mark enter_from_user_mode() noinstr · 1723be30

Authored by Thomas Gleixner on Feb 29, 2020

Both the callers in the low level ASM code and __context_tracking_exit()
which is invoked from enter_from_user_mode() via user_exit_irqoff() are
marked NOKPROBE. Allowing enter_from_user_mode() to be probed is
inconsistent at best.

Aside from that, while function tracing per se is safe, the function trace
entry/exit points can be used via BPF as well, which is not safe to use
before context tracking has reached CONTEXT_KERNEL and adjusted RCU.

Mark it noinstr which moves it into the instrumentation protected text
section and includes notrace.

Note, this needs further fixups in context tracking to ensure that the
full call chain is protected. Will be addressed in follow up changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.429059405@linutronix.de

1723be30

x86/entry/32: Move non entry code into .text section · 8c0fa8a0

Authored by Thomas Gleixner on Mar 25, 2020

All ASM code which is not part of the entry functionality can move out into
the .text section. No reason to keep it in the non-instrumentable entry
section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.320164650@linutronix.de

8c0fa8a0

x86/entry/64: Move non entry code into .text section · b9f6976b

Authored by Thomas Gleixner on Mar 25, 2020

All ASM code which is not part of the entry functionality can move out into
the .text section. No reason to keep it in the non-instrumentable entry
section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.227579223@linutronix.de

b9f6976b

x86/entry: Exclude low level entry code from sanitizing · 20355e5f

Authored by Peter Zijlstra on Mar 05, 2020

The sanitizers are not really applicable to the fragile low level entry
code. Entry code needs to carefully set up a normal 'runtime' environment.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505134059.970057117@linutronix.de

20355e5f

x86/entry: Remove the unused LOCKDEP_SYSEXIT cruft · 44d7e4fb

Authored by Thomas Gleixner on Mar 05, 2020

There have been no users for two years, ever since commit 21d375b6
("x86/entry/64: Remove the SYSCALL64 fast path").
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134059.061301403@linutronix.de

44d7e4fb

x86/entry/64: Avoid pointless code when CONTEXT_TRACKING=n · 72500589

Authored by Thomas Gleixner on Feb 25, 2020

GAS cannot optimize out the test and conditional jump when context tracking
is disabled and CALL_enter_from_user_mode is an empty macro.

Wrap it in #ifdeffery. Will go away once all this is moved to C.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134058.955968069@linutronix.de

72500589

x86/entry/64: Remove unneeded kernel CR3 switching · c7589070

Authored by Lai Jiangshan on Apr 19, 2020

When native_load_gs_index() fails on .Lgs_change, CR3 must be kernel
CR3. No need to switch it.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200419144049.1906-2-laijs@linux.alibaba.com

c7589070

x86/entry/64: Remove an unused label · 26fa1263

Authored by Lai Jiangshan on Apr 19, 2020

The label .Lcommon_\sym was introduced by 39e95433
("x86-64: Reduce amount of redundant code generated for invalidate_interruptNN"),
and all the other relevant information was removed by 52aec330
("x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR").
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200419144049.1906-4-laijs@linux.alibaba.com

26fa1263

10 Jun, 2020 (1 commit)

mmap locking API: use coccinelle to convert mmap_sem rwsem call sites · d8ed45c5

Authored by Michel Lespinasse on Jun 08, 2020

This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)
Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d8ed45c5
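
The effect of the coccinelle rule above is that callers pass the mm itself and no longer name the `mmap_sem` field. A minimal userspace sketch of that wrapper pattern follows. Everything in it is an assumption made for illustration: `toy_rwsem` is a single-threaded state tracker rather than a real lock, and the `toy_*` names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded stand-in for an rwsem: it only tracks state. */
struct toy_rwsem {
	int readers;  /* number of current read holders */
	bool writer;  /* true while write-held */
};

/* Toy stand-in for struct mm_struct. */
struct toy_mm {
	struct toy_rwsem mmap_sem;
};

/* "Old style" primitives that operate on the semaphore directly. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	assert(sem->writer);
	sem->writer = false;
}

/* "New style" wrappers: the lock field is an implementation detail. */
bool toy_mmap_read_trylock(struct toy_mm *mm)
{
	return toy_down_read_trylock(&mm->mmap_sem);
}

void toy_mmap_read_unlock(struct toy_mm *mm)
{
	toy_up_read(&mm->mmap_sem);
}

bool toy_mmap_write_trylock(struct toy_mm *mm)
{
	return toy_down_write_trylock(&mm->mmap_sem);
}

void toy_mmap_write_unlock(struct toy_mm *mm)
{
	toy_up_write(&mm->mmap_sem);
}
```

Because callers only see the wrappers, the underlying lock could later be renamed or replaced without touching every call site, which is presumably the point of routing everything through one API.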

19 May, 2020 (1 commit)

x86/kvm: Handle async page faults directly through do_page_fault() · ef68017e

Authored by Andy Lutomirski on Feb 28, 2020

KVM overloads #PF to indicate two types of not-actually-page-fault
events.  Right now, the KVM guest code intercepts them by modifying
the IDT and hooking the #PF vector.  This makes the already fragile
fault code even harder to understand, and it also pollutes call
traces with async_page_fault and do_async_page_fault for normal page
faults.

Clean it up by moving the logic into do_page_fault() using a static
branch.  This gets rid of the platform trap_init override mechanism
completely.

[ tglx: Fixed up 32bit, removed error code from the async functions and
  	massaged coding style ]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134059.169270470@linutronix.de

ef68017e

14 May, 2020 (1 commit)

vfs: add faccessat2 syscall · c8ffd8bc

Authored by Miklos Szeredi on May 14, 2020

POSIX defines faccessat() as having a fourth "flags" argument, while the
linux syscall doesn't have it.  Glibc tries to emulate AT_EACCESS and
AT_SYMLINK_NOFOLLOW, but AT_EACCESS emulation is broken.

Add a new faccessat(2) syscall with the added flags argument and implement
both flags.

The value of AT_EACCESS is defined in glibc headers to be the same as
AT_REMOVEDIR.  Use this value for the kernel interface as well, together
with the explanatory comment.

Also add AT_EMPTY_PATH support, which is not documented by POSIX, but can
be useful and is trivial to implement.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

c8ffd8bc
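
From userspace, the flags argument discussed above is reached through the libc `faccessat()` wrapper. The sketch below checks read access using the effective IDs via `AT_EACCESS`. The helper name `readable_with_effective_ids` is hypothetical, and whether libc issues the new syscall or falls back to emulation depends on the libc and kernel versions in use, which this sketch does not try to detect.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>   /* AT_FDCWD, AT_EACCESS */
#include <stddef.h>
#include <unistd.h>  /* faccessat(), R_OK */

/*
 * Return 1 if `path` is readable when checked against the effective
 * user and group IDs, 0 otherwise (including when the path is missing).
 */
int readable_with_effective_ids(const char *path)
{
	assert(path != NULL);
	return faccessat(AT_FDCWD, path, R_OK, AT_EACCESS) == 0;
}
```

Without `AT_EACCESS`, the same call would check against the real IDs instead, which only differs for set-user-ID or set-group-ID programs.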

01 May, 2020 (1 commit)

x86: Change {JMP,CALL}_NOSPEC argument · 34fdce69

Authored by Peter Zijlstra on Apr 22, 2020

In order to change the {JMP,CALL}_NOSPEC macros to call out-of-line
versions of the retpoline magic, we need to remove the '%' from the
argument, such that we can paste it onto symbol names.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200428191700.151623523@infradead.org

34fdce69

25 Apr, 2020 (5 commits)

x86/unwind/orc: Fix premature unwind stoppage due to IRET frames · 81b67439

Authored by Josh Poimboeuf on Apr 25, 2020

The following execution path is possible:

  fsnotify()
    [ realign the stack and store previous SP in R10 ]
    <IRQ>
      [ only IRET regs saved ]
      common_interrupt()
        interrupt_entry()
          <NMI>
            [ full pt_regs saved ]
            ...
            [ unwind stack ]

When the unwinder goes through the NMI and the IRQ on the stack, and
then sees fsnotify(), it doesn't have access to the value of R10,
because it only has the five IRET registers.  So the unwind stops
prematurely.

However, because the interrupt_entry() code is careful not to clobber
R10 before saving the full regs, the unwinder should be able to read R10
from the previously saved full pt_regs associated with the NMI.

Handle this case properly.  When encountering an IRET regs frame
immediately after a full pt_regs frame, use the pt_regs as a backup
which can be used to get the C register values.

Also, note that a call frame resets the 'prev_regs' value, because a
function is free to clobber the registers.  For this fix to work, the
IRET and full regs frames must be adjacent, with no FUNC frames in
between.  So replace the FUNC hint in interrupt_entry() with an
IRET_REGS hint.

Fixes: ee9f8fce ("x86/unwind: Add the ORC unwinder")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/97a408167cc09f1cfa0de31a7b70dd88868d743f.1587808742.git.jpoimboe@redhat.com

81b67439

x86/entry/64: Fix unwind hints in rewind_stack_do_exit() · f977df7b

Authored by Jann Horn on Apr 25, 2020

The LEAQ instruction in rewind_stack_do_exit() moves the stack pointer
directly below the pt_regs at the top of the task stack before calling
do_exit(). Tell the unwinder to expect pt_regs.

Fixes: 8c1f7558 ("x86/entry/64: Add unwind hint annotations")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/68c33e17ae5963854916a46f522624f8e1d264f2.1587808742.git.jpoimboe@redhat.com

f977df7b

x86/entry/64: Fix unwind hints in __switch_to_asm() · 96c64806

Authored by Josh Poimboeuf on Apr 25, 2020

UNWIND_HINT_FUNC has some limitations: specifically, it doesn't reset
all the registers to undefined.  This causes objtool to get confused
about the RBP push in __switch_to_asm(), resulting in bad ORC data.

While __switch_to_asm() does do some stack magic, it's otherwise a
normal callable-from-C function, so just annotate it as a function,
which makes objtool happy and allows it to produce the correct hints
automatically.

Fixes: 8c1f7558 ("x86/entry/64: Add unwind hint annotations")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/03d0411920d10f7418f2e909210d8e9a3b2ab081.1587808742.git.jpoimboe@redhat.com

96c64806

x86/entry/64: Fix unwind hints in kernel exit path · 1fb14363

Authored by Josh Poimboeuf on Apr 25, 2020

In swapgs_restore_regs_and_return_to_usermode, after the stack is
switched to the trampoline stack, the existing UNWIND_HINT_REGS hint is
no longer valid, which can result in the following ORC unwinder warning:

  WARNING: can't dereference registers at 000000003aeb0cdd for ip swapgs_restore_regs_and_return_to_usermode+0x93/0xa0

For full correctness, we could try to add complicated unwind hints so
the unwinder could continue to find the registers, but when it's
this close to kernel exit, unwind hints aren't really needed anymore and
it's fine to just use an empty hint which tells the unwinder to stop.

For consistency, also move the UNWIND_HINT_EMPTY in
entry_SYSCALL_64_after_hwframe to a similar location.

Fixes: 3e3b9293 ("x86/entry/64: Return to userspace from the trampoline stack")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Dave Jones <dsj@fb.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Joe Mario <jmario@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/60ea8f562987ed2d9ace2977502fe481c0d7c9a0.1587808742.git.jpoimboe@redhat.com

1fb14363

x86/entry/64: Fix unwind hints in register clearing code · 06a9750e

Authored by Josh Poimboeuf on Apr 25, 2020

The PUSH_AND_CLEAR_REGS macro zeroes each register immediately after
pushing it.  If an NMI or exception hits after a register is cleared,
but before the UNWIND_HINT_REGS annotation, the ORC unwinder will
wrongly think the previous value of the register was zero.  This can
confuse the unwinding process and cause it to exit early.

Because ORC is simpler than DWARF, there are a limited number of unwind
annotation states, so it's not possible to add an individual unwind hint
after each push/clear combination.  Instead, the register clearing
instructions need to be consolidated and moved to after the
UNWIND_HINT_REGS annotation.

Fixes: 3f01daec ("x86/entry/64: Introduce the PUSH_AND_CLEAN_REGS macro")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/68fd3d0bc92ae2d62ff7879d15d3684217d51f08.1587808742.git.jpoimboe@redhat.com

06a9750e

22 Apr, 2020 (3 commits)

x86/vdso/Makefile: Add vobjs32 · cd2f45b7

Authored by Dmitry Safonov on Apr 20, 2020

Treat ia32/i386 objects in an array the same as 64-bit vdso objects.
Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200420183256.660371-5-dima@arista.com

cd2f45b7

x86/vdso/vdso2c: Convert iterators to unsigned · 833e55bb

Authored by Dmitry Safonov on Apr 20, 2020

`i` and `j` are used everywhere with unsigned types.

Convert `i` to unsigned long in order to avoid signed to unsigned
comparisons.  Convert `k` to unsigned int with the same purpose.
Also, drop `j` as `i` could be used in place of it.
Introduce syms_nr for readability.
Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200420183256.660371-4-dima@arista.com

833e55bb
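
The signed-to-unsigned comparison that the commit above avoids can be illustrated with a small standalone sketch. It is not taken from vdso2c; `mixed_less_than` and `count_nonzero` are hypothetical helpers. In C, comparing an `int` with an `unsigned long` converts the signed operand first, which the explicit cast below makes visible.

```c
#include <assert.h>
#include <stddef.h>

/*
 * What a mixed comparison effectively does: the signed index is
 * converted to the unsigned type first, so -1 becomes a huge value
 * and the comparison against a small bound is false.
 */
int mixed_less_than(int i, unsigned long n)
{
	return (unsigned long)i < n;
}

/* With an unsigned iterator there is no conversion to reason about. */
unsigned long count_nonzero(const unsigned char *buf, unsigned long n)
{
	unsigned long i, hits = 0;

	assert(buf != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (buf[i])
			hits++;
	return hits;
}
```

Keeping the iterator and the bound in the same unsigned type, as in `count_nonzero`, sidesteps the surprise shown by `mixed_less_than(-1, 4)`.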

x86/vdso/vdso2c: Correct error messages on file open · 089ef557

Authored by Dmitry Safonov on Apr 20, 2020

err() message in main() is misleading: it should print `outfilename`,
which is argv[3], not argv[2].

Correct error messages to be more precise about what failed and for
which file.
Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200420183256.660371-2-dima@arista.com

089ef557
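
The idea of reporting both what failed and for which file can be sketched as follows. This is not the vdso2c code: the helper `open_or_describe`, its `role` parameter and the message format are assumptions made for illustration, and the sketch formats into a caller-supplied buffer with `snprintf()` so the message can be inspected.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Try to open `path`. On failure, write a message into `errbuf` that
 * names the role of the file ("input", "output", ...), the path that
 * was actually used, and the reason. Returns the stream or NULL.
 */
FILE *open_or_describe(const char *path, const char *mode, const char *role,
		       char *errbuf, size_t errlen)
{
	FILE *f;

	assert(errbuf != NULL && errlen > 0);
	f = fopen(path, mode);
	if (!f)
		snprintf(errbuf, errlen, "failed to open %s file '%s': %s",
			 role, path, strerror(errno));
	return f;
}
```

Passing the same variable to both the open call and the message avoids the kind of mismatch the commit fixes, where the message printed a different argument than the one that failed.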

14 Apr, 2020 (1 commit)

x86/32: Remove CONFIG_DOUBLEFAULT · 59330942

Authored by Borislav Petkov on Apr 04, 2020

Make the doublefault exception handler unconditional on 32-bit. Yes,
it is important to be able to catch #DF exceptions instead of silent
reboots. Yes, the code size increase is worth every byte. And one less
CONFIG symbol is just the cherry on top.

No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200404083646.8897-1-bp@alien8.de

59330942

08 Apr, 2020 (1 commit)

sparc,x86: vdso: remove meaningless undefining CONFIG_OPTIMIZE_INLINING · 12a5b00a

Authored by Masahiro Yamada on Apr 06, 2020

The code, #undef CONFIG_OPTIMIZE_INLINING, is not working as expected
because <linux/compiler_types.h> is parsed before vclock_gettime.c since
28128c61 ("kconfig.h: Include compiler types to avoid missed struct
attributes").

Since then, <linux/compiler_types.h> is included really early by using the
'-include' option.  So, you cannot negate the decision of
<linux/compiler_types.h> in this way.

You can confirm it by checking the pre-processed code, like this:

  $ make arch/x86/entry/vdso/vdso32/vclock_gettime.i

There is no difference with/without CONFIG_CC_OPTIMIZE_FOR_SIZE.

It is about two years since 28128c61.  Nobody has reported a problem
(or, nobody has even noticed the fact that this code is not working).

It is ugly and unreliable to attempt to undefine a CONFIG option from C
files, and anyway the inlining heuristic is up to the compiler.

Just remove the broken code.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20200220110807.32534-1-masahiroy@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12a5b00a

27 Mar, 2020 (1 commit)

x86/vdso: Discard .note.gnu.property sections in vDSO · 4caffe6a

Authored by H.J. Lu on Mar 26, 2020

With the command-line option -mx86-used-note=yes which can also be
enabled at binutils build time with:

  --enable-x86-used-note  generate GNU x86 used ISA and feature properties

the x86 assembler in binutils 2.32 and above generates a program property
note in a note section, .note.gnu.property, to encode used x86 ISAs and
features.  But the kernel linker script only contains a single NOTE segment:

  PHDRS
  {
   text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
   dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
   note PT_NOTE FLAGS(4); /* PF_R */
   eh_frame_hdr 0x6474e550;
  }

The NOTE segment generated by the vDSO linker script is aligned to 4 bytes.
But the .note.gnu.property section must be aligned to 8 bytes on x86-64:

  [hjl@gnu-skx-1 vdso]$ readelf -n vdso64.so

  Displaying notes found in: .note
    Owner                Data size 	Description
    Linux                0x00000004	Unknown note type: (0x00000000)
     description data: 06 00 00 00
  readelf: Warning: note with invalid namesz and/or descsz found at offset 0x20
  readelf: Warning:  type: 0x78, namesize: 0x00000100, descsize: 0x756e694c, alignment: 8

Since the .note.gnu.property section in the vDSO is not checked by the
dynamic linker, discard the .note.gnu.property sections in the vDSO.

 [ bp: Massage. ]
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200326174314.254662-1-hjl.tools@gmail.com

4caffe6a

25 Mar, 2020 (1 commit)

.gitignore: add SPDX License Identifier · d198b34f

Authored by Masahiro Yamada on Mar 03, 2020

Add SPDX License Identifier to all .gitignore files.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d198b34f

21 Mar, 2020 (13 commits)

x86/entry: Rename ___preempt_schedule · 46db36ab

Authored by Peter Zijlstra on Mar 20, 2020

Because moar '_' isn't always moar readable.

git grep -l "___preempt_schedule\(_notrace\)*" | while read file;
do
	sed -ie 's/___preempt_schedule\(_notrace\)*/preempt_schedule\1_thunk/g' $file;
done
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200320115858.995685950@infradead.org

46db36ab

x86/entry: Drop asmlinkage from syscalls · 0f78ff17

Authored by Brian Gerst on Mar 13, 2020

asmlinkage is no longer required since the syscall ABI is now fully under
x86 architecture control. This makes the 32-bit native syscalls a bit more
efficient by passing in regs via EAX instead of on the stack.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200313195144.164260-18-brgerst@gmail.com

0f78ff17

x86/entry/32: Enable pt_regs based syscalls · 25c619e5

Authored by Brian Gerst on Mar 13, 2020

Enable pt_regs based syscalls for 32-bit. This makes the 32-bit native
kernel consistent with the 64-bit kernel, and improves the syscall
interface by not needing to push all 6 potential arguments onto the stack.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-17-brgerst@gmail.com

25c619e5

x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments · 121b32a5

Authored by Brian Gerst on Mar 13, 2020

For the 32-bit syscall interface, 64-bit arguments (loff_t) are passed via
a pair of 32-bit registers. These register pairs end up in consecutive stack
slots, which matches the C ABI for 64-bit arguments. But when accessing the
registers directly from pt_regs, the wrapper needs to manually reassemble the
64-bit value. These wrappers already exist for 32-bit compat, so make them
available to 32-bit native in preparation for enabling pt_regs-based syscalls.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-16-brgerst@gmail.com

121b32a5
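
The manual reassembly mentioned above, where a 64-bit argument arrives as two 32-bit register values, comes down to a shift and an OR. The sketch below is a standalone illustration; the function names `join_u32_pair` and `split_u64` are hypothetical and are not the kernel's wrapper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reassemble a 64-bit value from its low and high 32-bit halves. */
uint64_t join_u32_pair(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Inverse operation, included only to sanity-check the round trip. */
void split_u64(uint64_t value, uint32_t *lo, uint32_t *hi)
{
	assert(lo != NULL && hi != NULL);
	*lo = (uint32_t)(value & 0xffffffffu);
	*hi = (uint32_t)(value >> 32);
}
```

The cast to `uint64_t` before the shift matters: shifting a 32-bit value left by 32 bits would be undefined behavior in C.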

x86/entry/32: Rename 32-bit specific syscalls · 866128a9

Authored by Brian Gerst on Mar 13, 2020

Rename the syscalls that only exist for 32-bit from x86_* to ia32_* to make it
clear they are for 32-bit only.  Also rename the functions to match the syscall
name.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-15-brgerst@gmail.com

866128a9

x86/entry/32: Clean up syscall_32.tbl · a845a6cf

Authored by Brian Gerst on Mar 13, 2020

After removal of the __ia32_ prefix, remove compat entries that are now
identical to the native entry.

Converted with this script and fixing up whitespace:

while read nr abi name entry compat; do
    if [ "${nr:0:1}" = "#" ]; then
        echo $nr $abi $name $entry $compat
        continue
    fi
    if [ "$entry" = "$compat" ]; then
        compat=""
    fi
    echo "$nr	$abi	$name		$entry		$compat"
done
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-14-brgerst@gmail.com

a845a6cf

x86/entry: Remove ABI prefixes from functions in syscall tables · cab56d34

Authored by Brian Gerst on Mar 13, 2020

Move the ABI prefixes to the __SYSCALL_[abi]() macros.  This allows removal
of the need to strip the prefix for UML.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-13-brgerst@gmail.com

cab56d34

x86/entry/64: Add __SYSCALL_COMMON() · 8210efcb

Authored by Brian Gerst on Mar 13, 2020

Add a __SYSCALL_COMMON() macro to the syscall table, which simplifies syscalltbl.sh.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-12-brgerst@gmail.com

8210efcb

x86/entry: Remove syscall qualifier support · b5592e5c

Authored by Brian Gerst on Mar 13, 2020

Syscall qualifier support is no longer needed.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-11-brgerst@gmail.com

b5592e5c

x86/entry/64: Remove ptregs qualifier from syscall table · d3b1b776

Authored by Brian Gerst on Mar 13, 2020

Now that the fast syscall path is removed, the ptregs qualifier is unused.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-10-brgerst@gmail.com

d3b1b776

x86/entry: Move max syscall number calculation to syscallhdr.sh · 08720988

Authored by Brian Gerst on Mar 13, 2020

Instead of using an array in asm-offsets to calculate the max syscall
number, calculate it when writing out the syscall headers.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-9-brgerst@gmail.com

08720988

x86/entry/64: Split X32 syscall table into its own file · 2e487c35

Authored by Brian Gerst on Mar 13, 2020

Since X32 has its own syscall table now, move it to a separate file.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-8-brgerst@gmail.com

2e487c35

x86/entry/64: Move sys_ni_syscall stub to common.c · cc42c045

Authored by Brian Gerst on Mar 13, 2020

Move the sys_ni_syscall stub to common.c so that it is available to multiple
syscall tables.  Also directly return -ENOSYS instead of bouncing to the
generic sys_ni_syscall().
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-7-brgerst@gmail.com

cc42c045
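
A stub that returns -ENOSYS directly and is shared by every unimplemented table slot can be sketched in userspace C as below. The table contents, the `sketch_*` names and the argument-free function type are all hypothetical simplifications, not the kernel's syscall table layout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified handler type: real syscalls take arguments, these do not. */
typedef long (*sketch_syscall_fn)(void);

/* Shared stub for unimplemented slots: return -ENOSYS directly. */
static long sketch_ni_syscall(void)
{
	return -ENOSYS;
}

/* A hypothetical implemented call, just to have a non-stub entry. */
static long sketch_getanswer(void)
{
	return 42;
}

static const sketch_syscall_fn sketch_table[] = {
	sketch_ni_syscall,  /* 0: not implemented */
	sketch_getanswer,   /* 1: implemented */
	sketch_ni_syscall,  /* 2: not implemented */
};

/* Dispatch by number; out-of-range numbers also yield -ENOSYS. */
long sketch_dispatch(size_t nr)
{
	if (nr >= sizeof(sketch_table) / sizeof(sketch_table[0]))
		return -ENOSYS;
	assert(sketch_table[nr] != NULL);
	return sketch_table[nr]();
}
```

Since the stub is an ordinary function, any number of tables can point unimplemented slots at it.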
