You need to sign in or sign up before continuing.
- 01 5月, 2020 1 次提交
-
-
由 Peter Zijlstra 提交于
In order to change the {JMP,CALL}_NOSPEC macros to call out-of-line versions of the retpoline magic, we need to remove the '%' from the argument, such that we can paste it onto symbol names. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.151623523@infradead.org
-
- 08 4月, 2020 1 次提交
-
-
由 Masahiro Yamada 提交于
The code, #undef CONFIG_OPTIMIZE_INLINING, is not working as expected because <linux/compiler_types.h> is parsed before vclock_gettime.c since 28128c61 ("kconfig.h: Include compiler types to avoid missed struct attributes"). Since then, <linux/compiler_types.h> is included really early by using the '-include' option. So, you cannot negate the decision of <linux/compiler_types.h> in this way. You can confirm it by checking the pre-processed code, like this: $ make arch/x86/entry/vdso/vdso32/vclock_gettime.i There is no difference with/without CONFIG_CC_OPTIMIZE_FOR_SIZE. It is about two years since 28128c61. Nobody has reported a problem (or, nobody has even noticed the fact that this code is not working). It is ugly and unreliable to attempt to undefine a CONFIG option from C files, and anyway the inlining heuristic is up to the compiler. Just remove the broken code. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NNathan Chancellor <natechancellor@gmail.com> Acked-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: David Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/20200220110807.32534-1-masahiroy@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 3月, 2020 1 次提交
-
-
由 H.J. Lu 提交于
With the command-line option -mx86-used-note=yes which can also be enabled at binutils build time with: --enable-x86-used-note generate GNU x86 used ISA and feature properties the x86 assembler in binutils 2.32 and above generates a program property note in a note section, .note.gnu.property, to encode used x86 ISAs and features. But kernel linker script only contains a single NOTE segment: PHDRS { text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */ dynamic PT_DYNAMIC FLAGS(4); /* PF_R */ note PT_NOTE FLAGS(4); /* PF_R */ eh_frame_hdr 0x6474e550; } The NOTE segment generated by the vDSO linker script is aligned to 4 bytes. But the .note.gnu.property section must be aligned to 8 bytes on x86-64: [hjl@gnu-skx-1 vdso]$ readelf -n vdso64.so Displaying notes found in: .note Owner Data size Description Linux 0x00000004 Unknown note type: (0x00000000) description data: 06 00 00 00 readelf: Warning: note with invalid namesz and/or descsz found at offset 0x20 readelf: Warning: type: 0x78, namesize: 0x00000100, descsize: 0x756e694c, alignment: 8 Since the note.gnu.property section in the vDSO is not checked by the dynamic linker, discard the .note.gnu.property sections in the vDSO. [ bp: Massage. ] Signed-off-by: NH.J. Lu <hjl.tools@gmail.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NKees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200326174314.254662-1-hjl.tools@gmail.com
-
- 25 3月, 2020 1 次提交
-
-
由 Masahiro Yamada 提交于
Add SPDX License Identifier to all .gitignore files. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 3月, 2020 14 次提交
-
-
由 Peter Zijlstra 提交于
Because moar '_' isn't always moar readable. git grep -l "___preempt_schedule\(_notrace\)*" | while read file; do sed -ie 's/___preempt_schedule\(_notrace\)*/preempt_schedule\1_thunk/g' $file; done Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NWill Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200320115858.995685950@infradead.org
-
由 Brian Gerst 提交于
asmlinkage is no longer required since the syscall ABI is now fully under x86 architecture control. This makes the 32-bit native syscalls a bit more effecient by passing in regs via EAX instead of on the stack. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-18-brgerst@gmail.com
-
由 Brian Gerst 提交于
Enable pt_regs based syscalls for 32-bit. This makes the 32-bit native kernel consistent with the 64-bit kernel, and improves the syscall interface by not needing to push all 6 potential arguments onto the stack. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/20200313195144.164260-17-brgerst@gmail.com
-
由 Brian Gerst 提交于
For the 32-bit syscall interface, 64-bit arguments (loff_t) are passed via a pair of 32-bit registers. These register pairs end up in consecutive stack slots, which matches the C ABI for 64-bit arguments. But when accessing the registers directly from pt_regs, the wrapper needs to manually reassemble the 64-bit value. These wrappers already exist for 32-bit compat, so make them available to 32-bit native in preparation for enabling pt_regs-based syscalls. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/20200313195144.164260-16-brgerst@gmail.com
-
由 Brian Gerst 提交于
Rename the syscalls that only exist for 32-bit from x86_* to ia32_* to make it clear they are for 32-bit only. Also rename the functions to match the syscall name. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/20200313195144.164260-15-brgerst@gmail.com
-
由 Brian Gerst 提交于
After removal of the __ia32_ prefix, remove compat entries that are now identical to the native entry. Converted with this script and fixing up whitespace: while read nr abi name entry compat; do if [ "${nr:0:1}" = "#" ]; then echo $nr $abi $name $entry $compat continue fi if [ "$entry" = "$compat" ]; then compat="" fi echo "$nr $abi $name $entry $compat" done Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-14-brgerst@gmail.com
-
由 Brian Gerst 提交于
Move the ABI prefixes to the __SYSCALL_[abi]() macros. This allows removal of the need to strip the prefix for UML. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-13-brgerst@gmail.com
-
由 Brian Gerst 提交于
Add a __SYSCALL_COMMON() macro to the syscall table, which simplifies syscalltbl.sh. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-12-brgerst@gmail.com
-
由 Brian Gerst 提交于
Syscall qualifier support is no longer needed. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/20200313195144.164260-11-brgerst@gmail.com
-
由 Brian Gerst 提交于
Now that the fast syscall path is removed, the ptregs qualifier is unused. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/20200313195144.164260-10-brgerst@gmail.com
-
由 Brian Gerst 提交于
Instead of using an array in asm-offsets to calculate the max syscall number, calculate it when writing out the syscall headers. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-9-brgerst@gmail.com
-
由 Brian Gerst 提交于
Since X32 has its own syscall table now, move it to a separate file. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/20200313195144.164260-8-brgerst@gmail.com
-
由 Brian Gerst 提交于
so it can be available to multiple syscall tables. Also directly return -ENOSYS instead of bouncing to the generic sys_ni_syscall(). Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-7-brgerst@gmail.com
-
由 Brian Gerst 提交于
Add missing syscall wrapper for x32_rt_sigreturn(). Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-6-brgerst@gmail.com
-
- 10 3月, 2020 2 次提交
-
-
由 Thomas Gleixner 提交于
User space cannot disable interrupts any longer so trace return to user space unconditionally as IRQS_ON. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200308222609.314596327@linutronix.de
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200308222609.219366430@linutronix.de
-
- 29 2月, 2020 1 次提交
-
-
由 Thomas Gleixner 提交于
Nothing cares about the -1 "mark as interrupt" in the errorcode of exception entries. It's only used to fill the error code when a signal is delivered, but this is already inconsistent vs. 64 bit as there all exceptions which do not have an error code set it to 0. So if 32 bit applications would care about this, then they would have noticed more than a decade ago. Just use 0 for all excpetions which do not have an errorcode consistently. This does neither break /proc/$PID/syscall because this interface examines the error code / syscall number which is on the stack and that is set to -1 (no syscall) in common_exception unconditionally for all exceptions. The push in the entry stub is just there to fill the hardware error code slot on the stack for consistency of the stack layout. A transient observation of 0 is possible, but that's true for the other exceptions which use 0 already as well and that interface is an unreliable snapshot of dubious correctness anyway. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/87mu94m7ky.fsf@nanos.tec.linutronix.de
-
- 27 2月, 2020 3 次提交
-
-
由 Thomas Gleixner 提交于
int3 is not using the common_exception path for purely historical reasons, but there is no reason to keep it the only exception which is different. Make it use common_exception so the upcoming changes to autogenerate the entry stubs do not have to special case int3. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200225220217.042369808@linutronix.de
-
由 Thomas Gleixner 提交于
Remove the pointless difference between 32 and 64 bit to make further unifications simpler. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200225220216.428188397@linutronix.de
-
由 Thomas Gleixner 提交于
All exception entry points must have ASM_CLAC right at the beginning. The general_protection entry is missing one. Fixes: e59d1b0a ("x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry") Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200225220216.219537887@linutronix.de
-
- 18 2月, 2020 1 次提交
-
-
由 Benjamin Thiel 提交于
.. in order to fix a couple of -Wmissing-prototypes warnings. No functional change. [ bp: Massage commit message and drop newlines. ] Signed-off-by: NBenjamin Thiel <b.thiel@posteo.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200123152754.20149-1-b.thiel@posteo.de
-
- 17 2月, 2020 2 次提交
-
-
由 Thomas Gleixner 提交于
Switch to the generic VDSO clock mode storage. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> (VDSO parts) Acked-by: Juergen Gross <jgross@suse.com> (Xen parts) Acked-by: Paolo Bonzini <pbonzini@redhat.com> (KVM parts) Link: https://lkml.kernel.org/r/20200207124403.152039903@linutronix.de
-
由 Thomas Gleixner 提交于
All architectures which use the generic VDSO code have their own storage for the VDSO clock mode. That's pointless and just requires duplicate code. X86 abuses the function which retrieves the architecture specific clock mode storage to mark the clocksource as used in the VDSO. That's silly because this is invoked on every tick when the VDSO data is updated. Move this functionality to the clocksource::enable() callback so it gets invoked once when the clocksource is installed. This allows to make the clock mode storage generic. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mikelley@microsoft.com> (Hyper-V parts) Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> (VDSO parts) Acked-by: Juergen Gross <jgross@suse.com> (Xen parts) Link: https://lkml.kernel.org/r/20200207124402.934519777@linutronix.de
-
- 04 2月, 2020 1 次提交
-
-
由 Masahiro Yamada 提交于
In old days, the "host-progs" syntax was used for specifying host programs. It was renamed to the current "hostprogs-y" in 2004. It is typically useful in scripts/Makefile because it allows Kbuild to selectively compile host programs based on the kernel configuration. This commit renames like follows: always -> always-y hostprogs-y -> hostprogs So, scripts/Makefile will look like this: always-$(CONFIG_BUILD_BIN2C) += ... always-$(CONFIG_KALLSYMS) += ... ... hostprogs := $(always-y) $(always-m) I think this makes more sense because a host program is always a host program, irrespective of the kernel configuration. We want to specify which ones to compile by CONFIG options, so always-y will be handier. The "always", "hostprogs-y", "hostprogs-m" will be kept for backward compatibility for a while. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 18 1月, 2020 1 次提交
-
-
由 Aleksa Sarai 提交于
/* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). Userspace also has a hard time figuring out whether a particular flag is supported on a particular kernel. While it is now possible with contemporary kernels (thanks to [3]), older kernels will expose unknown flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during openat(2) time matches modern syscall designs and is far more fool-proof. In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. We'd therefore like to add a new flag argument. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. */ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. */ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). It also only contains u64s (even though ->mode arguably should be a u16) to avoid having padding fields which are never used in the future. Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). After input from Florian Weimer, the new open_how and flag definitions are inside a separate header from uapi/linux/fcntl.h, to avoid problems that glibc has with importing that header. /* Testing. */ In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). /* Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking auto-mount during resolution). Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[5]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. Another possible avenue of future work would be some kind of CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace which openat2(2) flags and fields are supported by the current kernel (to avoid userspace having to go through several guesses to figure it out). [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: commit 629e014b ("fs: completely ignore unknown open flags") [4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ [6]: https://youtu.be/ggD-eb3yPVsSuggested-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NAleksa Sarai <cyphar@cyphar.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 14 1月, 2020 7 次提交
-
-
由 Dmitry Safonov 提交于
The VVAR page layout depends on whether a task belongs to the root or non-root time namespace. Whenever a task changes its namespace, the VVAR page tables are cleared and then they will be re-faulted with a corresponding layout. Co-developed-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-27-dima@arista.com
-
由 Dmitry Safonov 提交于
As timens page has offsets to data on VVAR page VVAR is going to be accessed shortly. Set it up with timens in one page fault as optimization. Suggested-by: NThomas Gleixner <tglx@linutronix.de> Co-developed-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-26-dima@arista.com
-
由 Dmitry Safonov 提交于
If a task belongs to a time namespace then the VVAR page which contains the system wide VDSO data is replaced with a namespace specific page which has the same layout as the VVAR page. Co-developed-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-25-dima@arista.com
-
由 Dmitry Safonov 提交于
To support time namespaces in the VDSO with a minimal impact on regular non time namespace affected tasks, the namespace handling needs to be hidden in a slow path. The most obvious place is vdso_seq_begin(). If a task belongs to a time namespace then the VVAR page which contains the system wide VDSO data is replaced with a namespace specific page which has the same layout as the VVAR page. That page has vdso_data->seq set to 1 to enforce the slow path and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time namespace handling path. The extra check in the case that vdso_data->seq is odd, e.g. a concurrent update of the VDSO data is in progress, is not really affecting regular tasks which are not part of a time namespace as the task is spin waiting for the update to finish and vdso_data->seq to become even again. If a time namespace task hits that code path, it invokes the corresponding time getter function which retrieves the real VVAR page, reads host time and then adds the offset for the requested clock which is stored in the special VVAR page. Allocate the time namespace page among VVAR pages and place vdso_data on it. Provide __arch_get_timens_vdso_data() helper for VDSO code to get the code-relative position of VVARs on that special page. Co-developed-by: NAndrei Vagin <avagin@openvz.org> Signed-off-by: NAndrei Vagin <avagin@openvz.org> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-23-dima@arista.com
-
由 Dmitry Safonov 提交于
VDSO support for time namespaces needs to set up a page with the same layout as VVAR. That timens page will be placed on position of VVAR page inside namespace. That page has vdso_data->seq set to 1 to enforce the slow path and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time namespace handling path. To prepare the time namespace page the kernel needs to know the vdso_data offset. Provide arch_get_vdso_data() helper for locating vdso_data on VVAR page. Co-developed-by: NAndrei Vagin <avagin@openvz.org> Signed-off-by: NAndrei Vagin <avagin@openvz.org> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-22-dima@arista.com
-
由 Dmitry Safonov 提交于
Forbid splitting VVAR VMA resulting in a stricter ABI and reducing the amount of corner-cases to consider while working further on VDSO time namespace support. As the offset from timens to VVAR page is computed compile-time, the pages in VVAR should stay together and not being partically mremap()'ed. Co-developed-by: NAndrei Vagin <avagin@openvz.org> Signed-off-by: NAndrei Vagin <avagin@openvz.org> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-20-dima@arista.com
-
由 Sargun Dhillon 提交于
This wires up the pidfd_getfd syscall for all architectures. Signed-off-by: NSargun Dhillon <sargun@sargun.me> Acked-by: NChristian Brauner <christian.brauner@ubuntu.com> Reviewed-by: NArnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20200107175927.4558-4-sargun@sargun.meSigned-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
- 09 1月, 2020 1 次提交
-
-
由 Jan Beulich 提交于
ignore_sysret() contains an unsuffixed SYSRET instruction. gas correctly interprets this as SYSRETL, but leaving it up to gas to guess when there is no register operand that implies a size is bad practice, and upstream gas is likely to warn about this in the future. Use SYSRETL explicitly. This does not change the assembled output. Signed-off-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NAndy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/038a7c35-062b-a285-c6d2-653b56585844@suse.com
-
- 29 12月, 2019 1 次提交
-
-
由 Valdis Klētnieks 提交于
When building with C=1, sparse issues a warning: CHECK arch/x86/entry/vdso/vdso32-setup.c arch/x86/entry/vdso/vdso32-setup.c:28:28: warning: symbol 'vdso32_enabled' was not declared. Should it be static? Provide the missing header file. Signed-off-by: NValdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/36224.1575599767@turing-police
-
- 27 11月, 2019 2 次提交
-
-
由 Borislav Petkov 提交于
Signed-off-by: NBorislav Petkov <bp@alien8.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
The old x86_32 doublefault_fn() was old and crufty, and it did not even try to recover. do_double_fault() is much nicer. Rewrite the 32-bit double fault code to sanitize CPU state and call do_double_fault(). This is mostly an exercise i386 archaeology. With this patch applied, 32-bit double faults get a real stack trace, just like 64-bit double faults. [ mingo: merged the patch to a later kernel base. ] Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-