- 13 7月, 2019 1 次提交
-
-
由 Vlastimil Babka 提交于
When debug_pagealloc is enabled, we currently allocate the page_ext array to mark guard pages with the PAGE_EXT_DEBUG_GUARD flag. Now that we have the page_type field in struct page, we can use that instead, as guard pages are neither PageSlab nor mapped to userspace. This reduces memory overhead when debug_pagealloc is enabled and there are no other features requiring the page_ext array. Link: http://lkml.kernel.org/r/20190603143451.27353-4-vbabka@suse.czSigned-off-by: NVlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 7月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
The FSGSBASE series turned out to have serious bugs and there is still an open issue which is not fully understood yet. The confidence in those changes has become close to zero especially as the test cases which have been shipped with that series were obviously never run before sending the final series out to LKML. ./fsgsbase_64 >/dev/null Segmentation fault As the merge window is close, the only sane decision is to revert FSGSBASE support. The revert is necessary as this branch has been merged into perf/core already and rebasing all of that a few days before the merge window is not the most brilliant idea. I could definitely slap myself for not noticing the test case fail when merging that series, but TBH my expectations weren't that low back then. Won't happen again. Revert the following commits: 539bca53 ("x86/entry/64: Fix and clean up paranoid_exit") 2c7b5ac5 ("Documentation/x86/64: Add documentation for GS/FS addressing mode") f987c955 ("x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2") 2032f1f9 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit") 5bf0cab6 ("x86/entry/64: Document GSBASE handling in the paranoid path") 708078f6 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit") 79e1932f ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro") 1d07316b ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry") f60a83df ("x86/process/64: Use FSGSBASE instructions on thread copy and ptrace") 1ab5f3f7 ("x86/process/64: Use FSBSBASE in switch_to() if available") a86b4625 ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions") 8b71340d ("x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions") b64ed19b ("x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE") Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NIngo Molnar <mingo@kernel.org> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com>
-
- 28 6月, 2019 2 次提交
-
-
由 Andy Lutomirski 提交于
With vsyscall emulation on, a readable vsyscall page is still exposed that contains syscall instructions that validly implement the vsyscalls. This is required because certain dynamic binary instrumentation tools attempt to read the call targets of call instructions in the instrumented code. If the instrumented code uses vsyscalls, then the vsyscall page needs to contain readable code. Unfortunately, leaving readable memory at a deterministic address can be used to help various ASLR bypasses, so some hardening value can be gained by disallowing vsyscall reads. Given how rarely the vsyscall page needs to be readable, add a mechanism to make the vsyscall page be execute only. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/d17655777c21bc09a7af1bbcf74e6f2b69a51152.1561610354.git.luto@kernel.org
-
由 Andy Lutomirski 提交于
The vsyscall=native feature is gone -- remove the docs. Fixes: 076ca272 ("x86/vsyscall/64: Drop "native" vsyscalls") Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NKees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/d77c7105eb4c57c1a95a95b6a5b8ba194a18e764.1561610354.git.luto@kernel.org
-
- 22 6月, 2019 2 次提交
-
-
由 Andy Lutomirski 提交于
Now that FSGSBASE is fully supported, remove unsafe_fsgsbase, enable FSGSBASE by default, and add nofsgsbase to disable it. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://lkml.kernel.org/r/1557309753-24073-17-git-send-email-chang.seok.bae@intel.com
-
由 Andy Lutomirski 提交于
This is temporary. It will allow the next few patches to be tested incrementally. Setting unsafe_fsgsbase is a root hole. Don't do it. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://lkml.kernel.org/r/1557309753-24073-4-git-send-email-chang.seok.bae@intel.com
-
- 15 6月, 2019 6 次提交
-
-
由 Mauro Carvalho Chehab 提交于
Sphinx need to know when a paragraph ends. So, do some adjustments at the file for it to be properly parsed. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. that's said, I believe that this file should be moved to the GPU/DRM documentation. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Mauro Carvalho Chehab 提交于
Convert those documents and prepare them to be part of the kernel API book, as most of the stuff there are related to the Kernel interfaces. Still, in the future, it would make sense to split the docs, as some of the stuff is clearly focused on sysadmin tasks. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Reviewed-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Mauro Carvalho Chehab 提交于
Convert the cgroup-v1 files to ReST format, in order to allow a later addition to the admin-guide. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Mauro Carvalho Chehab 提交于
Convert kdump documentation to ReST and add it to the user faced manual, as the documents are mainly focused on sysadmins that would be enabling kdump. Note: the vmcoreinfo.rst has one very long title on one of its sub-sections: PG_lru|PG_private|PG_swapcache|PG_swapbacked|PG_slab|PG_hwpoision|PG_head_mask|PAGE_BUDDY_MAPCOUNT_VALUE(~PG_buddy)|PAGE_OFFLINE_MAPCOUNT_VALUE(~PG_offline) I opted to break this one, into two entries with the same content, in order to make it easier to display after being parsed in html and PDF. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Mauro Carvalho Chehab 提交于
The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Mauro Carvalho Chehab 提交于
The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Also, removed the Maintained by, as requested by Geert. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 11 6月, 2019 1 次提交
-
-
由 Mauro Carvalho Chehab 提交于
Convert all text files with s390 documentation to ReST format. Tried to preserve as much as possible the original document format. Still, some of the files required some work in order for it to be visible on both plain text and after converted to html. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
-
- 09 6月, 2019 2 次提交
-
-
由 Mauro Carvalho Chehab 提交于
The hisax driver got removed on 85993b8c ("isdn: remove hisax driver"), but a left-over was kept at kernel-parameters.txt. Fixes: 85993b8c ("isdn: remove hisax driver") Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Mauro Carvalho Chehab 提交于
Mostly due to x86 and acpi conversion, several documentation links are still pointing to the old file. Fix them. Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Reviewed-by: NWolfram Sang <wsa@the-dreams.de> Reviewed-by: NSven Van Asbroeck <TheSven73@gmail.com> Reviewed-by: NBhupesh Sharma <bhsharma@redhat.com> Acked-by: NMark Brown <broonie@kernel.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 30 5月, 2019 1 次提交
-
-
由 Zhenzhong Duan 提交于
The default behavior of hardlockup depends on the config of CONFIG_BOOTPARAM_HARDLOCKUP_PANIC. Fix the description of nmi_watchdog to make it clear. Suggested-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NZhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org> Acked-by: NIngo Molnar <mingo@kernel.org> Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-doc@vger.kernel.org Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 26 5月, 2019 1 次提交
-
-
Some workloads need to change kthread priority for RCU core processing without affecting other softirq work. This commit therefore introduces the rcutree.use_softirq kernel boot parameter, which moves the RCU core work from softirq to a per-CPU SCHED_OTHER kthread named rcuc. Use of SCHED_OTHER approach avoids the scalability problems that appeared with the earlier attempt to move RCU core processing to from softirq to kthreads. That said, kernels built with RCU_BOOST=y will run the rcuc kthreads at the RCU-boosting priority. Note that rcutree.use_softirq=0 must be specified to move RCU core processing to the rcuc kthreads: rcutree.use_softirq=1 is the default. Reported-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> [ paulmck: Adjust for invoke_rcu_callbacks() only ever being invoked from RCU core processing, in contrast to softirq->rcuc transition in old mainline RCU priority boosting. ] [ paulmck: Avoid wakeups when scheduler might have invoked rcu_read_unlock() while holding rq or pi locks, also possibly fixing a pre-existing latent bug involving raise_softirq()-induced wakeups. ] Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com>
-
- 19 5月, 2019 1 次提交
-
-
由 Feng Tang 提交于
Currently on panic, kernel will lower the loglevel and print out pending printk msg only with console_flush_on_panic(). Add an option for users to configure the "panic_print" to replay all dmesg in buffer, some of which they may have never seen due to the loglevel setting, which will help panic debugging . [feng.tang@intel.com: keep the original console_flush_on_panic() inside panic()] Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com [feng.tang@intel.com: use logbuf lock to protect the console log index] Link: http://lkml.kernel.org/r/1556269868-22654-1-git-send-email-feng.tang@intel.com Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.comSigned-off-by: NFeng Tang <feng.tang@intel.com> Reviewed-by: NPetr Mladek <pmladek@suse.com> Cc: Aaro Koskinen <aaro.koskinen@nokia.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 5月, 2019 3 次提交
-
-
由 Waiman Long 提交于
The maximum number of unique System V IPC identifiers was limited to 32k. That limit should be big enough for most use cases. However, there are some users out there requesting for more, especially those that are migrating from Solaris which uses 24 bits for unique identifiers. To satisfy the need of those users, a new boot time kernel option "ipcmni_extend" is added to extend the IPCMNI value to 16M. This is a 512X increase which should be big enough for users out there that need a large number of unique IPC identifier. The use of this new option will change the pattern of the IPC identifiers returned by functions like shmget(2). An application that depends on such pattern may not work properly. So it should only be used if the users really need more than 32k of unique IPC numbers. This new option does have the side effect of reducing the maximum number of unique sequence numbers from 64k down to 128. So it is a trade-off. The computation of a new IPC id is not done in the performance critical path. So a little bit of additional overhead shouldn't have any real performance impact. Link: http://lkml.kernel.org/r/20190329204930.21620-1-longman@redhat.comSigned-off-by: NWaiman Long <longman@redhat.com> Acked-by: NManfred Spraul <manfred@colorfullife.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Aaro Koskinen 提交于
Allow specifying reboot_mode for panic only. This is needed on systems where ramoops is used to store panic logs, and user wants to use warm reset to preserve those, while still having cold reset on normal reboots. Link: http://lkml.kernel.org/r/20190322004735.27702-1-aaro.koskinen@iki.fiSigned-off-by: NAaro Koskinen <aaro.koskinen@nokia.com> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Patch series "mm: Randomize free memory", v10. This patch (of 3): Randomization of the page allocator improves the average utilization of a direct-mapped memory-side-cache. Memory side caching is a platform capability that Linux has been previously exposed to in HPC (high-performance computing) environments on specialty platforms. In that instance it was a smaller pool of high-bandwidth-memory relative to higher-capacity / lower-bandwidth DRAM. Now, this capability is going to be found on general purpose server platforms where DRAM is a cache in front of higher latency persistent memory [1]. Robert offered an explanation of the state of the art of Linux interactions with memory-side-caches [2], and I copy it here: It's been a problem in the HPC space: http://www.nersc.gov/research-and-development/knl-cache-mode-performance-coe/ A kernel module called zonesort is available to try to help: https://software.intel.com/en-us/articles/xeon-phi-software and this abandoned patch series proposed that for the kernel: https://lkml.kernel.org/r/20170823100205.17311-1-lukasz.daniluk@intel.com Dan's patch series doesn't attempt to ensure buffers won't conflict, but also reduces the chance that the buffers will. This will make performance more consistent, albeit slower than "optimal" (which is near impossible to attain in a general-purpose kernel). That's better than forcing users to deploy remedies like: "To eliminate this gradual degradation, we have added a Stream measurement to the Node Health Check that follows each job; nodes are rebooted whenever their measured memory bandwidth falls below 300 GB/s." A replacement for zonesort was merged upstream in commit cc9aec03 ("x86/numa_emulation: Introduce uniform split capability"). With this numa_emulation capability, memory can be split into cache sized ("near-memory" sized) numa nodes. A bind operation to such a node, and disabling workloads on other nodes, enables full cache performance. However, once the workload exceeds the cache size then cache conflicts are unavoidable. While HPC environments might be able to tolerate time-scheduling of cache sized workloads, for general purpose server platforms, the oversubscribed cache case will be the common case. The worst case scenario is that a server system owner benchmarks a workload at boot with an un-contended cache only to see that performance degrade over time, even below the average cache performance due to excessive conflicts. Randomization clips the peaks and fills in the valleys of cache utilization to yield steady average performance. Here are some performance impact details of the patches: 1/ An Intel internal synthetic memory bandwidth measurement tool, saw a 3X speedup in a contrived case that tries to force cache conflicts. The contrived cased used the numa_emulation capability to force an instance of the benchmark to be run in two of the near-memory sized numa nodes. If both instances were placed on the same emulated they would fit and cause zero conflicts. While on separate emulated nodes without randomization they underutilized the cache and conflicted unnecessarily due to the in-order allocation per node. 2/ A well known Java server application benchmark was run with a heap size that exceeded cache size by 3X. The cache conflict rate was 8% for the first run and degraded to 21% after page allocator aging. With randomization enabled the rate levelled out at 11%. 3/ A MongoDB workload did not observe measurable difference in cache-conflict rates, but the overall throughput dropped by 7% with randomization in one case. 4/ Mel Gorman ran his suite of performance workloads with randomization enabled on platforms without a memory-side-cache and saw a mix of some improvements and some losses [3]. While there is potentially significant improvement for applications that depend on low latency access across a wide working-set, the performance may be negligible to negative for other workloads. For this reason the shuffle capability defaults to off unless a direct-mapped memory-side-cache is detected. Even then, the page_alloc.shuffle=0 parameter can be specified to disable the randomization on those systems. Outside of memory-side-cache utilization concerns there is potentially security benefit from randomization. Some data exfiltration and return-oriented-programming attacks rely on the ability to infer the location of sensitive data objects. The kernel page allocator, especially early in system boot, has predictable first-in-first out behavior for physical pages. Pages are freed in physical address order when first onlined. Quoting Kees: "While we already have a base-address randomization (CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and memory layouts would certainly be using the predictability of allocation ordering (i.e. for attacks where the base address isn't important: only the relative positions between allocated memory). This is common in lots of heap-style attacks. They try to gain control over ordering by spraying allocations, etc. I'd really like to see this because it gives us something similar to CONFIG_SLAB_FREELIST_RANDOM but for the page allocator." While SLAB_FREELIST_RANDOM reduces the predictability of some local slab caches it leaves vast bulk of memory to be predictably in order allocated. However, it should be noted, the concrete security benefits are hard to quantify, and no known CVE is mitigated by this randomization. Introduce shuffle_free_memory(), and its helper shuffle_zone(), to perform a Fisher-Yates shuffle of the page allocator 'free_area' lists when they are initially populated with free memory at boot and at hotplug time. Do this based on either the presence of a page_alloc.shuffle=Y command line parameter, or autodetection of a memory-side-cache (to be added in a follow-on patch). The shuffling is done in terms of CONFIG_SHUFFLE_PAGE_ORDER sized free pages where the default CONFIG_SHUFFLE_PAGE_ORDER is MAX_ORDER-1 i.e. 10, 4MB this trades off randomization granularity for time spent shuffling. MAX_ORDER-1 was chosen to be minimally invasive to the page allocator while still showing memory-side cache behavior improvements, and the expectation that the security implications of finer granularity randomization is mitigated by CONFIG_SLAB_FREELIST_RANDOM. The performance impact of the shuffling appears to be in the noise compared to other memory initialization work. This initial randomization can be undone over time so a follow-on patch is introduced to inject entropy on page free decisions. It is reasonable to ask if the page free entropy is sufficient, but it is not enough due to the in-order initial freeing of pages. At the start of that process putting page1 in front or behind page0 still keeps them close together, page2 is still near page1 and has a high chance of being adjacent. As more pages are added ordering diversity improves, but there is still high page locality for the low address pages and this leads to no significant impact to the cache conflict rate. [1]: https://itpeernetwork.intel.com/intel-optane-dc-persistent-memory-operating-modes/ [2]: https://lkml.kernel.org/r/AT5PR8401MB1169D656C8B5E121752FC0F8AB120@AT5PR8401MB1169.NAMPRD84.PROD.OUTLOOK.COM [3]: https://lkml.org/lkml/2018/10/12/309 [dan.j.williams@intel.com: fix shuffle enable] Link: http://lkml.kernel.org/r/154943713038.3858443.4125180191382062871.stgit@dwillia2-desk3.amr.corp.intel.com [cai@lca.pw: fix SHUFFLE_PAGE_ALLOCATOR help texts] Link: http://lkml.kernel.org/r/20190425201300.75650-1-cai@lca.pw Link: http://lkml.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NQian Cai <cai@lca.pw> Reviewed-by: NKees Cook <keescook@chromium.org> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Robert Elliott <elliott@hpe.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 5月, 2019 1 次提交
-
-
由 Josh Poimboeuf 提交于
Configure arm64 runtime CPU speculation bug mitigations in accordance with the 'mitigations=' cmdline option. This affects Meltdown, Spectre v2, and Speculative Store Bypass. The default behavior is unchanged. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> [will: reorder checks so KASLR implies KPTI and SSBS is affected by cmdline] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 29 4月, 2019 2 次提交
-
-
由 Sebastian Ott 提交于
Allow users to disable usage of MIO instructions by specifying pci=nomio at the kernel command line. Signed-off-by: NSebastian Ott <sebott@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
Provide a kernel parameter to force the usage of floating interrupts. Signed-off-by: NSebastian Ott <sebott@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 26 4月, 2019 1 次提交
-
-
由 Jeremy Linton 提交于
There are various reasons, such as benchmarking, to disable spectrev2 mitigation on a machine. Provide a command-line option to do so. Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: NAndre Przywara <andre.przywara@arm.com> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Tested-by: NStefan Wahren <stefan.wahren@i2se.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 23 4月, 2019 1 次提交
-
-
由 Ryan Thibodeaux 提交于
Add a new command-line option "xen_timer_slop=<INT>" that sets the minimum delta of virtual Xen timers. This commit does not change the default timer slop value for virtual Xen timers. Lowering the timer slop value should improve the accuracy of virtual timers (e.g., better process dispatch latency), but it will likely increase the number of virtual timer interrupts (relative to the original slop setting). The original timer slop value has not changed since the introduction of the Xen-aware Linux kernel code. This commit provides users an opportunity to tune timer performance given the refinements to hardware and the Xen event channel processing. It also mirrors a feature in the Xen hypervisor - the "timer_slop" Xen command line option. [boris: updated comment describing TIMER_SLOP] Signed-off-by: NRyan Thibodeaux <ryan.thibodeaux@starlab.io> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 22 4月, 2019 1 次提交
-
-
由 Dave Young 提交于
crashkernel=xM tries to reserve memory for the crash kernel under 4G, which is enough, usually. But this could fail sometimes, for example when one tries to reserve a big chunk like 2G, for example. So let the crashkernel=xM just fall back to use high memory in case it fails to find a suitable low range. Do not set the ,high as default because it allocates extra low memory for DMA buffers and swiotlb, and this is not always necessary for all machines. Typically, crashkernel=128M usually works with low reservation under 4G, so keep <4G as default. [ bp: Massage. ] Signed-off-by: NDave Young <dyoung@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NIngo Molnar <mingo@kernel.org> Acked-by: NBaoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: linux-doc@vger.kernel.org Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: piliu@redhat.com Cc: Ram Pai <linuxram@us.ibm.com> Cc: Sinan Kaya <okaya@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thymo van Beers <thymovanbeers@gmail.com> Cc: vgoyal@redhat.com Cc: x86-ml <x86@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Zhimin Gu <kookoo.gu@intel.com> Link: https://lkml.kernel.org/r/20190422031905.GA8387@dhcp-128-65.nay.redhat.com
-
- 21 4月, 2019 2 次提交
-
-
由 Christophe Leroy 提交于
This patch implements a framework for Kernel Userspace Access Protection. Then subarches will have the possibility to provide their own implementation by providing setup_kuap() and allow/prevent_user_access(). Some platforms will need to know the area accessed and whether it is accessed from read, write or both. Therefore source, destination and size and handed over to the two functions. mpe: Rename to allow/prevent rather than unlock/lock, and add read/write wrappers. Drop the 32-bit code for now until we have an implementation for it. Add kuap to pt_regs for 64-bit as well as 32-bit. Don't split strings, use pr_crit_ratelimited(). Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: NRussell Currey <ruscur@russell.cc> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Christophe Leroy 提交于
This patch adds a skeleton for Kernel Userspace Execution Prevention. Then subarches implementing it have to define CONFIG_PPC_HAVE_KUEP and provide setup_kuep() function. Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> [mpe: Don't split strings, use pr_crit_ratelimited()] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 18 4月, 2019 5 次提交
-
-
由 Josh Poimboeuf 提交于
Add MDS to the new 'mitigations=' cmdline option. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Josh Poimboeuf 提交于
Configure s390 runtime CPU speculation bug mitigations in accordance with the 'mitigations=' cmdline option. This affects Spectre v1 and Spectre v2. The default behavior is unchanged. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: NJiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/e4a161805458a5ec88812aac0307ae3908a030fc.1555085500.git.jpoimboe@redhat.com
-
由 Josh Poimboeuf 提交于
Configure powerpc CPU runtime speculation bug mitigations in accordance with the 'mitigations=' cmdline option. This affects Meltdown, Spectre v1, Spectre v2, and Speculative Store Bypass. The default behavior is unchanged. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: NJiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/245a606e1a42a558a310220312d9b6adb9159df6.1555085500.git.jpoimboe@redhat.com
-
由 Josh Poimboeuf 提交于
Configure x86 runtime CPU speculation bug mitigations in accordance with the 'mitigations=' cmdline option. This affects Meltdown, Spectre v2, Speculative Store Bypass, and L1TF. The default behavior is unchanged. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: NJiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/6616d0ae169308516cfdf5216bedd169f8a8291b.1555085500.git.jpoimboe@redhat.com
-
由 Josh Poimboeuf 提交于
Keeping track of the number of mitigations for all the CPU speculation bugs has become overwhelming for many users. It's getting more and more complicated to decide which mitigations are needed for a given architecture. Complicating matters is the fact that each arch tends to have its own custom way to mitigate the same vulnerability. Most users fall into a few basic categories: a) they want all mitigations off; b) they want all reasonable mitigations on, with SMT enabled even if it's vulnerable; or c) they want all reasonable mitigations on, with SMT disabled if vulnerable. Define a set of curated, arch-independent options, each of which is an aggregation of existing options: - mitigations=off: Disable all mitigations. - mitigations=auto: [default] Enable all the default mitigations, but leave SMT enabled, even if it's vulnerable. - mitigations=auto,nosmt: Enable all the default mitigations, disabling SMT if needed by a mitigation. Currently, these options are placeholders which don't actually do anything. They will be fleshed out in upcoming patches. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: NJiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/b07a8ef9b7c5055c3a4637c87d07c296d5016fe0.1555085500.git.jpoimboe@redhat.com
-
- 11 4月, 2019 1 次提交
-
-
由 Petr Vorel 提交于
Signed-off-by: NPetr Vorel <pvorel@suse.cz> Signed-off-by: NMimi Zohar <zohar@linux.ibm.com>
-
- 03 4月, 2019 1 次提交
-
-
由 Josh Poimboeuf 提交于
Add the mds=full,nosmt cmdline option. This is like mds=full, but with SMT disabled if the CPU is vulnerable. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NTyler Hicks <tyhicks@canonical.com> Acked-by: NJiri Kosina <jkosina@suse.cz>
-
- 27 3月, 2019 1 次提交
-
-
由 Paul E. McKenney 提交于
Currently, the rcu_nocbs= kernel boot parameter requires that a specific list of CPUs be specified, and has no way to say "all of them". As noted by user RavFX in a comment to Phoronix topic 1002538, this is an inconvenient side effect of the removal of the RCU_NOCB_CPU_ALL Kconfig option. This commit therefore enables the rcu_nocbs= kernel boot parameter to be given the string "all", as in "rcu_nocbs=all" to specify that all CPUs on the system are to have their RCU callbacks offloaded. Another approach would be to make cpulist_parse() check for "all", but there are uses of cpulist_parse() that do other checking, which could conflict with an "all". This commit therefore focuses on the specific use of cpulist_parse() in rcu_nocb_setup(). Just a note to other people who would like changes to Linux-kernel RCU: If you send your requests to me directly, they might get fixed somewhat faster. RavFX's comment was posted on January 22, 2018 and I first saw it on March 5, 2019. And the only reason that I found it -at- -all- was that I was looking for projects using RCU, and my search engine showed me that Phoronix comment quite by accident. Your choice, though! ;-) Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com>
-
- 22 3月, 2019 1 次提交
-
-
由 Juri Lelli 提交于
Clocksource watchdog has been found responsible for generating latency spikes (in the 10-20 us range) when woken up to check for TSC stability. Add an option to disable it at boot. Signed-off-by: NJuri Lelli <juri.lelli@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: bigeasy@linutronix.de Cc: linux-rt-users@vger.kernel.org Cc: peterz@infradead.org Cc: bristot@redhat.com Cc: williams@redhat.com Link: https://lkml.kernel.org/r/20190307120913.13168-1-juri.lelli@redhat.com
-
- 07 3月, 2019 2 次提交
-
-
由 Thomas Gleixner 提交于
Add the initial MDS vulnerability documentation. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NJon Masters <jcm@redhat.com>
-
由 Thomas Gleixner 提交于
Move L!TF to a separate directory so the MDS stuff can be added at the side. Otherwise the all hardware vulnerabilites have their own top level entry. Should have done that right away. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NJon Masters <jcm@redhat.com>
-