- 07 11月, 2018 1 次提交
-
-
由 Kai-Heng Feng 提交于
Devices connected under Terminus Technology Inc. Hub (1a40:0101) may fail to work after the system resumes from suspend: [ 206.063325] usb 3-2.4: reset full-speed USB device number 4 using xhci_hcd [ 206.143691] usb 3-2.4: device descriptor read/64, error -32 [ 206.351671] usb 3-2.4: device descriptor read/64, error -32 Info for this hub: T: Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 2 Spd=480 MxCh= 4 D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1a40 ProdID=0101 Rev=01.11 S: Product=USB 2.0 Hub C: #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub Some expirements indicate that the USB devices connected to the hub are innocent, it's the hub itself is to blame. The hub needs extra delay time after it resets its port. Hence wait for extra delay, if the device is connected to this quirky hub. Signed-off-by: NKai-Heng Feng <kai.heng.feng@canonical.com> Cc: stable <stable@vger.kernel.org> Acked-by: NAlan Stern <stern@rowland.harvard.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 27 10月, 2018 1 次提交
-
-
由 Alexander Duyck 提交于
Patch series "Address issues slowing persistent memory initialization", v5. The main thing this patch set achieves is that it allows us to initialize each node worth of persistent memory independently. As a result we reduce page init time by about 2 minutes because instead of taking 30 to 40 seconds per node and going through each node one at a time, we process all 4 nodes in parallel in the case of a 12TB persistent memory setup spread evenly over 4 nodes. This patch (of 3): On systems with a large amount of memory it can take a significant amount of time to initialize all of the page structs with the PAGE_POISON_PATTERN value. I have seen it take over 2 minutes to initialize a system with over 12TB of RAM. In order to work around the issue I had to disable CONFIG_DEBUG_VM and then the boot time returned to something much more reasonable as the arch_add_memory call completed in milliseconds versus seconds. However in doing that I had to disable all of the other VM debugging on the system. In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on a system that has a large amount of memory I have added a new kernel parameter named "vm_debug" that can be set to "-" in order to disable it. Link: http://lkml.kernel.org/r/20180925201921.3576.84239.stgit@localhost.localdomainReviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com> Signed-off-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 10月, 2018 1 次提交
-
-
由 Kees Cook 提交于
Booting with "lsm.debug" will report future details on how LSM ordering decisions are being made. Signed-off-by: NKees Cook <keescook@chromium.org> Reviewed-by: NCasey Schaufler <casey@schaufler-ca.com> Reviewed-by: NJohn Johansen <john.johansen@canonical.com> Reviewed-by: NJames Morris <james.morris@microsoft.com> Signed-off-by: NJames Morris <james.morris@microsoft.com>
-
- 09 10月, 2018 1 次提交
-
-
由 Yi Sun 提交于
Implement the required wait and kick callbacks to support PV spinlocks in Hyper-V guests. [ tglx: Document the requirement for disabling interrupts in the wait() callback. Remove goto and unnecessary includes. Add prototype for hv_vcpu_is_preempted(). Adapted to pending paravirt changes. ] Signed-off-by: NYi Sun <yi.y.sun@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NJuergen Gross <jgross@suse.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com> Cc: chao.p.peng@intel.com Cc: chao.gao@intel.com Cc: isaku.yamahata@intel.com Cc: tianyu.lan@microsoft.com Link: https://lkml.kernel.org/r/1538987374-51217-3-git-send-email-yi.y.sun@linux.intel.com
-
- 03 10月, 2018 3 次提交
-
-
由 Christophe Leroy 提交于
Add call to early_memtest() so that kernel compiled with CONFIG_MEMTEST really perform memtest at startup when requested via 'memtest' boot parameter. Tested-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Zeng Tao 提交于
The new scheme is required just to support legacy low and full-speed devices. For high speed devices, it will slower the enumeration speed. So in this patch we try the "old" enumeration scheme first for high speed devices, and this is what Windows does since Windows 8. Signed-off-by: NZeng Tao <prime.zeng@hisilicon.com> Acked-by: NAlan Stern <stern@rowland.harvard.edu> Reviewed-by: NRoger Quadros <rogerq@ti.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Feng Tang 提交于
The "pciserial" earlyprintk variant helps much on many modern x86 platforms, but unfortunately there are still some platforms with PCI UART devices which have the wrong PCI class code. In that case, the current class code check does not allow for them to be used for logging. Add a sub-option "force" which overrides the class code check and thus the use of such device can be enforced. [ bp: massage formulations. ] Suggested-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NFeng Tang <feng.tang@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Stuart R . Anderson" <stuart.r.anderson@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H Peter Anvin <hpa@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thymo van Beers <thymovanbeers@gmail.com> Cc: alan@linux.intel.com Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/20181002164921.25833-1-feng.tang@intel.com
-
- 02 10月, 2018 1 次提交
-
-
由 Andi Kleen 提交于
Implements counter freezing for Arch Perfmon v4 (Skylake and newer). This allows to speed up the PMI handler by avoiding unnecessary MSR writes and make it more accurate. The Arch Perfmon v4 PMI handler is substantially different than the older PMI handler. Differences to the old handler: - It relies on counter freezing, which eliminates several MSR writes from the PMI handler and lowers the overhead significantly. It makes the PMI handler more accurate, as all counters get frozen atomically as soon as any counter overflows. So there is much less counting of the PMI handler itself. With the freezing we don't need to disable or enable counters or PEBS. Only BTS which does not support auto-freezing still needs to be explicitly managed. - The PMU acking is done at the end, not the beginning. This makes it possible to avoid manual enabling/disabling of the PMU, instead we just rely on the freezing/acking. - The APIC is acked before reenabling the PMU, which avoids problems with LBRs occasionally not getting unfreezed on Skylake. - Looping is only needed to workaround a corner case which several PMIs are very close to each other. For common cases, the counters are freezed during PMI handler. It doesn't need to do re-check. This patch: - Adds code to enable v4 counter freezing - Fork <=v3 and >=v4 PMI handlers into separate functions. - Add kernel parameter to disable counter freezing. It took some time to debug counter freezing, so in case there are new problems we added an option to turn it off. Would not expect this to be used until there are new bugs. - Only for big core. The patch for small core will be posted later separately. Performance: When profiling a kernel build on Kabylake with different perf options, measuring the length of all NMI handlers using the nmi handler trace point: V3 is without counter freezing. V4 is with counter freezing. The value is the average cost of the PMI handler. (lower is better) perf options ` V3(ns) V4(ns) delta -c 100000 1088 894 -18% -g -c 100000 1862 1646 -12% --call-graph lbr -c 100000 3649 3367 -8% --c.g. dwarf -c 100000 2248 1982 -12% Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/1533712328-2834-2-git-send-email-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 01 10月, 2018 1 次提交
-
-
由 Zhen Lei 提交于
Add a generic command line option to enable lazy unmapping via IOVA flush queues, which will initally be suuported by iommu-dma. This echoes the semantics of "intel_iommu=strict" (albeit with the opposite default value), but in the driver-agnostic fashion of "iommu.passthrough". Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com> [rm: move handling out of SMMUv3 driver, clean up documentation] Signed-off-by: NRobin Murphy <robin.murphy@arm.com> [will: dropped broken printk when parsing command-line option] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 21 9月, 2018 1 次提交
-
-
由 Robin Murphy 提交于
Document that the default for "iommu.passthrough" is now configurable. Fixes: 58d11317 ("iommu: Add config option to set passthrough as default") Signed-off-by: NRobin Murphy <robin.murphy@arm.com> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 14 9月, 2018 1 次提交
-
-
Scrubbing pages on initial balloon down can take some time, especially in nested virtualization case (nested EPT is slow). When HVM/PVH guest is started with memory= significantly lower than maxmem=, all the extra pages will be scrubbed before returning to Xen. But since most of them weren't used at all at that point, Xen needs to populate them first (from populate-on-demand pool). In nested virt case (Xen inside KVM) this slows down the guest boot by 15-30s with just 1.5GB needed to be returned to Xen. Add runtime parameter to enable/disable it, to allow initially disabling scrubbing, then enable it back during boot (for example in initramfs). Such usage relies on assumption that a) most pages ballooned out during initial boot weren't used at all, and b) even if they were, very few secrets are in the guest at that time (before any serious userspace kicks in). Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also enabled by default), controlling default value for the new runtime switch. Signed-off-by: NMarek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 02 9月, 2018 1 次提交
-
-
由 Kees Cook 提交于
Instead of forcing a distro or other system builder to choose at build time whether the CPU is trusted for CRNG seeding via CONFIG_RANDOM_TRUST_CPU, provide a boot-time parameter for end users to control the choice. The CONFIG will set the default state instead. Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 31 8月, 2018 3 次提交
-
-
由 Paul E. McKenney 提交于
The jiffies_till_sched_qs value used to determine how old a grace period must be before RCU enlists the help of the scheduler to force a quiescent state on the holdout CPU. Currently, this defaults to HZ/10 regardless of system size and may be set only at boot time. This can be a problem for very large systems, because if the values of the jiffies_till_first_fqs and jiffies_till_next_fqs kernel parameters are left at their defaults, they are calculated to increase as the number of CPUs actually configured on the system increases. Thus, on a sufficiently large system, RCU would enlist the help of the scheduler before the grace-period kthread had a chance to scan for idle CPUs, which wastes CPU time. This commit therefore allows jiffies_till_sched_qs to be set, if desired, but if left as default, computes is as jiffies_till_first_fqs plus twice jiffies_till_next_fqs, thus allowing three force-quiescent-state scans for idle CPUs. This scales with the number of CPUs, providing sensible default values. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Now that the RCU-bh and RCU-sched update-side functions are simple wrappers around their RCU counterparts, there isn't a whole lot of point in testing them. This commit therefore removes the self-test capability and removes the corresponding kernel-boot parameters. It also updates the various rcutorture .boot files to remove the kernel boot parameters that call for testing RCU-bh and RCU-sched. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The RCU-bh update API is now defined in terms of that of RCU-bh and RCU-sched, so this commit updates the documentation accordingly. In addition, although RCU-sched persists in !PREEMPT kernels, in the PREEMPT case its update API is now defined in terms of that of RCU-preempt, so this commit also updates the documentation accordingly. While in the area, this commit removes the documentation for the now-obsolete synchronize_rcu_mult() and clarifies the Tasks RCU documentation. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 23 8月, 2018 1 次提交
-
-
由 Kees Cook 提交于
The Kconfig text for CONFIG_PAGE_POISONING doesn't mention that it has to be enabled explicitly. This updates the documentation for that and adds a note about CONFIG_PAGE_POISONING to the "page_poison" command line docs. While here, change description of CONFIG_PAGE_POISONING_ZERO too, as it's not "random" data, but rather the fixed debugging value that would be used when not zeroing. Additionally removes a stray "bool" in the Kconfig. Link: http://lkml.kernel.org/r/20180725223832.GA43733@beastSigned-off-by: NKees Cook <keescook@chromium.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Laura Abbott <labbott@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 8月, 2018 3 次提交
-
-
由 Logan Gunthorpe 提交于
To support peer-to-peer traffic on a segment of the PCI hierarchy, we must disable the ACS redirect bits for select PCI bridges. The bridges must be selected before the devices are discovered by the kernel and the IOMMU groups created. Therefore, add a kernel command line parameter to specify devices which must have their ACS bits disabled. The new parameter takes a list of devices separated by a semicolon. Each device specified will have its ACS redirect bits disabled. This is similar to the existing 'resource_alignment' parameter. The ACS Request P2P Request Redirect, P2P Completion Redirect and P2P Egress Control bits are disabled, which is sufficient to always allow passing P2P traffic uninterrupted. The bits are set after the kernel (optionally) enables the ACS bits itself. It is also done regardless of whether the kernel or platform firmware sets the bits. If the user tries to disable the ACS redirect for a device without the ACS capability, print a warning to dmesg. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> [bhelgaas: reorder to add the generic code first and move the device-specific quirk to subsequent patches] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NStephen Bates <sbates@raithlin.com> Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Acked-by: NChristian König <christian.koenig@amd.com>
-
由 Logan Gunthorpe 提交于
When specifying PCI devices on the kernel command line using a bus/device/function address, bus numbers can change when adding or replacing a device, changing motherboard firmware, or applying kernel parameters like "pci=assign-buses". When bus numbers change, it's likely the command line tweak will be applied to the wrong device. Therefore, it is useful to be able to specify devices with a base bus number and the path of devfns needed to get to it, similar to the "device scope" structure in the Intel VT-d spec, Section 8.3.1. Thus, we add an option to specify devices in the following format: [<domain>:]<bus>:<device>.<func>[/<device>.<func>]* The path can be any segment within the PCI hierarchy of any length and determined through the use of 'lspci -t'. When specified this way, it is less likely that a renumbered bus will result in a valid device specification and the tweak won't be applied to the wrong device. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> [bhelgaas: use "device" instead of "slot" in documentation since that's the usual language in the PCI specs] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NStephen Bates <sbates@raithlin.com> Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Acked-by: NChristian König <christian.koenig@amd.com>
-
由 Logan Gunthorpe 提交于
Separate out the code to match a PCI device with a string (typically originating from a kernel parameter) from the pci_specified_resource_alignment() function into its own helper function. While we are at it, this change fixes the kernel style of the function (fixing a number of long lines and extra parentheses). Additionally, make the analogous change to the kernel parameter documentation: Separate the description of how to specify a PCI device into its own section at the head of the "pci=" parameter. This patch should have no functional alterations. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> [bhelgaas: use "device" instead of "slot" in documentation since that's the usual language in the PCI specs] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NStephen Bates <sbates@raithlin.com> Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Acked-by: NChristian König <christian.koenig@amd.com>
-
- 07 8月, 2018 1 次提交
-
-
由 Diana Craciun 提交于
Currently only supported on powerpc. Signed-off-by: NDiana Craciun <diana.craciun@nxp.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 7月, 2018 1 次提交
-
-
由 Olof Johansson 提交于
This allows the default behavior to be controlled by a kernel config option instead of changing the commandline for the kernel to include "iommu.passthrough=on" or "iommu=pt" on machines where this is desired. Likewise, for machines where this config option is enabled, it can be disabled at boot time with "iommu.passthrough=off" or "iommu=nopt". Also corrected iommu=pt documentation for IA-64, since it has no code that parses iommu= at all. Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
- 20 7月, 2018 1 次提交
-
-
由 Pavel Tatashin 提交于
Currently, the notsc kernel parameter disables the use of the TSC by sched_clock(). However, this parameter does not prevent the kernel from accessing tsc in other places. The only rationale to boot with notsc is to avoid timing discrepancies on multi-socket systems where TSC are not properly synchronized, and thus exclude TSC from being used for time keeping. But that prevents using TSC as sched_clock() as well, which is not necessary as the core sched_clock() implementation can handle non synchronized TSC based sched clocks just fine. However, there is another method to solve the above problem: booting with tsc=unstable parameter. This parameter allows sched_clock() to use TSC and just excludes it from timekeeping. So there is no real reason to keep notsc, but for compatibility reasons the parameter has to stay. Make it behave like 'tsc=unstable' instead. [ tglx: Massaged changelog ] Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDou Liyang <douly.fnst@cn.fujitsu.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: peterz@infradead.org Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-12-pasha.tatashin@oracle.com
-
- 18 7月, 2018 1 次提交
-
-
由 Tobin C. Harding 提交于
Currently printing [hashed] pointers requires enough entropy to be available. Early in the boot sequence this may not be the case resulting in a dummy string '(____ptrval____)' being printed. This makes debugging the early boot sequence difficult. We can relax the requirement to use cryptographically secure hashing during debugging. This enables debugging while keeping development/production kernel behaviour the same. If new command line option debug_boot_weak_hash is enabled use cryptographically insecure hashing and hash pointer value immediately. Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NTobin C. Harding <me@tobin.cc> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 13 7月, 2018 2 次提交
-
-
由 Jiri Kosina 提交于
Introduce the 'l1tf=' kernel command line option to allow for boot-time switching of mitigation that is used on processors affected by L1TF. The possible values are: full Provides all available mitigations for the L1TF vulnerability. Disables SMT and enables all mitigations in the hypervisors. SMT control via /sys/devices/system/cpu/smt/control is still possible after boot. Hypervisors will issue a warning when the first VM is started in a potentially insecure configuration, i.e. SMT enabled or L1D flush disabled. full,force Same as 'full', but disables SMT control. Implies the 'nosmt=force' command line option. sysfs control of SMT and the hypervisor flush control is disabled. flush Leaves SMT enabled and enables the conditional hypervisor mitigation. Hypervisors will issue a warning when the first VM is started in a potentially insecure configuration, i.e. SMT enabled or L1D flush disabled. flush,nosmt Disables SMT and enables the conditional hypervisor mitigation. SMT control via /sys/devices/system/cpu/smt/control is still possible after boot. If SMT is reenabled or flushing disabled at runtime hypervisors will issue a warning. flush,nowarn Same as 'flush', but hypervisors will not warn when a VM is started in a potentially insecure configuration. off Disables hypervisor mitigations and doesn't emit any warnings. Default is 'flush'. Let KVM adhere to these semantics, which means: - 'lt1f=full,force' : Performe L1D flushes. No runtime control possible. - 'l1tf=full' - 'l1tf-flush' - 'l1tf=flush,nosmt' : Perform L1D flushes and warn on VM start if SMT has been runtime enabled or L1D flushing has been run-time enabled - 'l1tf=flush,nowarn' : Perform L1D flushes and no warnings are emitted. - 'l1tf=off' : L1D flushes are not performed and no warnings are emitted. KVM can always override the L1D flushing behavior using its 'vmentry_l1d_flush' module parameter except when lt1f=full,force is set. This makes KVM's private 'nosmt' option redundant, and as it is a bit non-systematic anyway (this is something to control globally, not on hypervisor level), remove that option. Add the missing Documentation entry for the l1tf vulnerability sysfs file while at it. Signed-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NJiri Kosina <jkosina@suse.cz> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142323.202758176@linutronix.de
-
由 Paul E. McKenney 提交于
Some RCU bugs have been sensitive to the frequency of CPU-hotplug operations, which have been gradually increased over time. But this frequency is now at the one-second lower limit that can be specified using the rcutorture.onoff_interval kernel parameter. This commit therefore changes the units of rcutorture.onoff_interval from seconds to jiffies, and also sets the value specified for this kernel parameter in the TREE03 rcutorture scenario to 200, which is 200 milliseconds for HZ=1000. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 11 7月, 2018 2 次提交
-
-
由 Michael Ellerman 提交于
Document the support for spec_store_bypass_disable that was added for powerpc in commit a048a07d ("powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit"). Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Reviewed-by: NKees Cook <keescook@chromium.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Laurentiu Tudor 提交于
This parameter introduced several years ago in the XHCI host controller driver was somehow left undocumented. Add a few lines in the kernel parameters text. Signed-off-by: NLaurentiu Tudor <laurentiu.tudor@nxp.com> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 10 7月, 2018 1 次提交
-
-
由 Rob Herring 提交于
Deferred probe will currently wait forever on dependent devices to probe, but sometimes a driver will never exist. It's also not always critical for a driver to exist. Platforms can rely on default configuration from the bootloader or reset defaults for things such as pinctrl and power domains. This is often the case with initial platform support until various drivers get enabled. There's at least 2 scenarios where deferred probe can render a platform broken. Both involve using a DT which has more devices and dependencies than the kernel supports. The 1st case is a driver may be disabled in the kernel config. The 2nd case is the kernel version may simply not have the dependent driver. This can happen if using a newer DT (provided by firmware perhaps) with a stable kernel version. Deferred probe issues can be difficult to debug especially if the console has dependencies or userspace fails to boot to a shell. There are also cases like IOMMUs where only built-in drivers are supported, so deferring probe after initcalls is not needed. The IOMMU subsystem implemented its own mechanism to handle this using OF_DECLARE linker sections. This commit adds makes ending deferred probe conditional on initcalls being completed or a debug timeout. Subsystems or drivers may opt-in by calling driver_deferred_probe_check_init_done() instead of unconditionally returning -EPROBE_DEFER. They may use additional information from DT or kernel's config to decide whether to continue to defer probe or not. The timeout mechanism is intended for debug purposes and WARNs loudly. The remaining deferred probe pending list will also be dumped after the timeout. Not that this timeout won't work for the console which needs to be enabled before userspace starts. However, if the console's dependencies are resolved, then the kernel log will be printed (as opposed to no output). Cc: Alexander Graf <agraf@suse.de> Signed-off-by: NRob Herring <robh@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 06 7月, 2018 1 次提交
-
-
由 Laurentiu Tudor 提交于
This parameter introduced several years ago in the XHCI host controller driver was somehow left undocumented. Add a few lines in the kernel parameters text. Signed-off-by: NLaurentiu Tudor <laurentiu.tudor@nxp.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 05 7月, 2018 2 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
Add a mitigation mode parameter "vmentry_l1d_flush" for CVE-2018-3620, aka L1 terminal fault. The valid arguments are: - "always" L1D cache flush on every VMENTER. - "cond" Conditional L1D cache flush, explained below - "never" Disable the L1D cache flush mitigation "cond" is trying to avoid L1D cache flushes on VMENTER if the code executed between VMEXIT and VMENTER is considered safe, i.e. is not bringing any interesting information into L1D which might exploited. [ tglx: Split out from a larger patch ] Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Konrad Rzeszutek Wilk 提交于
If the L1TF CPU bug is present we allow the KVM module to be loaded as the major of users that use Linux and KVM have trusted guests and do not want a broken setup. Cloud vendors are the ones that are uncomfortable with CVE 2018-3620 and as such they are the ones that should set nosmt to one. Setting 'nosmt' means that the system administrator also needs to disable SMT (Hyper-threading) in the BIOS, or via the 'nosmt' command line parameter, or via the /sys/devices/system/cpu/smt/control. See commit 05736e4a ("cpu/hotplug: Provide knobs to control SMT"). Other mitigations are to use task affinity, cpu sets, interrupt binding, etc - anything to make sure that _only_ the same guests vCPUs are running on sibling threads. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 04 7月, 2018 1 次提交
-
-
由 Chris von Recklinghausen 提交于
Enabling HARDENED_USERCOPY may cause measurable regressions in networking performance: up to 8% under UDP flood. I ran a small packet UDP flood using pktgen vs. a host b2b connected. On the receiver side the UDP packets are processed by a simple user space process that just reads and drops them: https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c Not very useful from a functional PoV, but it helps to pin-point bottlenecks in the networking stack. When running a kernel with CONFIG_HARDENED_USERCOPY=y, I see a 5-8% regression in the receive tput, compared to the same kernel without this option enabled. With CONFIG_HARDENED_USERCOPY=y, perf shows ~6% of CPU time spent cumulatively in __check_object_size (~4%) and __virt_addr_valid (~2%). The call-chain is: __GI___libc_recvfrom entry_SYSCALL_64_after_hwframe do_syscall_64 __x64_sys_recvfrom __sys_recvfrom inet_recvmsg udp_recvmsg __check_object_size udp_recvmsg() actually calls copy_to_iter() (inlined) and the latters calls check_copy_size() (again, inlined). A generic distro may want to enable HARDENED_USERCOPY in their default kernel config, but at the same time, such distro may want to be able to avoid the performance penalties in with the default configuration and disable the stricter check on a per-boot basis. This change adds a boot parameter that conditionally disables HARDENED_USERCOPY via "hardened_usercopy=off". Signed-off-by: NChris von Recklinghausen <crecklin@redhat.com> Signed-off-by: NKees Cook <keescook@chromium.org>
-
- 02 7月, 2018 1 次提交
-
-
由 Thomas Gleixner 提交于
Dave Hansen reported, that it's outright dangerous to keep SMT siblings disabled completely so they are stuck in the BIOS and wait for SIPI. The reason is that Machine Check Exceptions are broadcasted to siblings and the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or reboots the machine. The MCE chapter in the SDM contains the following blurb: Because the logical processors within a physical package are tightly coupled with respect to shared hardware resources, both logical processors are notified of machine check errors that occur within a given physical processor. If machine-check exceptions are enabled when a fatal error is reported, all the logical processors within a physical package are dispatched to the machine-check exception handler. If machine-check exceptions are disabled, the logical processors enter the shutdown state and assert the IERR# signal. When enabling machine-check exceptions, the MCE flag in control register CR4 should be set for each logical processor. Reverting the commit which ignores siblings at enumeration time solves only half of the problem. The core cpuhotplug logic needs to be adjusted as well. This thoughtful engineered mechanism also turns the boot process on all Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU before the secondary CPUs are brought up. Depending on the number of physical cores the window in which this situation can happen is smaller or larger. On a HSW-EX it's about 750ms: MCE is enabled on the boot CPU: [ 0.244017] mce: CPU supports 22 MCE banks The corresponding sibling #72 boots: [ 1.008005] .... node #0, CPUs: #72 That means if an MCE hits on physical core 0 (logical CPUs 0 and 72) between these two points the machine is going to shutdown. At least it's a known safe state. It's obvious that the early boot can be hit by an MCE as well and then runs into the same situation because MCEs are not yet enabled on the boot CPU. But after enabling them on the boot CPU, it does not make any sense to prevent the kernel from recovering. Adjust the nosmt kernel parameter documentation as well. Reverts: 2207def7 ("x86/apic: Ignore secondary threads if nosmt=force") Reported-by: NDave Hansen <dave.hansen@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NTony Luck <tony.luck@intel.com>
-
- 30 6月, 2018 1 次提交
-
-
由 Sinan Kaya 提交于
Move early dump functionality into common code so that it is available for all architectures. No need to carry arch-specific reads around as the read hooks are already initialized by the time pci_setup_device() is getting called during scan. Tested-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: NSinan Kaya <okaya@codeaurora.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
-
- 21 6月, 2018 1 次提交
-
-
由 Thomas Gleixner 提交于
Provide a command line and a sysfs knob to control SMT. The command line options are: 'nosmt': Enumerate secondary threads, but do not online them 'nosmt=force': Ignore secondary threads completely during enumeration via MP table and ACPI/MADT. The sysfs control file has the following states (read/write): 'on': SMT is enabled. Secondary threads can be freely onlined 'off': SMT is disabled. Secondary threads, even if enumerated cannot be onlined 'forceoff': SMT is permanentely disabled. Writes to the control file are rejected. 'notsupported': SMT is not supported by the CPU The command line option 'nosmt' sets the sysfs control to 'off'. This can be changed to 'on' to reenable SMT during runtime. The command line option 'nosmt=force' sets the sysfs control to 'forceoff'. This cannot be changed during runtime. When SMT is 'on' and the control file is changed to 'off' then all online secondary threads are offlined and attempts to online a secondary thread later on are rejected. When SMT is 'off' and the control file is changed to 'on' then secondary threads can be onlined again. The 'off' -> 'on' transition does not automatically online the secondary threads. When the control file is set to 'forceoff', the behaviour is the same as setting it to 'off', but the operation is irreversible and later writes to the control file are rejected. When the control status is 'notsupported' then writes to the control file are rejected. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: NIngo Molnar <mingo@kernel.org>
-
- 16 6月, 2018 2 次提交
-
-
由 Mauro Carvalho Chehab 提交于
The alsa parameters file was renamed to alsa-configuration.rst. With regards to OSS, it got retired as a hole by at changeset 727dede0 ("sound: Retire OSS"). So, it doesn't make sense to keep mentioning it at kernel-parameters.txt. Fixes: 727dede0 ("sound: Retire OSS") Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: NJonathan Corbet <corbet@lwn.net>
-
由 Mauro Carvalho Chehab 提交于
As we move stuff around, some doc references are broken. Fix some of them via this script: ./scripts/documentation-file-ref-check --fix Manually checked if the produced result is valid, removing a few false-positives. Acked-by: NTakashi Iwai <tiwai@suse.de> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Acked-by: NStephen Boyd <sboyd@kernel.org> Acked-by: NCharles Keepax <ckeepax@opensource.wolfsonmicro.com> Acked-by: NMathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: NColy Li <colyli@suse.de> Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: NJonathan Corbet <corbet@lwn.net>
-
- 01 6月, 2018 1 次提交
-
-
由 Marc Zyngier 提交于
On a system where the firmware implements ARCH_WORKAROUND_2, it may be useful to either permanently enable or disable the workaround for cases where the user decides that they'd rather not get a trap overhead, and keep the mitigation permanently on or off instead of switching it on exception entry/exit. In any case, default to the mitigation being enabled. Reviewed-by: NJulien Grall <julien.grall@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 29 5月, 2018 1 次提交
-
-
由 Omar Sandoval 提交于
This parameter has been around since commit e162b39a ("softlockup: decouple hung tasks check from softlockup detection") in 2009 but was never documented. Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 28 5月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Limiting the dma mask to avoid PCI (pre-PCIe) DAC cycles while paying the huge overhead of an IOMMU is rather pointless, and this seriously gets in the way of dma mapping work. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-