- 22 January 2020, 1 commit
-
-
Submitted by Ming Lei

The affinity of managed interrupts is completely handled in the kernel and cannot be changed via the /proc/irq/* interfaces from user space. As the kernel tries to spread out interrupts evenly across CPUs on x86 to prevent vector exhaustion, it can happen that a managed interrupt whose affinity mask contains both isolated and housekeeping CPUs is routed to an isolated CPU. As a consequence, IO submitted on a housekeeping CPU causes interrupts on the isolated CPU.

Add a new sub-parameter 'managed_irq' for 'isolcpus' and the corresponding logic in the interrupt affinity selection code. The sub-parameter indicates to the interrupt affinity selection logic that it should try to avoid the above scenario. This isolation is best effort and only effective if the automatically assigned interrupt mask of a device queue contains isolated and housekeeping CPUs. If housekeeping CPUs are online then such interrupts are directed to the housekeeping CPU so that IO submitted on the housekeeping CPU cannot disturb the isolated CPU.

If a queue's affinity mask contains only isolated CPUs then this parameter has no effect on the interrupt routing decision, though interrupts only happen when tasks running on those isolated CPUs submit IO. IO submitted on housekeeping CPUs has no influence on those queues.

If the affinity mask contains both housekeeping and isolated CPUs, but none of the contained housekeeping CPUs is online, then the interrupt is also routed to an isolated CPU. Interrupts are only delivered when one of the isolated CPUs in the affinity mask submits IO. If one of the contained housekeeping CPUs comes online, the CPU hotplug logic migrates the interrupt automatically back to the upcoming housekeeping CPU. Depending on the type of interrupt controller, this can require that at least one interrupt is delivered to the isolated CPU in order to complete the migration.

[ tglx: Removed unused parameter, added and edited comments/documentation and rephrased the changelog so it contains more details. ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200120091625.17912-1-ming.lei@redhat.com
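As an illustration of how the new sub-parameter combines with an existing isolcpus setup (the flag list and CPU range below are a made-up example, not taken from the patch):

  isolcpus=domain,managed_irq,2-7

keeps CPUs 2-7 isolated and asks the kernel to steer managed device-queue interrupts whose affinity masks also cover housekeeping CPUs away from them.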
-
- 23 November 2019, 1 commit
-
-
Submitted by Masami Hiramatsu

Remove the bootmem_debug kernel parameter because it has been replaced by memblock=debug.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/157443061745.20995.9432492850513217966.stgit@devnote2
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
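The replacement is a straight swap on the kernel command line; the old bootmem_debug option is simply replaced by:

  memblock=debug

which enables the early memory-allocator debug output that bootmem_debug used to provide.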
-
- 19 November 2019, 1 commit
-
-
Submitted by Yunfeng Ye

The commit 0f27cff8 ("ACPI: sysfs: Make ACPI GPE mask kernel parameter cover all GPEs") says: "Use a bitmap of size 0xFF instead of a u64 for the GPE mask so 256 GPEs can be masked". However, masking GPE 0xFF is not supported, and the check condition "gpe > ACPI_MASKABLE_GPE_MAX" is never true because the type of gpe is u8. So modify the macro ACPI_MASKABLE_GPE_MAX to 0x100, and drop the "gpe > ACPI_MASKABLE_GPE_MAX" check. In addition, update the documented "Format" for the acpi_mask_gpe parameter.

Fixes: 0f27cff8 ("ACPI: sysfs: Make ACPI GPE mask kernel parameter cover all GPEs")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
[ rjw: Use u16 as gpe data type in acpi_gpe_apply_masked_gpes() ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
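For reference, a typical use of the parameter masks a single misbehaving GPE by number; the value below is purely illustrative:

  acpi_mask_gpe=0x6e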
-
- 18 November 2019, 1 commit
-
-
Submitted by Oliver Neukum

Document which quirk flags apply to usb-storage, which to UAS, and which to both.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191114112758.32747-4-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
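The flags in question are the per-device quirk letters passed on the command line; a hedged example (the vendor and product IDs are made up):

  usb-storage.quirks=0x1234:0x5678:u

where, if memory of the flag table is right, the 'u' flag tells the kernel to ignore the UAS driver for that device.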
-
- 16 November 2019, 1 commit
-
-
Submitted by Waiman Long

For MDS-vulnerable processors with TSX support, enabling either the MDS or the TAA mitigation will enable the use of VERW to flush internal processor buffers at the right code path. IOW, they are either both mitigated or both not. However, if the command line options are inconsistent, the vulnerabilities sysfs files may not report the mitigation status correctly.

For example, with only the "mds=off" option:

  vulnerabilities/mds:Vulnerable; SMT vulnerable
  vulnerabilities/tsx_async_abort:Mitigation: Clear CPU buffers; SMT vulnerable

The mds vulnerabilities file has the wrong status in this case. Similarly, the taa vulnerability file will be wrong with mds mitigation on, but taa off.

Change taa_select_mitigation() to sync up the two mitigation statuses and have them turned off only if both "mds=off" and "tsx_async_abort=off" are present. Update the documentation to emphasize the fact that both "mds=off" and "tsx_async_abort=off" have to be specified together to be effective for processors that are affected by both TAA and MDS.

[ bp: Massage and add kernel-parameters.txt change too. ]

Fixes: 1b42f017 ("x86/speculation/taa: Add mitigation for TSX Async Abort")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: linux-doc@vger.kernel.org
Cc: Mark Gross <mgross@linux.intel.com>
Cc: <stable@vger.kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191115161445.30809-2-longman@redhat.com
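In other words, on a part affected by both issues, actually turning the VERW-based buffer clearing off requires both options at once (a command-line example, not taken from the patch):

  mds=off tsx_async_abort=off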
-
- 13 November 2019, 1 commit
-
-
Submitted by Jonathan Corbet

This reverts commit 7f70ae56. Christoph H. notes that the information is redundant, and Paul W. agrees with reverting.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-
- 07 November 2019, 2 commits
-
-
Submitted by Dan Williams

Given that EFI_MEMORY_SP is a platform BIOS policy decision for marking memory ranges as "reserved for a specific purpose", there will inevitably be scenarios where the BIOS omits the attribute in situations where it is desired. Unlike other attributes, if the OS wants to reserve this memory from the kernel, the reservation needs to happen early in init. So early, in fact, that it needs to happen before e820__memblock_setup(), which is a pre-requisite for efi_fake_memmap() that wants to allocate memory for the updated table.

Introduce an x86-specific efi_fake_memmap_early() that can search for attempts to set EFI_MEMORY_SP via efi_fake_mem and update the e820 table accordingly. The KASLR code that scans the command line looking for user-directed memory reservations also needs to be updated to consider "efi_fake_mem=nn@ss:0x40000" requests.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
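A concrete instance of the quoted syntax might look like the following (the range is arbitrary; 0x40000 is the EFI_MEMORY_SP attribute value quoted above):

  efi_fake_mem=4G@0x100000000:0x40000

which asks the kernel to treat 4G of memory starting at 0x100000000 as if firmware had flagged it as specific-purpose memory.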
-
Submitted by Dan Williams

UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the interpretation of the EFI Memory Types as "reserved for a specific purpose". The proposed Linux behavior for specific purpose memory is that it is reserved for direct-access (device-dax) by default and not available for any kernel usage, not even as an OOM fallback. Later, through udev scripts or another init mechanism, these device-dax claimed ranges can be reconfigured and hot-added to the available System-RAM with a unique node identifier. This device-dax management scheme implements the "soft" in the "soft reserved" designation by allowing some or all of the reservation to be recovered as typical memory. This policy can be disabled at compile time with CONFIG_EFI_SOFT_RESERVE=n, or at runtime with efi=nosoftreserve.

As for this patch, define the common helpers to determine if the EFI_MEMORY_SP attribute should be honored. The determination needs to be made early to prevent the kernel from being loaded into soft-reserved memory, or otherwise allowing early allocations to land there. Follow-on changes are needed per architecture to leverage these helpers in their respective mem-init paths.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
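The runtime opt-out mentioned above is just another boot parameter:

  efi=nosoftreserve

which makes the kernel ignore EFI_MEMORY_SP and treat such ranges as ordinary usable memory.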
-
- 05 November 2019, 1 commit
-
-
Submitted by Junaid Shahid

The page table pages corresponding to broken-down large pages are zapped in FIFO order, so that the large page can potentially be recovered if it is no longer being used for execution. This removes the performance penalty for walking deeper EPT page tables. By default, one large page will last about one hour once the guest reaches a steady state.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 04 November 2019, 1 commit
-
-
Submitted by Paolo Bonzini

With some Intel processors, putting the same virtual address in the TLB as both a 4 KiB and 2 MiB page can confuse the instruction fetch unit and cause the processor to issue a machine check resulting in a CPU lockup. Unfortunately, when EPT page tables use huge pages, it is possible for a malicious guest to cause this situation.

Add a knob to mark huge pages as non-executable. When the nx_huge_pages parameter is enabled (and we are using EPT), all huge pages are marked as NX. If the guest attempts to execute in one of those pages, the page is broken down into 4K pages, which are then marked executable. This is not an issue for shadow paging (except nested EPT), because then the host is in control of TLB flushes and the problematic situation cannot happen. With nested EPT the nested guest can again cause problems, so shadow and direct EPT are treated in the same way.

[ tglx: Fixup default to auto and massage wording a bit ]

Originally-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
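From the boot command line the knob is exposed as a kvm module parameter; a sketch of the possible settings, assuming the usual auto/off/force spelling (auto being the default per the tglx note above):

  kvm.nx_huge_pages=auto   (apply the NX workaround only where needed)
  kvm.nx_huge_pages=off    (never mark huge pages NX)
  kvm.nx_huge_pages=force  (always mark huge pages NX)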
-
- 28 October 2019, 3 commits
-
-
Submitted by Pawan Gupta

Add the documentation for TSX Async Abort. Include a description of the issue, how to check the mitigation state, how to control the mitigation, and guidance for system administrators.

[ bp: Add proper SPDX tags, touch ups by Josh and me. ]

Co-developed-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
-
Submitted by Pawan Gupta

Platforms which are not affected by X86_BUG_TAA may want the TSX feature enabled. Add an "auto" option to the TSX cmdline parameter: with tsx=auto, TSX is disabled when X86_BUG_TAA is present and enabled otherwise. More details on X86_BUG_TAA can be found here: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html

[ bp: Extend the arg buffer to accommodate "auto\0". ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
-
Submitted by Pawan Gupta

Add a kernel cmdline parameter "tsx" to control the Transactional Synchronization Extensions (TSX) feature. On CPUs that support TSX control, use "tsx=on|off" to enable or disable TSX. Not specifying this option is equivalent to "tsx=off". This is because on certain processors TSX may be used as a part of a speculative side channel attack.

Carve out the TSX controlling functionality into a separate compilation unit because TSX is a CPU feature while the TSX async abort control machinery will go to cpu/bugs.c.

[ bp: - Massage, shorten and clear the arg buffer.
      - Clarifications of the tsx= possible options - Josh.
      - Expand on TSX_CTRL availability - Pawan. ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
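Putting the two TSX patches above together, the accepted spellings are (illustrative command lines):

  tsx=on    (keep TSX enabled)
  tsx=off   (disable TSX; also the behavior when the option is omitted)
  tsx=auto  (disable TSX only on CPUs affected by X86_BUG_TAA)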
-
- 26 October 2019, 1 commit
-
-
Submitted by Olof Johansson

Prior to eed85ff4 ("PCI/DPC: Enable DPC only if AER is available"), Linux handled DPC events regardless of whether firmware had granted it ownership of AER or DPC, e.g., via _OSC. PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to control of AER, so after eed85ff4, Linux handles DPC events only if it has control of AER. On platforms that do not grant OS control of AER via _OSC, Linux DPC handling worked before eed85ff4 but not after. To make Linux DPC handling work on those platforms the same way it did before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux handle DPC events regardless of whether it has control of AER.

[bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/]

Link: https://lore.kernel.org/r/20191023192205.97024-1-olof@lixom.net
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
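Usage is exactly as quoted; on an affected platform one would boot with:

  pcie_ports=dpc-native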
-
- 23 October 2019, 1 commit
-
-
Submitted by Nicholas Johnson

The existing "pci=hpmemsize=nn[KMG]" kernel parameter overrides the default size of both the non-prefetchable and the prefetchable MMIO windows for hotplug bridges. Add "pci=hpmmiosize=nn[KMG]" to override the default size of only the non-prefetchable MMIO window, and "pci=hpmmioprefsize=nn[KMG]" to override the default size of only the prefetchable MMIO window.

Link: https://lore.kernel.org/r/SL2P216MB0187E4D0055791957B7E2660806B0@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM
Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
- 22 October 2019, 1 commit
-
-
Submitted by Steven Price

Enable paravirtualization features when running under a hypervisor supporting the PV_TIME_ST hypercall. For each (v)CPU, we ask the hypervisor for the location of a shared page which the hypervisor will use to report stolen time to us. We set pv_time_ops to the stolen time function, which simply reads the stolen value from the shared page for a VCPU. We guarantee single-copy atomicity using READ_ONCE, which means we can also read the stolen time for another VCPU than the currently running one while it is potentially being updated by the hypervisor.

Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
-
- 16 October 2019, 1 commit
-
-
Submitted by Stefan-Gabriel Mirea

For consistency reasons, spell the controller name as "LINFlexD" in comments and documentation.

Signed-off-by: Stefan-Gabriel Mirea <stefan-gabriel.mirea@nxp.com>
Link: https://lore.kernel.org/r/1571230107-8493-4-git-send-email-stefan-gabriel.mirea@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 11 October 2019, 1 commit
-
-
Submitted by Paul Walmsley

Kernels booting on RISC-V can specify "earlycon" with no options on the Linux command line, and the generic DT earlycon support will query the "chosen/stdout-path" property (if present) to determine which early console device to use. Document this appropriately in the admin-guide.

Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andreas Schwab <schwab@suse.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
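To sketch the mechanism: the command line carries only the bare option, and the device tree supplies the console. For example, with

  earlycon

on the command line, an illustrative chosen node such as the following (the alias and baud string are made up for this sketch) selects the early console:

  chosen {
          stdout-path = "serial0:115200n8";
  };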
-
- 08 October 2019, 1 commit
-
-
Submitted by Boris Ostrovsky

Currently, execution of panic() continues until Xen's panic notifier (xen_panic_event()) is called, at which point we make a hypercall that never returns. This means that any notifier that is supposed to be called later, as well as a significant part of panic() code (such as pstore writes from kmsg_dump()), is never executed. There is no reason for xen_panic_event() to be this last point in execution, since panic()'s emergency_restart() will call into xen_emergency_restart(), from where we can perform our hypercall. Nevertheless, we will provide a xen_legacy_crash boot option that will preserve the original behavior during crash. This option could be used, for example, if running a kernel dumper (which happens after panic notifiers) is undesirable.

Reported-by: James Dingwall <james@dingwall.me.uk>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
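The opt-out is a bare flag on the command line:

  xen_legacy_crash

which restores the old behavior of hypercalling out of the panic notifier.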
-
- 04 October 2019, 1 commit
-
-
Submitted by Saravana Kannan

Add device links after the devices are created (but before they are probed) by looking at common DT bindings like clocks and interconnects. Automatically adding device links for functional dependencies at the framework level provides the following benefits:

- Optimizes device probe order and avoids the useless work of attempting probes of devices that will not probe successfully (because their suppliers aren't present or haven't probed yet). For example, in a commonly available mobile SoC, registering just one consumer device's driver at an initcall level earlier than the supplier device's driver causes 11 failed probe attempts before the consumer device probes successfully. This was with a kernel with all the drivers statically compiled in. This problem gets a lot worse if all the drivers are loaded as modules without direct symbol dependencies.

- Supplier devices like clock providers, interconnect providers, etc. need to keep the resources they provide active and at a particular state(s) during boot up even if their current set of consumers don't request the resource to be active. This is because the rest of the consumers might not have probed yet, and turning off the resource before all the consumers have probed could lead to a hang or undesired user experience. Some frameworks (e.g. regulator) handle this today by turning off "unused" resources at late_initcall_sync and hoping all the devices have probed by then. This is not a valid assumption for systems with loadable modules. Other frameworks (e.g. clock) just don't handle this due to the lack of a clear signal for when they can turn off resources. This leads to downstream hacks to handle cases like this that can easily be solved in the upstream kernel.

By linking devices before they are probed, we give suppliers a clear count of the number of dependent consumers. Once all of the consumers are active, the suppliers can turn off the unused resources without making assumptions about the number of consumers.

By default we just add device links to track "driver presence" (probe succeeded) of the supplier device. If any other functionality provided by device links is needed, it is left to the consumer/supplier devices to change the link when they probe.

kbuild test robot reported a clang error about a missing const.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20190904211126.47518-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 01 October 2019, 1 commit
-
-
Submitted by Christoph Hellwig

The earlycon option without arguments is supposed to work on all device tree platforms, not just arm64.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-
- 25 September 2019, 1 commit
-
-
Submitted by Vlastimil Babka

The debug_pagealloc functionality is useful to catch buggy page allocator users that cause e.g. use after free or double free. When page inconsistency is detected, debugging is often simpler by knowing the call stack of the process that last allocated and freed the page. When page_owner is also enabled, we record the allocation stack trace, but not the freeing one. This patch therefore adds recording of the freeing process stack trace to the page owner info, if both page_owner and debug_pagealloc are configured and enabled. With only page_owner enabled, this info is not useful for the memory leak debugging use case. dump_page() is adjusted to print the info. An example result of calling __free_pages() twice may look like this (note the page last free stack trace):

  BUG: Bad page state in process bash pfn:13d8f8
  page:ffffc31984f63e00 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
  flags: 0x1affff800000000()
  raw: 01affff800000000 dead000000000100 dead000000000122 0000000000000000
  raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
  page dumped because: nonzero _refcount
  page_owner tracks the page as freed
  page last allocated via order 0, migratetype Unmovable, gfp_mask 0xcc0(GFP_KERNEL)
   prep_new_page+0x143/0x150
   get_page_from_freelist+0x289/0x380
   __alloc_pages_nodemask+0x13c/0x2d0
   khugepaged+0x6e/0xc10
   kthread+0xf9/0x130
   ret_from_fork+0x3a/0x50
  page last free stack trace:
   free_pcp_prepare+0x134/0x1e0
   free_unref_page+0x18/0x90
   khugepaged+0x7b/0xc10
   kthread+0xf9/0x130
   ret_from_fork+0x3a/0x50
  Modules linked in:
  CPU: 3 PID: 271 Comm: bash Not tainted 5.3.0-rc4-2.g07a1a73-default+ #57
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x85/0xc0
   bad_page.cold+0xba/0xbf
   rmqueue_pcplist.isra.0+0x6c5/0x6d0
   rmqueue+0x2d/0x810
   get_page_from_freelist+0x191/0x380
   __alloc_pages_nodemask+0x13c/0x2d0
   __get_free_pages+0xd/0x30
   __pud_alloc+0x2c/0x110
   copy_page_range+0x4f9/0x630
   dup_mmap+0x362/0x480
   dup_mm+0x68/0x110
   copy_process+0x19e1/0x1b40
   _do_fork+0x73/0x310
   __x64_sys_clone+0x75/0x80
   do_syscall_64+0x6e/0x1e0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7f10af854a10
  ...

Link: http://lkml.kernel.org/r/20190820131828.22684-5-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
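To actually get both pieces of information on a kernel built with the relevant config options, the corresponding boot parameters are:

  debug_pagealloc=on page_owner=on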
-
- 14 September 2019, 1 commit
-
-
Submitted by Palmer Dabbelt

This argument is supported on RISC-V systems and widely used, but was not documented here.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-
- 11 September 2019, 1 commit
-
-
Submitted by Lu Baolu

This adds a helper to check whether a device needs to use the bounce buffer. It also provides a boot time option to disable the bounce buffer. Users can use this to prevent the iommu driver from using the bounce buffer, for performance gain.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Xu Pengfei <pengfei.xu@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
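The commit message does not spell out the option name here; assuming it is the 'nobounce' sub-option of intel_iommu added by this series, disabling the bounce buffer would look like:

  intel_iommu=nobounce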
-
- 05 September 2019, 1 commit
-
-
Submitted by Nicholas Piggin

Introduce two options to control the use of the tlbie instruction: a boot time option which completely disables the kernel's use of the instruction (currently incompatible with HASH MMU, KVM, and coherent accelerators), and a debugfs option which can be switched at runtime and avoids using tlbie for invalidating CPU TLBs for normal process and kernel address mappings. Coherent accelerators are still managed with tlbie, as will be KVM partition scope translations.

Cross-CPU TLB flushing is implemented with IPIs and tlbiel. This is a basic implementation which does not attempt to make any optimisation beyond the tlbie implementation. This is useful for performance testing among other things. For example, in certain situations on large systems, using IPIs may be faster than tlbie as they can be directed rather than broadcast. Later we may also take advantage of the IPIs to do more interesting things such as trim the mm cpumask more aggressively.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190902152931.17840-7-npiggin@gmail.com
-
- 04 September 2019, 1 commit
-
-
Submitted by Stefan-Gabriel Mirea

Introduce support for the LINFlexD driver, based on:
- the version of the Freescale LPUART driver after commit b3e3bf2e ("Merge 4.0-rc7 into tty-next");
- commit abf1e0a9 ("tty: serial: fsl_lpuart: lock port on console write").

In this basic version, the driver can be tested using initramfs and relies on the clocks and pin muxing set up by U-Boot.

Remarks concerning the earlycon support:
- LinFlexD does not allow character transmissions in the INIT mode (see section 47.4.2.1 in the reference manual [1]). Therefore, a mutual exclusion between the first linflex_setup_watermark/linflex_set_termios executions and linflex_earlycon_putchar was employed, and the characters normally sent to earlycon during initialization are kept in a buffer and sent afterwards.
- Empirically, character transmission is also forbidden within the last 1-2 ms before entering the INIT mode, so we use an explicit timeout (PREINIT_DELAY) between linflex_earlycon_putchar and the first call to linflex_setup_watermark.
- U-Boot currently uses the UART FIFO mode, while this driver makes the transition to the buffer mode. Therefore, the earlycon putchar function matches the U-Boot behavior before initialization and the Linux behavior after.

[1] https://www.nxp.com/webapp/Download?colCode=S32V234RM

Signed-off-by: Stoica Cosmin-Stefan <cosmin.stoica@nxp.com>
Signed-off-by: Adrian.Nitu <adrian.nitu@freescale.com>
Signed-off-by: Larisa Grigore <Larisa.Grigore@nxp.com>
Signed-off-by: Ana Nedelcu <B56683@freescale.com>
Signed-off-by: Mihaela Martinas <Mihaela.Martinas@freescale.com>
Signed-off-by: Matthew Nunez <matthew.nunez@nxp.com>
[stefan-gabriel.mirea@nxp.com: Reduced for upstreaming and implemented earlycon support]
Signed-off-by: Stefan-Gabriel Mirea <stefan-gabriel.mirea@nxp.com>
Link: https://lore.kernel.org/r/20190809112853.15846-6-stefan-gabriel.mirea@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 03 September 2019, 1 commit
-
-
Submitted by Marcos Paulo de Souza

Since the inclusion of blk-mq, the elevator argument has not been considered anymore, and its utility died along with the legacy IO path, now removed too.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>

Fold with doc removal patch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
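With the boot-time elevator= parameter gone, the I/O scheduler is selected per block device through sysfs instead; a hedged example (the device name is illustrative):

  echo mq-deadline > /sys/block/sda/queue/scheduler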
-
- 30 August 2019, 1 commit
-
-
Submitted by Ram Pai

Make the Enter-Secure-Mode (ESM) ultravisor call to switch the VM to secure mode. Pass the kernel base address and FDT address so that the Ultravisor is able to verify the integrity of the VM using information from the ESM blob. Add an "svm=" command line option to turn on switching to secure mode.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
[ andmike: Generate an RTAS os-term hcall when the ESM ucall fails. ]
Signed-off-by: Michael Anderson <andmike@linux.ibm.com>
[ bauerman: Cleaned up the code a bit. ]
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820021326.6884-5-bauerman@linux.ibm.com
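Assuming the usual on/off spelling for such switches, a guest that should request secure mode would boot with:

  svm=on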
-
- 28 August 2019, 1 commit
-
-
Submitted by Greg Kroah-Hartman

This reverts commit 690ff788. Based on a lot of email and in-person discussions, this patch series is being reworked to address a number of issues that were pointed out and that need to be taken care of before it should be merged. It will be resubmitted with those changes, hopefully soon.

Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Saravana Kannan <saravanak@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 23 August 2019, 1 commit
-
-
Submitted by Joerg Roedel

This kernel parameter now also takes effect on X86.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 22 August 2019, 1 commit
-
-
Submitted by Gustavo Romero

Document all options currently supported by the xmon debugger.

Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190814205638.25322-1-gromero@linux.ibm.com
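For context, the most common invocations of the parameter being documented are, assuming the usual spellings (the authoritative list is the one added to kernel-parameters.txt):

  xmon=on     (enter xmon on fatal errors)
  xmon=early  (enter xmon as early as possible during boot)
  xmon=off    (keep xmon disabled)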
-
- 20 August 2019, 2 commits
-
-
Submitted by Matthew Garrett

While existing LSMs can be extended to handle lockdown policy, distributions generally want to be able to apply a straightforward static policy. This patch adds a simple LSM that can be configured to reject either integrity or all lockdown queries, and can be configured at runtime (through securityfs), at boot time (via a kernel parameter) or at build time (via a kconfig option). Based on initial code by David Howells.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
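The boot-time form of that policy is the lockdown= parameter, whose two levels correspond to the two rejection modes described above:

  lockdown=integrity
  lockdown=confidentiality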
-
Submitted by Tom Lendacky

There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly.

RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND support using CPUID, including the kernel, will believe that RDRAND is not supported. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue, or that the system will not be placed in suspend, the "rdrand=force" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit.

Additionally, update the suspend and resume path to save and restore the MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in place after resuming from suspend. Note that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
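On a system known not to be affected, keeping RDRAND advertised is simply a matter of booting with:

  rdrand=force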
-
- 17 August 2019, 1 commit
-
-
Submitted by Christoph Hellwig

The aim of this machvec is to support devices with < 32-bit DMA masks. But given that ia64 only has a ZONE_DMA32 and not a ZONE_DMA, that isn't supported by swiotlb either.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lkml.kernel.org/r/20190813072514.23299-21-hch@lst.de
Signed-off-by: Tony Luck <tony.luck@intel.com>
-
- 14 August 2019, 1 commit
-
-
Submitted by Paul E. McKenney

This commit changes the name of the rcu_nocb_leader_stride kernel boot parameter to rcu_nocb_gp_stride in order to account for the new distinction between callback and grace-period no-CBs kthreads.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
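Assuming the parameter keeps the rcutree. prefix used by the other no-CBs knobs, the renamed option would be passed as, for example:

  rcutree.rcu_nocb_gp_stride=8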
-
- 09 August 2019, 1 commit
-
-
Submitted by Alexey Kardashevskiy

The "pci=resource_alignment" parameter is described as requiring an order (not a size), and the code in pci_specified_resource_alignment() expects an order. But the example wrongly shows a size. Convert the example to an order.

Fixes: 8b078c60 ("PCI: Update "pci=resource_alignment" documentation")
Link: https://lore.kernel.org/r/20190606032557.107542-1-aik@ozlabs.ru
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 02 August 2019, 1 commit
-
-
Submitted by Paul E. McKenney

This commit adds a rcu_cpu_stall_ftrace_dump kernel boot parameter that, when set, causes the trace buffer to be dumped after an RCU CPU stall warning is printed. This kernel boot parameter is disabled by default, maintaining compatibility with previous behavior.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
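Assuming it sits under the rcupdate. prefix like the other stall-warning parameters, enabling the dump would look like:

  rcupdate.rcu_cpu_stall_ftrace_dump=1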
-
- 01 August 2019, 1 commit
-
-
Submitted by Saravana Kannan

Add device-links after the devices are created (but before they are probed) by looking at common DT bindings like clocks and interconnects. Automatically adding device-links for functional dependencies at the framework level provides the following benefits:

- Optimizes device probe order and avoids the useless work of attempting probes of devices that will not probe successfully (because their suppliers aren't present or haven't probed yet). For example, in a commonly available mobile SoC, registering just one consumer device's driver at an initcall level earlier than the supplier device's driver causes 11 failed probe attempts before the consumer device probes successfully. This was with a kernel with all the drivers statically compiled in. This problem gets a lot worse if all the drivers are loaded as modules without direct symbol dependencies.

- Supplier devices like clock providers, interconnect providers, etc. need to keep the resources they provide active and at a particular state(s) during boot up even if their current set of consumers don't request the resource to be active. This is because the rest of the consumers might not have probed yet, and turning off the resource before all the consumers have probed could lead to a hang or undesired user experience. Some frameworks (e.g. regulator) handle this today by turning off "unused" resources at late_initcall_sync and hoping all the devices have probed by then. This is not a valid assumption for systems with loadable modules. Other frameworks (e.g. clock) just don't handle this due to the lack of a clear signal for when they can turn off resources. This leads to downstream hacks to handle cases like this that can easily be solved in the upstream kernel.

By linking devices before they are probed, we give suppliers a clear count of the number of dependent consumers. Once all of the consumers are active, the suppliers can turn off the unused resources without making assumptions about the number of consumers.

By default we just add device-links to track "driver presence" (probe succeeded) of the supplier device. If any other functionality provided by device-links is needed, it is left to the consumer/supplier devices to change the link when they probe.

kbuild test robot reported a clang error about a missing const.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20190731221721.187713-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 July 2019, 1 commit
-
-
Submitted by Christoph Hellwig

Renaming docs seems to be en vogue at the moment, so fix one of the grossly misnamed directories. We usually never use "virtual" as a shortcut for virtualization in the kernel, but always virt, as seen in the virt/ top-level directory. Fix up the documentation to match that.

Fixes: ed16648e ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 17 July 2019, 1 commit
-
-
Submitted by Zhenzhong Duan

Clean up unnecessary code after that operation.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
-