提交 · 535b2f73f6f60fb227b700136c134c5d7c8f8ad3 · OpenHarmony / kernel_linux

13 11月, 2016 1 次提交

x86/efi: Retrieve and assign Apple device properties · 58c5475a

由 Lukas Wunner 提交于 11月 12, 2016

Apple's EFI drivers supply device properties which are needed to support
Macs optimally. They contain vital information which cannot be obtained
any other way (e.g. Thunderbolt Device ROM). They're also used to convey
the current device state so that OS drivers can pick up where EFI
drivers left (e.g. GPU mode setting).

There's an EFI driver dubbed "AAPL,PathProperties" which implements a
per-device key/value store. Other EFI drivers populate it using a custom
protocol. The macOS bootloader /System/Library/CoreServices/boot.efi
retrieves the properties with the same protocol. The kernel extension
AppleACPIPlatform.kext subsequently merges them into the I/O Kit
registry (see ioreg(8)) where they can be queried by other kernel
extensions and user space.

This commit extends the efistub to retrieve the device properties before
ExitBootServices is called. It assigns them to devices in an fs_initcall
so that they can be queried with the API in <linux/property.h>.

Note that the device properties will only be available if the kernel is
booted with the efistub. Distros should adjust their installers to
always use the efistub on Macs. grub with the "linux" directive will not
work unless the functionality of this commit is duplicated in grub.
(The "linuxefi" directive should work but is not included upstream as of
this writing.)

The custom protocol has GUID 91BD12FE-F6C3-44FB-A5B7-5122AB303AE0 and
looks like this:

typedef struct {
	unsigned long version; /* 0x10000 */
	efi_status_t (*get) (
		IN	struct apple_properties_protocol *this,
		IN	struct efi_dev_path *device,
		IN	efi_char16_t *property_name,
		OUT	void *buffer,
		IN OUT	u32 *buffer_len);
		/* EFI_SUCCESS, EFI_NOT_FOUND, EFI_BUFFER_TOO_SMALL */
	efi_status_t (*set) (
		IN	struct apple_properties_protocol *this,
		IN	struct efi_dev_path *device,
		IN	efi_char16_t *property_name,
		IN	void *property_value,
		IN	u32 property_value_len);
		/* allocates copies of property name and value */
		/* EFI_SUCCESS, EFI_OUT_OF_RESOURCES */
	efi_status_t (*del) (
		IN	struct apple_properties_protocol *this,
		IN	struct efi_dev_path *device,
		IN	efi_char16_t *property_name);
		/* EFI_SUCCESS, EFI_NOT_FOUND */
	efi_status_t (*get_all) (
		IN	struct apple_properties_protocol *this,
		OUT	void *buffer,
		IN OUT	u32 *buffer_len);
		/* EFI_SUCCESS, EFI_BUFFER_TOO_SMALL */
} apple_properties_protocol;

Thanks to Pedro Vilaça for this blog post which was helpful in reverse
engineering Apple's EFI drivers and bootloader:
https://reverse.put.as/2016/06/25/apple-efi-firmware-passwords-and-the-scbo-myth/

If someone at Apple is reading this, please note there's a memory leak
in your implementation of the del() function as the property struct is
freed but the name and value allocations are not.

Neither the macOS bootloader nor Apple's EFI drivers check the protocol
version, but we do to avoid breakage if it's ever changed. It's been the
same since at least OS X 10.6 (2009).

The get_all() function conveniently fills a buffer with all properties
in marshalled form which can be passed to the kernel as a setup_data
payload. The number of device properties is dynamic and can change
between a first invocation of get_all() (to determine the buffer size)
and a second invocation (to retrieve the actual buffer), hence the
peculiar loop which does not finish until the buffer size settles.
The macOS bootloader does the same.

The setup_data payload is later on unmarshalled in an fs_initcall. The
idea is that most buses instantiate devices in "subsys" initcall level
and drivers are usually bound to these devices in "device" initcall
level, so we assign the properties in-between, i.e. in "fs" initcall
level.

This assumes that devices to which properties pertain are instantiated
from a "subsys" initcall or earlier. That should always be the case
since on macOS, AppleACPIPlatformExpert::matchEFIDevicePath() only
supports ACPI and PCI nodes and we've fully scanned those buses during
"subsys" initcall level.

The second assumption is that properties are only needed from a "device"
initcall or later. Seems reasonable to me, but should this ever not work
out, an alternative approach would be to store the property sets e.g. in
a btree early during boot. Then whenever device_add() is called, an EFI
Device Path would have to be constructed for the newly added device,
and looked up in the btree. That way, the property set could be assigned
to the device immediately on instantiation. And this would also work for
devices instantiated in a deferred fashion. It seems like this approach
would be more complicated and require more code. That doesn't seem
justified without a specific use case.

For comparison, the strategy on macOS is to assign properties to objects
in the ACPI namespace (AppleACPIPlatformExpert::mergeEFIProperties()).
That approach is definitely wrong as it fails for devices not present in
the namespace: The NHI EFI driver supplies properties for attached
Thunderbolt devices, yet on Macs with Thunderbolt 1 only one device
level behind the host controller is described in the namespace.
Consequently macOS cannot assign properties for chained devices. With
Thunderbolt 2 they started to describe three device levels behind host
controllers in the namespace but this grossly inflates the SSDT and
still fails if the user daisy-chained more than three devices.

We copy the property names and values from the setup_data payload to
swappable virtual memory and afterwards make the payload available to
the page allocator. This is just for the sake of good housekeeping, it
wouldn't occupy a meaningful amount of physical memory (4444 bytes on my
machine). Only the payload is freed, not the setup_data header since
otherwise we'd break the list linkage and we cannot safely update the
predecessor's ->next link because there's no locking for the list.

The payload is currently not passed on to kexec'ed kernels, same for PCI
ROMs retrieved by setup_efi_pci(). This can be added later if there is
demand by amending setup_efi_state(). The payload can then no longer be
made available to the page allocator of course.

Tested-by: Lukas Wunner <lukas@wunner.de> [MacBookPro9,1]
Tested-by: Pierre Moreau <pierre.morrow@free.fr> [MacBookPro11,3]
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pedro Vilaça <reverser@put.as>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: grub-devel@gnu.org
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20161112213237.8804-9-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>

58c5475a

26 10月, 2016 1 次提交

x86/dumpstack: Remove raw stack dump · 0ee1dd9f

由 Josh Poimboeuf 提交于 10月 25, 2016

For mostly historical reasons, the x86 oops dump shows the raw stack
values:

  ...
  [registers]
  Stack:
   ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0
   ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321
   0000000000000002 0000000000000000 0000000000000000 0000000000000000
  Call Trace:
  ...

This seems to be an artifact from long ago, and probably isn't needed
anymore.  It generally just adds noise to the dump, and it can be
actively harmful because it leaks kernel addresses.

Linus says:

  "The stack dump actually goes back to forever, and it used to be
   useful back in 1992 or so. But it used to be useful mainly because
   stacks were simpler and we didn't have very good call traces anyway. I
   definitely remember having used them - I just do not remember having
   used them in the last ten+ years.

   Of course, it's still true that if you can trigger an oops, you've
   likely already lost the security game, but since the stack dump is so
   useless, let's aim to just remove it and make games like the above
   harder."

This also removes the related 'kstack=' cmdline option and the
'kstack_depth_to_print' sysctl.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e83bd50df52d8fe88e94d2566426ae40d813bf8f.1477405374.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

0ee1dd9f

25 10月, 2016 1 次提交

x86/cpu: Get rid of the show_msr= boot option · 59c6f278

由 Borislav Petkov 提交于 10月 24, 2016

It is useless as it dumps the MSRs too early BUT(!) we do set MSRs later too.
Also, it dumps only BSP MSRs as it gets called only for CPU 0.

And the MSR range array would need constant updating anyway, and so on
and so on...

Oh, and we have msr.ko and msr-tools which are the much better solution
anyway. So off it goes...
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161024173844.23038-4-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

59c6f278

12 10月, 2016 2 次提交

Input: i8042 - skip selftest on ASUS laptops · 930e1924

由 Marcos Paulo de Souza 提交于 10月 01, 2016

On suspend/resume cycle, selftest is executed to reset i8042 controller.
But when this is done in Asus devices, subsequent calls to detect/init
functions to elantech driver fails. Skipping selftest fixes this problem.

An easier step to reproduce this problem is adding i8042.reset=1 as a
kernel parameter. On Asus laptops, it'll make the system to start with the
touchpad already stuck, since psmouse_probe forcibly calls the selftest
function.

This patch was inspired by John Hiesey's change[1], but, since this problem
affects a lot of models of Asus, let's avoid running selftests on them.

All models affected by this problem:
A455LD
K401LB
K501LB
K501LX
R409L
V502LX
X302LA
X450LCP
X450LD
X455LAB
X455LDB
X455LF
Z450LA

[1]: https://marc.info/?l=linux-input&m=144312209020616&w=2

Fixes: "ETPS/2 Elantech Touchpad dies after resume from suspend"
(https://bugzilla.kernel.org/show_bug.cgi?id=107971)
Signed-off-by: NMarcos Paulo de Souza <marcos.souza.org@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>

930e1924

lib/bitmap.c: enhance bitmap syntax · 2d13e6ca

由 Noam Camus 提交于 10月 11, 2016

Today there are platforms with many CPUs (up to 4K).  Trying to boot only
part of the CPUs may result in too long string.

For example lets take NPS platform that is part of arch/arc.  This
platform have SMP system with 256 cores each with 16 HW threads (SMT
machine) where HW thread appears as CPU to the kernel.  In this example
there is total of 4K CPUs.  When one tries to boot only part of the HW
threads from each core the string representing the map may be long...  For
example if for sake of performance we decided to boot only first half of
HW threads of each core the map will look like:
0-7,16-23,32-39,...,4080-4087

This patch introduce new syntax to accommodate with such use case.  I
added an optional postfix to a range of CPUs which will choose according
to given modulo the desired range of reminders i.e.:

    <cpus range>:sed_size/group_size

For example, above map can be described in new syntax like this:
0-4095:8/16

Note that this patch is backward compatible with current syntax.

[akpm@linux-foundation.org: rework documentation]
Link: http://lkml.kernel.org/r/1473579629-4283-1-git-send-email-noamca@mellanox.comSigned-off-by: NNoam Camus <noamca@mellanox.com>
Cc: David Decotigny <decot@googlers.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David S. Miller <davem@davemloft.net>
Cc: Pan Xinhui <xinhui@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2d13e6ca

27 9月, 2016 2 次提交

serial: xuartps: Add some register initialisation to cdns_early_console_setup() · c41251b1

由 Scott Telford 提交于 9月 22, 2016

Add initialisation of control register and baud rate to
cdns_early_console_setup(), required when running kernel standalone
without a boot loader. Baud rate is only initialised when specified in
earlycon command-line option, otherwise it is assumed this has been
set by a boot loader. Updated Documentation/kernel-parameters.txt
accordingly.
Signed-off-by: NScott Telford <stelford@cadence.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c41251b1

gpio/mockup: add virtual gpio device · 0f98dd1b

由 Bamvor Jian Zhang 提交于 8月 31, 2016

This patch add basic structure of a virtual gpio device(gpio-mockup)
for testing gpio subsystem. The tester could manipulate such device
through userspace(sysfs or char device) and check the result from
debugfs.

Currently, it support one or more gpiochip(determined by module
parameters with base,ngpio pair). One could test the overlap of
different gpiochip and test the direction and/or output values of
these chips.
Signed-off-by: NKamlakant Patel <kamlakant.patel@broadcom.com>
Signed-off-by: NBamvor Jian Zhang <bamvor.zhangjian@linaro.org>
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>

0f98dd1b

24 9月, 2016 1 次提交

arm64: arch_timer: Work around QorIQ Erratum A-008585 · f6dc1576

由 Scott Wood 提交于 9月 22, 2016

Erratum A-008585 says that the ARM generic timer counter "has the
potential to contain an erroneous value for a small number of core
clock cycles every time the timer value changes".  Accesses to TVAL
(both read and write) are also affected due to the implicit counter
read.  Accesses to CVAL are not affected.

The workaround is to reread TVAL and count registers until successive
reads return the same value.  Writes to TVAL are replaced with an
equivalent write to CVAL.

The workaround is to reread TVAL and count registers until successive reads
return the same value, and when writing TVAL to retry until counter
reads before and after the write return the same value.

The workaround is enabled if the fsl,erratum-a008585 property is found in
the timer node in the device tree.  This can be overridden with the
clocksource.arm_arch_timer.fsl-a008585 boot parameter, which allows KVM
users to enable the workaround until a mechanism is implemented to
automatically communicate this information.

This erratum can be found on LS1043A and LS2080A.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NScott Wood <oss@buserror.net>
[will: renamed read macro to reflect that it's not usually unstable]
Signed-off-by: NWill Deacon <will.deacon@arm.com>

f6dc1576

20 9月, 2016 1 次提交

NFSv4.x: Add kernel parameter to control the callback server · 5405fc44

由 Trond Myklebust 提交于 8月 29, 2016

Add support for the kernel parameter nfs.callback_nr_threads to set
the number of threads that will be assigned to the callback channel.

Add support for the kernel parameter nfs.nfs.max_session_cb_slots
to set the maximum size of the callback channel slot table.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

5405fc44

13 9月, 2016 1 次提交

scsi: introduce a quirk for false cache reporting · 050bc4e8

由 Oliver Neukum 提交于 9月 12, 2016

Some SATA to USB bridges fail to cooperate with some
drives resulting in no cache being present being reported
to the host. That causes the host to skip sending
a command to synchronize caches. That causes data loss
when the drive is powered down.
Signed-off-by: NOliver Neukum <oneukum@suse.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: NAlan Stern <stern@rowland.harvard.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

050bc4e8

09 9月, 2016 1 次提交

x86/pkeys: Default to a restrictive init PKRU · acd547b2

由 Dave Hansen 提交于 7月 29, 2016

PKRU is the register that lets you disallow writes or all access to a given
protection key.

The XSAVE hardware defines an "init state" of 0 for PKRU: its most
permissive state, allowing access/writes to everything.  Since we start off
all new processes with the init state, we start all processes off with the
most permissive possible PKRU.

This is unfortunate.  If a thread is clone()'d [1] before a program has
time to set PKRU to a restrictive value, that thread will be able to write
to all data, no matter what pkey is set on it.  This weakens any integrity
guarantees that we want pkeys to provide.

To fix this, we define a very restrictive PKRU to override the
XSAVE-provided value when we create a new FPU context.  We choose a value
that only allows access to pkey 0, which is as restrictive as we can
practically make it.

This does not cause any practical problems with applications using
protection keys because we require them to specify initial permissions for
each key when it is allocated, which override the restrictive default.

In the end, this ensures that threads which do not know how to manage their
own pkey rights can not do damage to data which is pkey-protected.

I would have thought this was a pretty contrived scenario, except that I
heard a bug report from an MPX user who was creating threads in some very
early code before main().  It may be crazy, but folks evidently _do_ it.
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: mgorman@techsingularity.net
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163021.F3C25D4A@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

acd547b2

06 9月, 2016 1 次提交

documentation/scsi: Remove nodisconnect parameter · 2836a2f5

由 Finn Thain 提交于 8月 27, 2016

The driver that used the 'nodisconnect' parameter was removed in
commit 565bae6a ("[SCSI] 53c7xx: kill driver"). Related documentation
was cleaned up in commit f37a7238 ("[SCSI] 53c7xx: fix removal
fallout"), except for the remaining two mentions that are removed here.
Signed-off-by: NFinn Thain <fthain@telegraphics.com.au>
Reviewed-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

2836a2f5

05 9月, 2016 1 次提交

iommu/amd: Detect and enable guest vAPIC support · 3928aa3f

由 Suravee Suthikulpanit 提交于 8月 23, 2016

This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir,
which can be used to specify different interrupt remapping mode for
passthrough devices to VM guest:
    * legacy: Legacy interrupt remapping (w/ 32-bit IRTE)
    * vapic : Guest vAPIC interrupt remapping (w/ GA mode 128-bit IRTE)

Note that in vapic mode, it can also supports legacy interrupt remapping
for non-passthrough devices with the 128-bit IRTE.
Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

3928aa3f

31 8月, 2016 1 次提交

scsi: Documentation/scsi: Remove nodisconnect parameter · 35d0acba

由 Finn Thain 提交于 8月 27, 2016

The driver that used the 'nodisconnect' parameter was removed in commit
565bae6a ("[SCSI] 53c7xx: kill driver"). Related documentation was
cleaned up in commit f37a7238 ("[SCSI] 53c7xx: fix removal
fallout"), except for the remaining two mentions that are removed here.
Signed-off-by: NFinn Thain <fthain@telegraphics.com.au>
Reviewed-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

35d0acba

26 8月, 2016 1 次提交

docs: kernel-parameter: Improve the description of nr_cpus and maxcpus · 7c142bfe

由 Baoquan He 提交于 8月 24, 2016

From the old description people still can't get what's the exact
difference between nr_cpus and maxcpus. Especially in kdump kernel
nr_cpus is always suggested if it's implemented in the ARCH. The
reason is nr_cpus is used to limit the max number of possible cpu
in system, the sum of already plugged cpus and hot plug cpus can't
exceed its value. However maxcpus is used to limit how many cpus
are allowed to be brought up during bootup.
Signed-off-by: NBaoquan He <bhe@redhat.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

7c142bfe

19 8月, 2016 1 次提交

Update the maximum depth of C-state from 6 to 9 · 22c6bbe4

由 baolex.ni 提交于 7月 11, 2016

Hi Jon,

This patch is an old one, we have corrected some minor issues on the newer one.
Please only review the newest version from my last mail with this subject
"[PATCH] ACPI: Update the maximum depth of C-state from 6 to 9".
And I also attached it to this mail.

Thanks,
Baole

On 7/11/2016 6:37 AM, Jonathan Corbet wrote:
> On Mon, 4 Jul 2016 09:55:10 +0800
> "baolex.ni" <baolex.ni@intel.com> wrote:
>
>> Currently, CPUIDLE_STATE_MAX has been defined as 10 in the cpuidle head file,
>> and max_cstate = CPUIDLE_STATE_MAX – 1, so 9 is the right maximum depth of C-state.
>> This change is reflected in one place of the kernel-param file,
>> but not in the other place where I suggest changing.
>>
>> Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
>> Signed-off-by: Baole Ni <baolex.ni@intel.com>
>
> So why are there two signoffs on a single-line patch?  Which one of you
> is the actual author?
>
> Thanks,
>
> jon
>

From cf5f8aa6885874f6490b11507d3c0c86fa0a11f4 Mon Sep 17 00:00:00 2001
From: Chuansheng Liu <chuansheng.liu@intel.com>
Date: Mon, 4 Jul 2016 08:52:51 +0800
Subject: [PATCH] Update the maximum depth of C-state from 6 to 9
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently, CPUIDLE_STATE_MAX has been defined as 10 in the cpuidle head file,
and max_cstate = CPUIDLE_STATE_MAX – 1, so 9 is the right maximum depth of C-state.
This change is reflected in one place of the kernel-param file,
but not in the other place where I suggest changing.
Signed-off-by: NChuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: NBaole Ni <baolex.ni@intel.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

22c6bbe4

10 8月, 2016 1 次提交

PCI: Update "pci=resource_alignment" documentation · 8b078c60

由 Mathias Koehrer 提交于 8月 09, 2016

Some uio based PCI drivers, e.g., uio_cif, do not work if the assigned PCI
memory resources are not page aligned.  By using the kernel option
"pci=resource_alignment=<align>@<bus>:<slot>.<func>" it is possible to
request page alignment for memory resources of devices.

However, this is cumbersome when using several devices, and the
bus/slot/func addresses may change if devices are added to or removed from
the system.

Extend the "pci=resource_alignment" option so we can specify the relevant
devices via PCI vendor, device, subvendor, and subdevice IDs.  The
specification of the devices via IDs is indicated by a leading string
"pci:" as argument to "pci=resource_alignment".

The format of the specification is
  pci:<vendor>:<device>[:<subvendor>:<subdevice>]

Examples:
  pci=resource_alignment=4096@pci:8086:9c22:103c:198f
  pci=resource_alignment=pci:8086:9c22       # defaults to PAGE_SIZE align

[bhelgaas: changelog, use actual vendor/device IDs in examples]
Signed-off-by: NMathias Koehrer <mathias.koehrer@etas.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

8b078c60

04 8月, 2016 2 次提交

modules: Add kernel parameter to blacklist modules · be7de5f9

由 Prarit Bhargava 提交于 7月 21, 2016

Blacklisting a module in linux has long been a problem.  The current
procedure is to use rd.blacklist=module_name, however, that doesn't
cover the case after the initramfs and before a boot prompt (where one
is supposed to use /etc/modprobe.d/blacklist.conf to blacklist
runtime loading). Using rd.shell to get an early prompt is hit-or-miss,
and doesn't cover all situations AFAICT.

This patch adds this functionality of permanently blacklisting a module
by its name via the kernel parameter module_blacklist=module_name.

[v2]: Rusty, use core_param() instead of __setup() which simplifies
things.

[v3]: Rusty, undo wreckage from strsep()

[v4]: Rusty, simpler version of blacklisted()
Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-doc@vger.kernel.org
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

be7de5f9

Documenation: update cgroup's document path · 09c3bcce

由 seokhoon.yoon 提交于 8月 02, 2016

cgroup's document path is changed to "cgroup-v1". update it.
Signed-off-by: Nseokhoon.yoon <iamyooon@gmail.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

09c3bcce

03 8月, 2016 1 次提交

printk: add kernel parameter to control writes to /dev/kmsg · 750afe7b

由 Borislav Petkov 提交于 8月 02, 2016

Add a "printk.devkmsg" kernel command line parameter which controls how
userspace writes into /dev/kmsg.  It has three options:

 * ratelimit - ratelimit logging from userspace.
 * on  - unlimited logging from userspace
 * off - logging from userspace gets ignored

The default setting is to ratelimit the messages written to it.

This changes the kernel default setting of "on" to "ratelimit" and we do
that because we want to keep userspace spamming /dev/kmsg to sane
levels.  This is especially moot when a small kernel log buffer wraps
around and messages get lost.  So the ratelimiting setting should be a
sane setting where kernel messages should have a bit higher chance of
survival from all the spamming.

It additionally does not limit logging to /dev/kmsg while the system is
booting if we haven't disabled it on the command line.

Furthermore, we can control the logging from a lower priority sysctl
interface - kernel.printk_devkmsg.

That interface will succeed only if printk.devkmsg *hasn't* been
supplied on the command line.  If it has, then printk.devkmsg is a
one-time setting which remains for the duration of the system lifetime.
This "locking" of the setting is to prevent userspace from changing the
logging on us through sysctl(2).

This patch is based on previous patches from Linus and Steven.

[bp@suse.de: fixes]
  Link: http://lkml.kernel.org/r/20160719072344.GC25563@nazgul.tnic
Link: http://lkml.kernel.org/r/20160716061745.15795-3-bp@alien8.deSigned-off-by: NBorislav Petkov <bp@suse.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

750afe7b

26 7月, 2016 1 次提交

PCI: Allow additional bus numbers for hotplug bridges · e16b4660

由 Keith Busch 提交于 7月 21, 2016

A user may hot add a switch requiring more than one bus to enumerate.  This
previously required a system reboot if BIOS did not sufficiently pad the
bus resource, which they frequently don't do.

Add a kernel parameter so a user can specify the minimum number of bus
numbers to reserve for a hotplug bridge's subordinate buses so rebooting
won't be necessary.

The default is 1, which is equivalent to previous behavior.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

e16b4660

17 7月, 2016 1 次提交

powerpc/mm/radix: Add a kernel command line to disable radix · b275bfb2

由 Aneesh Kumar K.V 提交于 7月 13, 2016

This patch adds the kernel command line disable_radix which disable
the radix MMU mode even if firmware indicates radix support via
ibm,pa-features device tree node.

This helps in testing different MMU mode easily.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b275bfb2

14 7月, 2016 1 次提交

SUNRPC: Add a server side per-connection limit · ff3ac5c3

由 Trond Myklebust 提交于 6月 24, 2016

Allow the user to limit the number of requests serviced through a single
connection, to help prevent faster clients from starving slower clients.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

ff3ac5c3

10 7月, 2016 1 次提交

PM / hibernate: Image data protection during restoration · 4c0b6c10

由 Rafael J. Wysocki 提交于 7月 10, 2016

Make it possible to protect all pages holding image data during
hibernate image restoration by setting them read-only (so as to
catch attempts to write to those pages after image data have been
stored in them).

This adds overhead to image restoration code (it may cause large
page mappings to be split as a result of page flags changes) and
the errors it protects against should never happen in theory, so
the feature is only active after passing hibernate=protect_image
to the command line of the restore kernel.

Also it only is built if CONFIG_DEBUG_RODATA is set.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

4c0b6c10

09 7月, 2016 1 次提交

efi / ACPI: load SSTDs from EFI variables · 475fb4e8

由 Octavian Purdila 提交于 7月 08, 2016

This patch allows SSDTs to be loaded from EFI variables. It works by
specifying the EFI variable name containing the SSDT to be loaded. All
variables with the same name (regardless of the vendor GUID) will be
loaded.

Note that we can't use acpi_install_table and we must rely on the
dynamic ACPI table loading and bus re-scanning mechanisms. That is
because I2C/SPI controllers are initialized earlier then the EFI
subsystems and all I2C/SPI ACPI devices are enumerated when the
I2C/SPI controllers are initialized.
Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com>
Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

475fb4e8

05 7月, 2016 1 次提交

powerpc/mm: Add a parameter to disable 1TB segs · faf78829

由 Oliver O'Halloran 提交于 7月 05, 2016

This patch adds the kernel command line parameter "no_tb_segs" which
forces the kernel to use 256MB rather than 1TB segments. Forcing the use
of 256MB segments makes it considerably easier to test code that depends
on an SLB miss occurring.
Suggested-by: NMichael Neuling <mikey@neuling.org>
Suggested-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

faf78829

30 6月, 2016 1 次提交

ACPI / APEI: Add Boot Error Record Table (BERT) support · a3e2acc5

由 Huang Ying 提交于 6月 29, 2016

ACPI/APEI is designed to verifiy/report H/W errors, like Corrected
Error(CE) and Uncorrected Error(UC). It contains four tables: HEST,
ERST, EINJ and BERT. The first three tables have been merged for
a long time, but because of lacking BIOS support for BERT, the
support for BERT is pending until now. Recently on ARM 64 platform
it is has been supported. So here we come.

Under normal circumstances, when a hardware error occurs, kernel will
be notified via NMI, MCE or some other method, then kernel will
process the error condition, report it, and recover it if possible.
But sometime, the situation is so bad, so that firmware may choose to
reset directly without notifying Linux kernel.

Linux kernel can use the Boot Error Record Table (BERT) to get the
un-notified hardware errors that occurred in a previous boot. In this
patch, the error information is reported via printk.

For more information about BERT, please refer to ACPI Specification
version 6.0, section 18.3.1:
  http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf

The following log is a BERT record after system reboot because of hitting
a fatal memory error:
BERT: Error records from previous boot:
[Hardware Error]: It has been corrected by h/w and requires no further action
[Hardware Error]: event severity: corrected
[Hardware Error]:  Error 0, type: recoverable
[Hardware Error]:   section_type: memory error
[Hardware Error]:   error_status: 0x0000000000000400
[Hardware Error]:   physical_address: 0xffffffffffffffff
[Hardware Error]:   card: 1 module: 2 bank: 3 row: 1 column: 2 bit_position: 5
[Hardware Error]:   error_type: 2, single-bit ECC

[Tomasz Nowicki: Clear error status at the end of error handling]
[Tony: Applied some cleanups suggested by Fu Wei]
[Fu Wei: delete EXPORT_SYMBOL_GPL(bert_disable), improve the code]
Signed-off-by: NHuang Ying <ying.huang@intel.com>
Signed-off-by: NTomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: NChen, Gong <gong.chen@linux.intel.com>
Tested-by: NJonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: NFu Wei <fu.wei@linaro.org>
Tested-by: NTyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NTony Luck <tony.luck@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

a3e2acc5

28 6月, 2016 2 次提交

clocksource/drivers/arm_arch_timer: Control the evtstrm via the cmdline · 46fd5c6b

由 Will Deacon 提交于 6月 27, 2016

Disabling the eventstream can be useful for both remotely debugging a
deployed production system and development of code using WFE-based
polling loops. Whilst this can currently be controlled via a Kconfig
option (CONFIG_ARM_ARCH_TIMER_EVTSTREAM), it's often desirable to toggle
the feature on the command line, so this patch adds a new command-line
option ("clocksource.arm_arch_timer.evtstrm") to do just that. The
default behaviour is determined based on CONFIG_ARM_ARCH_TIMER_EVTSTREAM.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>

46fd5c6b

s390/oprofile: remove hardware sampler support · 93dd49d0

由 Heiko Carstens 提交于 6月 10, 2016

Remove hardware sampler support from oprofile module.

The oprofile user space utilty has been switched to use the kernel
perf interface, for which we also provide hardware sampling support.

In addition the hardware sampling support is also slightly broken: it
supports only 16 bits for the pid and therefore would generate wrong
results on machines which have a pid >64k.

Also the pt_regs structure which was passed to oprofile common code
cannot necessarily be used to generate sane backtraces, since the
task(s) in question may run while the samples are fed to oprofile.
So the result would be more or less random.

However given that the only user space tools switched to the perf
interface already four years ago the hardware sampler code seems to be
unused code, and therefore it should be reasonable to remove it.

The timer based oprofile support continues to work.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: NAndreas Arnez <arnez@linux.vnet.ibm.com>
Acked-by: NAndreas Krebbel <krebbel@linux.vnet.ibm.com>
Acked-by: NRobert Richter <rric@kernel.org>
Reviewed-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

93dd49d0

26 6月, 2016 1 次提交

x86/KASLR, x86/power: Remove x86 hibernation restrictions · 65fe935d

由 Kees Cook 提交于 6月 13, 2016

With the following fix:

  70595b479ce1 ("x86/power/64: Fix crash whan the hibernation code passes control to the image kernel")

... there is no longer a problem with hibernation resuming a
KASLR-booted kernel image, so remove the restriction.
Signed-off-by: NKees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux PM list <linux-pm@vger.kernel.org>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/20160613221002.GA29719@www.outflux.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

65fe935d

22 6月, 2016 1 次提交

PCI: Extending pci=resource_alignment to specify device/vendor IDs · 644a544f

由 Koehrer Mathias (ETAS/ESW5) 提交于 6月 07, 2016

Some uio-based PCI drivers, e.g., uio_cif do not work if the assigned PCI
memory resources are not page aligned.

By using the kernel option "pci=resource_alignment" it is possible to force
single PCI boards to use page alignment for their memory resources.
However, this is fairly cumbersome if several of these boards are in use
as the specification of the cards has to be done via PCI bus/slot/function
number which might change, e.g., by adding another board.

Extend the kernel option "pci=resource_alignment" to allow specification of
relevant devices via PCI device/vendor (and subdevice/subvendor) IDs. The
specification of the devices via device/vendor is indicated by a leading
string "pci:" as argument to "pci=resource_alignment". The format of the
specification is pci:<vendor>:<device>[:<subvendor>:<subdevice>]
Signed-off-by: NMathias Koehrer <mathias.koehrer@etas.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

644a544f

14 6月, 2016 1 次提交

PCI: Put PCIe ports into D3 during suspend · 9d26d3a8

由 Mika Westerberg 提交于 6月 02, 2016

Currently the Linux PCI core does not touch power state of PCI bridges and
PCIe ports when system suspend is entered.  Leaving them in D0 consumes
power unnecessarily and may prevent the CPU from entering deeper C-states.

With recent PCIe hardware we can power down the ports to save power given
that we take into account few restrictions:

  - The PCIe port hardware is recent enough, starting from 2015.

  - Devices connected to PCIe ports are effectively in D3cold once the port
    is transitioned to D3 (the config space is not accessible anymore and
    the link may be powered down).

  - Devices behind the PCIe port need to be allowed to transition to D3cold
    and back.  There is a way both drivers and userspace can forbid this.

  - If the device behind the PCIe port is capable of waking the system it
    needs to be able to do so from D3cold.

This patch adds a new flag to struct pci_device called 'bridge_d3'.  This
flag is set and cleared by the PCI core whenever there is a change in power
management state of any of the devices behind the PCIe port.  When system
later on is suspended we only need to check this flag and if it is true
transition the port to D3 otherwise we leave it in D0.

Also provide override mechanism via command line parameter
"pcie_port_pm=[off|force]" that can be used to disable or enable the
feature regardless of the BIOS manufacturing date.
Tested-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9d26d3a8

04 6月, 2016 1 次提交

doc: clarify that trace_events= takes a comma-separated list · d81749ea

由 Brian Norris 提交于 5月 23, 2016

It took me browsing through the source code to determine that I was,
indeed, using the wrong delimiter in my command lines. So I might as
well document it for the next person.
Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

d81749ea

20 5月, 2016 1 次提交

memory_hotplug: introduce memhp_default_state= command line parameter · 86dd995d

由 Vitaly Kuznetsov 提交于 5月 19, 2016

CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE specifies the default value for the
memory hotplug onlining policy.  Add a command line parameter to make it
possible to override the default.  It may come handy for debug and
testing purposes.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Lennart Poettering <lennart@poettering.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

86dd995d

05 5月, 2016 2 次提交

crypto: testmgr - Add a flag allowing the self-tests to be disabled at runtime. · 9e5c9fe4

由 Richard W.M. Jones 提交于 5月 03, 2016

Running self-tests for a short-lived KVM VM takes 28ms on my laptop.
This commit adds a flag 'cryptomgr.notests' which allows them to be
disabled.

However if fips=1 as well, we ignore this flag as FIPS mode mandates
that the self-tests are run.
Signed-off-by: NRichard W.M. Jones <rjones@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

9e5c9fe4

ACPI / osi: Add acpi_osi=!! to allow reverting acpi_osi=! · a707edeb

由 Lv Zheng 提交于 5月 03, 2016

This patch introduces acpi_osi=!! so that quirks may use it to revert
acpi_osi=!.
Tested-by: NLukas Wunner <lukas@wunner.de>
Tested-by: NChen Yu <yu.c.chen@intel.com>
Signed-off-by: NLv Zheng <lv.zheng@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

a707edeb

01 5月, 2016 1 次提交

tty: serial: meson: Implement earlycon support · 736d5538

由 Andreas Färber 提交于 3月 06, 2016

Split off the bulk of the existing meson_serial_console_write()
implementation into meson_serial_port_write() for implementing
meson_serial_early_console_write().

Use "meson" as the earlycon driver name, courtesy of Nicolas.
Signed-off-by: NNicolas Saenz Julienne <nicolassaenzj@gmail.com>
Acked-by: NCarlo Caione <carlo@endlessm.com>
Signed-off-by: NAndreas Färber <afaerber@suse.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

736d5538

28 4月, 2016 2 次提交

cpufreq: intel_pstate: Enable PPC enforcement for servers · 2b3ec765

由 Srinivas Pandruvada 提交于 4月 27, 2016

For platforms which are controlled via remove node manager, enable _PPC by
default. These platforms are mostly categorized as enterprise server or
performance servers. These platforms needs to go through some
certifications tests, which tests control via _PPC.
The relative risk of enabling by default is low as this is is less likely
that these systems have broken _PSS table.
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

2b3ec765

cpufreq: intel_pstate: Enforce _PPC limits · 9522a2ff

由 Srinivas Pandruvada 提交于 4月 27, 2016

Use ACPI _PPC notification to limit max P state driver will request.
ACPI _PPC change notification is sent by BIOS to limit max P state
in several cases:
- Reduce impact of platform thermal condition
- When Config TDP feature is used, a changed _PPC is sent to
follow TDP change
- Remote node managers in server want to control platform power
via baseboard management controller (BMC)

This change registers with ACPI processor performance lib so that
_PPC changes are notified to cpufreq core, which in turns will
result in call to .setpolicy() callback. Also the way _PSS
table identifies a turbo frequency is not compatible to max turbo
frequency in intel_pstate, so the very first entry in _PSS needs
to be adjusted.

This feature can be turned on by using kernel parameters:
intel_pstate=support_acpi_ppc
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Minor cleanups ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9522a2ff

26 4月, 2016 1 次提交

arm64: acpi: add acpi=on cmdline option to prefer ACPI boot over DT · 6a1f5471

由 Ard Biesheuvel 提交于 4月 12, 2016

If both ACPI and DT platform descriptions are available, and the
kernel was configured at build time to support both flavours, the
default policy is to prefer DT over ACPI, and preferring ACPI over
DT while still allowing DT as a fallback is not possible.

Since some enterprise features (such as RAS) depend on ACPI, it may
be desirable for, e.g., distro installers to prefer ACPI boot but
fall back to DT rather than failing completely if no ACPI tables are
available.

So introduce the 'acpi=on' kernel command line parameter for arm64,
which signifies that ACPI should be used if available, and DT should
only be used as a fallback.
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

6a1f5471

OpenHarmony / kernel_linux 上一次同步 大约 4 年

OpenHarmony / kernel_linux
上一次同步大约 4 年