提交 · b921f5dfc56b4aa84eb4309744664e687f31d8d9 · openanolis / cloud-kernel

30 10月, 2019 40 次提交

spi: dw-mmio: add ACPI support · b921f5df

由 Jay Fang 提交于 12月 03, 2018

commit 32215a6c6beb8dcda4bb0759b04ce3c30927963b upstream

The Hisilicon Hip08 platform, that uses ACPI, has this controller.
Let's add ACPI support for DW SPI MMIO-based host.

The ACPI ID used is "HISI0173" for the Designware SPI controller of
Hisilicon Hip08 platform.
Signed-off-by: NJay Fang <f.fangjian@huawei.com>
Signed-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

b921f5df

arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension() · 31e33b52

由 Dongjiu Geng 提交于 10月 13, 2018

commit 375bdd3b5d4f7cf146f0df1488b4671b141dd799 upstream

Rename kvm_arch_dev_ioctl_check_extension() to
kvm_arch_vm_ioctl_check_extension(), because it does
not have any relationship with device.

Renaming this function can make code readable.

Cc: James Morse <james.morse@arm.com>
Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NDongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

31e33b52

ACPI / CPPC: Add a helper to get desired performance · 5f81fa7f

由 Xiongfeng Wang 提交于 2月 17, 2019

commit 1757d05f3112acc5c0cdbcccad3afdee99655bf9 upstream

This patch add a helper to get the value of desired performance
register.
Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com>
[ rjw: More white space ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

5f81fa7f

cpufreq / cppc: Work around for Hisilicon CPPC cpufreq · b2e3814a

由 Xiongfeng Wang 提交于 2月 17, 2019

commit 6c8d750f9784cef32a8cffdad74c8a351b4ca3a6 upstream

Hisilicon chips do not support delivered performance counter register
and reference performance counter register. But the platform can
calculate the real performance using its own method. We reuse the
desired performance register to store the real performance calculated by
the platform. After the platform finished the frequency adjust, it gets
the real performance and writes it into desired performance register. Os
can use it to calculate the real frequency.
Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com>
[ rjw: Drop unnecessary braces ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

b2e3814a

irqchip/gic-v3-its: Allow use of LPI tables in reserved memory · 3510b2ba

由 Marc Zyngier 提交于 7月 27, 2018

commit 5e2c9f9a627772672accd80fa15359c0de6aa894 upstream

If the LPI tables have been reserved with the EFI reservation
mechanism, we assume that these tables are safe to use even
when we find the redistributors to have LPIs enabled at
boot time, meaning that kexec can now work with GICv3.

You're welcome.
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

3510b2ba

irqchip/gic-v3-its: Register LPI tables with EFI config table · d2f5afb8

由 Marc Zyngier 提交于 7月 27, 2018

commit 3fb68faee8676900f896d1615442aeca36e5f940 upstream

Upon enabling a redistributor, let's register the allocated tables
with the EFI table that tracks the memory reservations.
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

d2f5afb8

irqchip/gic-v3-its: Check that all RDs have the same property table · b605556b

由 Marc Zyngier 提交于 7月 27, 2018

commit f842ca8e9c8a80d07f5589536311250d7d6018f9 upstream

If booting with LPIs enabled, all the redistributors must have the
exact same property table. No ifs, no buts.
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

b605556b

irqchip/gic-v3-its: Use pre-programmed redistributor tables with kdump kernels · 6461d57b

由 Marc Zyngier 提交于 6月 26, 2018

commit c6e2ccb66d0c3b4fffc59932585e9f709ad59003 upstream

If using a kdump kernel, and that we cannot disable LPIs to install
our own tables, let's switch to using the already allocated tables.

This means that we'll change some of the initial kernel's memory,
but at least we'll be able to have LPIs in this secondary kernel.
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

6461d57b

irqchip/gic-v3-its: Allow use of pre-programmed LPI tables · 0473ec5f

由 Marc Zyngier 提交于 7月 27, 2018

commit c440a9d9d113b9b3cd99bb5096c4aa47d515e463 upstream

In order to cope with kexec and GICv3, let's try and spot when
we're booting with LPIs already enabled, and the tables already
programmed into the redistributors.

This code is currently guarded by a predicate that is always false,
meaning this is not functionnal just yet.
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

0473ec5f

irqchip/gic-v3-its: Keep track of property table's PA and VA · 317fd9f4

由 Marc Zyngier 提交于 7月 27, 2018

commit e1a2e2010ba9d3c765b2e37a7ae8b332564716f1 upstream

We're currently only tracking the page allocated to contain the
property table by its struct page. In the future, it is going to
be convenient to track both PA and VA for that page instead. Let's
do that.
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

317fd9f4

irqchip/gic-v3-its: Move pending table allocation to init time · 22164b71

由 Marc Zyngier 提交于 7月 27, 2018

commit 11e37d357f6ba7a9af850a872396082cc0a0001f upstream

Pending tables for the redistributors are currently allocated
one at a time as each CPU boots. This is causing some grief
for Linux/RT (allocation from within a CPU hotplug notifier is
frown upon).

Let's move this allocation to take place at init time, when we
only have a single CPU. It means we're allocating memory for CPUs
that are not online yet, but most system will boot all of their
CPUs anyway, so that's not completely wasted.
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

22164b71

irqchip/gic-v3-its: Split property table clearing from allocation · c22e582d

由 Marc Zyngier 提交于 7月 27, 2018

commit 053be4854f9bcceba99cdfa0c89acc4696852c3f upstream

As we're going to reuse some pre-allocated memory for the property
table, split out the zeroing of that table into a separate function
for later use.
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

c22e582d

irqchip/gic-v3-its: Change initialization ordering for LPIs · e3d3dc2e

由 Marc Zyngier 提交于 7月 27, 2018

commit d38a71c5452529fd3326b0ae488292e5fbd8d2a1 upstream

We currently initialize the LPIs (and the ITS) fairly early, even
before the SMP support and the CPU interface. This is a bit odd
(as LPIs are not exactly crutial for the early boot process),
and is going to cause issues when reorganizing the probing code.

Let's move this initialization later.
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Tested-by: NJeremy Linton <jeremy.linton@arm.com>
Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
Tested-by: NLei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

e3d3dc2e

iommu/arm-smmu-v3: Remove unnecessary wrapper function · b335c0c9

由 Andrew Murray 提交于 10月 10, 2018

commit 5e731073bc0a4a53a213412dbd33982d829560f1 upstream

Simplify the code by removing an unnecessary wrapper function.

This was left behind by commit 2f657add
("iommu/arm-smmu-v3: Specialise CMD_SYNC handling")
Signed-off-by: NAndrew Murray <andrew.murray@arm.com>
Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

b335c0c9

iommu/arm-smmu: Support non-strict mode · 3587d7e5

由 Robin Murphy 提交于 9月 20, 2018

commit 44f6876a00e83df5fd28681502b19b0f51e4a3c6 upstream

All we need is to wire up .flush_iotlb_all properly and implement the
domain attribute, and iommu-dma and io-pgtable will do the rest for us.
The only real subtlety is documenting the barrier semantics we're
introducing between io-pgtable and the drivers for non-strict flushes.
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

3587d7e5

iommu/io-pgtable-arm-v7s: Add support for non-strict mode · 5fb01cd3

由 Robin Murphy 提交于 9月 20, 2018

commit b2dfeba654cb08db327d0ed4547b66c2f8fce997 upstream

As for LPAE, it's simply a case of skipping the leaf invalidation for a
regular unmap, and ensuring that the one in split_blk_unmap() is paired
with an explicit sync ASAP rather than relying on one which might only
eventually happen way down the line.
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

5fb01cd3

iommu/arm-smmu-v3: Add support for non-strict mode · 8bb6210c

由 Zhen Lei 提交于 9月 20, 2018

commit 9662b99a19abccb0b7bfc91abb3fec1447c35bf0 upstream

Now that io-pgtable knows how to dodge strict TLB maintenance, all
that's left to do is bridge the gap between the IOMMU core requesting
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE for default domains, and showing the
appropriate IO_PGTABLE_QUIRK_NON_STRICT flag to alloc_io_pgtable_ops().
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
[rm: convert to domain attribute, tweak commit message]
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

8bb6210c

iommu/io-pgtable-arm: Add support for non-strict mode · 2576a534

由 Zhen Lei 提交于 9月 20, 2018

commit b6b65ca20bc93d14319f9b5cf98fd3c19a4244e3 upstream

Non-strict mode is simply a case of skipping 'regular' leaf TLBIs, since
the sync is already factored out into ops->iotlb_sync at the core API
level. Non-leaf invalidations where we change the page table structure
itself still have to be issued synchronously in order to maintain walk
caches correctly.

To save having to reason about it too much, make sure the invalidation
in arm_lpae_split_blk_unmap() just performs its own unconditional sync
to minimise the window in which we're technically violating the break-
before-make requirement on a live mapping. This might work out redundant
with an outer-level sync for strict unmaps, but we'll never be splitting
blocks on a DMA fastpath anyway.
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
[rm: tweak comment, commit message, split_blk_unmap logic and barriers]
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

2576a534

iommu: Add "iommu.strict" command line option · 2757989e

由 Zhen Lei 提交于 9月 20, 2018

commit 68a6efe86f6a16e25556a2aff40efad41097b486 upstream

Add a generic command line option to enable lazy unmapping via IOVA
flush queues, which will initally be suuported by iommu-dma. This echoes
the semantics of "intel_iommu=strict" (albeit with the opposite default
value), but in the driver-agnostic fashion of "iommu.passthrough".
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
[rm: move handling out of SMMUv3 driver, clean up documentation]
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
[will: dropped broken printk when parsing command-line option]
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

2757989e

iommu/dma: Add support for non-strict mode · c6801dcd

由 Zhen Lei 提交于 9月 20, 2018

commit 2da274cdf998a1c12afa6b5975db2df1df01edf1 upstream

With the flush queue infrastructure already abstracted into IOVA
domains, hooking it up in iommu-dma is pretty simple. Since there is a
degree of dependency on the IOMMU driver knowing what to do to play
along, we key the whole thing off a domain attribute which will be set
on default DMA ops domains to request non-strict invalidation. That way,
drivers can indicate the appropriate support by acknowledging the
attribute, and we can easily fall back to strict invalidation otherwise.

The flush queue callback needs a handle on the iommu_domain which owns
our cookie, so we have to add a pointer back to that, but neatly, that's
also sufficient to indicate whether we're using a flush queue or not,
and thus which way to release IOVAs. The only slight subtlety is
switching __iommu_dma_unmap() from calling iommu_unmap() to explicit
iommu_unmap_fast()/iommu_tlb_sync() so that we can elide the sync
entirely in non-strict mode.
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
[rm: convert to domain attribute, tweak comments and commit message]
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

c6801dcd

iommu/arm-smmu-v3: Implement flush_iotlb_all hook · 73afcbcd

由 Zhen Lei 提交于 9月 20, 2018

commit 07fdef34d2be6811f00c6f9e4e2a1483cf86696c upstream

.flush_iotlb_all is currently stubbed to arm_smmu_iotlb_sync() since the
only time it would ever need to actually do anything is for callers
doing their own explicit batching, e.g.:

	iommu_unmap_fast(domain, ...);
	iommu_unmap_fast(domain, ...);
	iommu_iotlb_flush_all(domain, ...);

where since io-pgtable still issues the TLBI commands implicitly in the
unmap instead of implementing .iotlb_range_add, the "flush" only needs
to ensure completion of those already-in-flight invalidations.

However, we're about to start using it in anger with flush queues, so
let's get a proper implementation wired up.
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
[rm: document why it wasn't a bug]
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

73afcbcd

iommu/arm-smmu-v3: Avoid back-to-back CMD_SYNC operations · 39cf7f88

由 Zhen Lei 提交于 8月 19, 2018

commit 901510ee32f7190902f6fe4affb463e5d86a804c upstream

Putting adjacent CMD_SYNCs into the command queue is nonsensical, but
can happen when multiple CPUs are inserting commands. Rather than leave
the poor old hardware to chew through these operations, we can instead
drop the subsequent SYNCs and poll for completion of the first. This
has been shown to improve IO performance under pressure, where the
number of SYNC operations reduces by about a third:

	CMD_SYNCs reduced:	19542181
	CMD_SYNCs total:	58098548	(include reduced)
	CMDs total:		116197099	(TLBI:SYNC about 1:1)
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

39cf7f88

iommu/arm-smmu-v3: Fix unexpected CMD_SYNC timeout · 8c02158d

由 Zhen Lei 提交于 8月 19, 2018

commit 0f02477d16980938a84aba8688a4e3a303306116 upstream

The condition break condition of:

	(int)(VAL - sync_idx) >= 0

in the __arm_smmu_sync_poll_msi() polling loop requires that sync_idx
must be increased monotonically according to the sequence of the CMDs in
the cmdq.

However, since the msidata is populated using atomic_inc_return_relaxed()
before taking the command-queue spinlock, then the following scenario
can occur:

CPU0			CPU1
msidata=0
			msidata=1
			insert cmd1
insert cmd0
			smmu execute cmd1
smmu execute cmd0
			poll timeout, because msidata=1 is overridden by
			cmd0, that means VAL=0, sync_idx=1.

This is not a functional problem, since the caller will eventually either
timeout or exit due to another CMD_SYNC, however it's clearly not what
the code is supposed to be doing. Fix it, by incrementing the sequence
count with the command-queue lock held, allowing us to drop the atomic
operations altogether.
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
[will: dropped the specialised cmd building routine for now]
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

8c02158d

iommu/io-pgtable-arm: Fix race handling in split_blk_unmap() · 0c9e76ff

由 Robin Murphy 提交于 9月 06, 2018

commit 85c7a0f1ef624ef58173ef52ea77780257bdfe04 upstream

In removing the pagetable-wide lock, we gained the possibility of the
vanishingly unlikely case where we have a race between two concurrent
unmappers splitting the same block entry. The logic to handle this is
fairly straightforward - whoever loses the race frees their partial
next-level table and instead dereferences the winner's newly-installed
entry in order to fall back to a regular unmap, which intentionally
echoes the pre-existing case of recursively splitting a 1GB block down
to 4KB pages by installing a full table of 2MB blocks first.

Unfortunately, the chump who implemented that logic failed to update the
condition check for that fallback, meaning that if said race occurs at
the last level (where the loser's unmap_idx is valid) then the unmap
won't actually happen. Fix that to properly account for both the race
and recursive cases.

Fixes: 2c3d273e ("iommu/io-pgtable-arm: Support lockless operation")
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
[will: re-jig control flow to avoid duplicate cmpxchg test]
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

0c9e76ff

iommu/arm-smmu-v3: Fix a couple of minor comment typos · b5c2f25c

由 John Garry 提交于 8月 17, 2018

commit 657135f3108122556c3cf60a78c6f0e76aeb60e6 commit

Fix some comment typos spotted.
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NZou Cao <zoucao@linux.alibaba.com>
Reviewed-by: NBaoyou Xie <xie.baoyou@linux.alibaba.com>

b5c2f25c

sched/psi: Correct overly pessimistic size calculation · 0801de60

由 Miles Chen 提交于 9月 12, 2019

commit 4adcdcea717cb2d8436bef00dd689aa5bc76f11b upstream.

When passing a equal or more then 32 bytes long string to psi_write(),
psi_write() copies 31 bytes to its buf and overwrites buf[30]
with '\0'. Which makes the input string 1 byte shorter than
it should be.

Fix it by copying sizeof(buf) bytes when nbytes >= sizeof(buf).

This does not cause problems in normal use case like:
"some 500000 10000000" or "full 500000 10000000" because they
are less than 32 bytes in length.

	/* assuming nbytes == 35 */
	char buf[32];

	buf_size = min(nbytes, (sizeof(buf) - 1)); /* buf_size = 31 */
	if (copy_from_user(buf, user_buf, buf_size))
		return -EFAULT;

	buf[buf_size - 1] = '\0'; /* buf[30] = '\0' */

Before:

 %cd /proc/pressure/
 %echo "123456789|123456789|123456789|1234" > memory
 [   22.473497] nbytes=35,buf_size=31
 [   22.473775] 123456789|123456789|123456789| (print 30 chars)
 %sh: write error: Invalid argument

 %echo "123456789|123456789|123456789|1" > memory
 [   64.916162] nbytes=32,buf_size=31
 [   64.916331] 123456789|123456789|123456789| (print 30 chars)
 %sh: write error: Invalid argument

After:

 %cd /proc/pressure/
 %echo "123456789|123456789|123456789|1234" > memory
 [  254.837863] nbytes=35,buf_size=32
 [  254.838541] 123456789|123456789|123456789|1 (print 31 chars)
 %sh: write error: Invalid argument

 %echo "123456789|123456789|123456789|1" > memory
 [ 9965.714935] nbytes=32,buf_size=32
 [ 9965.715096] 123456789|123456789|123456789|1 (print 31 chars)
 %sh: write error: Invalid argument

Also remove the superfluous parentheses.
Signed-off-by: NMiles Chen <miles.chen@mediatek.com>
Cc: <linux-mediatek@lists.infradead.org>
Cc: <wsd_upstream@mediatek.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190912103452.13281-1-miles.chen@mediatek.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

0801de60

sched/psi: Do not require setsched permission from the trigger creator · 86511360

由 Suren Baghdasaryan 提交于 7月 29, 2019

commit 04e048cf09d7b5fc995817cdc5ae1acd4482429c upstream.

When a process creates a new trigger by writing into /proc/pressure/*
files, permissions to write such a file should be used to determine whether
the process is allowed to do so or not. Current implementation would also
require such a process to have setsched capability. Setting of psi trigger
thread's scheduling policy is an implementation detail and should not be
exposed to the user level. Remove the permission check by using _nocheck
version of the function.
Suggested-by: NNick Kralevich <nnk@google.com>
Signed-off-by: NSuren Baghdasaryan <surenb@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: lizefan@huawei.com
Cc: mingo@redhat.com
Cc: akpm@linux-foundation.org
Cc: kernel-team@android.com
Cc: dennisszhou@gmail.com
Cc: dennis@kernel.org
Cc: hannes@cmpxchg.org
Cc: axboe@kernel.dk
Link: https://lkml.kernel.org/r/20190730013310.162367-1-surenb@google.comSigned-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

86511360

sched/psi: Reduce psimon FIFO priority · 97ff1a75

由 Peter Zijlstra 提交于 8月 01, 2019

commit 14f5c7b46a41a595fc61db37f55721714729e59e upstream.

PSI defaults to a FIFO-99 thread, reduce this to FIFO-1.

FIFO-99 is the very highest priority available to SCHED_FIFO and
it not a suitable default; it would indicate the psi work is the
most important work on the machine.

Since Real-Time tasks will have pre-allocated memory and locked it in
place, Real-Time tasks do not care about PSI. All it needs is to be
above OTHER.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Tested-by: NSuren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

97ff1a75

blk-cgroup: turn on psi memstall stuff · 9a8eb1ab

由 Josef Bacik 提交于 7月 09, 2019

commit fd112c74652371a023f85d87b70bee7169e8f4d0 upstream.

With the psi stuff in place we can use the memstall flag to indicate
pressure that happens from throttling.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

9a8eb1ab

zswap: use movable memory if zpool support allocate movable memory · 0a943f50

由 Hui Zhu 提交于 9月 27, 2019

commit d2fcd82bb83aab47c6d63aa8c960cd5edb578065 upstream

This is the third version that was updated according to the comments from
Sergey Senozhatsky https://lkml.org/lkml/2019/5/29/73 and Shakeel Butt
https://lkml.org/lkml/2019/6/4/973

zswap compresses swap pages into a dynamically allocated RAM-based memory
pool.  The memory pool should be zbud, z3fold or zsmalloc.  All of them
will allocate unmovable pages.  It will increase the number of unmovable
page blocks that will bad for anti-fragment.

zsmalloc support page migration if request movable page:
        handle = zs_malloc(zram->mem_pool, comp_len,
                GFP_NOIO | __GFP_HIGHMEM |
                __GFP_MOVABLE);

And commit "zpool: Add malloc_support_movable to zpool_driver" add
zpool_malloc_support_movable check malloc_support_movable to make sure if
a zpool support allocate movable memory.

This commit let zswap allocate block with gfp
__GFP_HIGHMEM | __GFP_MOVABLE if zpool support allocate movable memory.

Following part is test log in a pc that has 8G memory and 2G swap.

Without this commit:
~# echo lz4 > /sys/module/zswap/parameters/compressor
~# echo zsmalloc > /sys/module/zswap/parameters/zpool
~# echo 1 > /sys/module/zswap/parameters/enabled
~# swapon /swapfile
~# cd /home/teawater/kernel/vm-scalability/
/home/teawater/kernel/vm-scalability# export unit_size=$((9 * 1024 * 1024 * 1024))
/home/teawater/kernel/vm-scalability# ./case-anon-w-seq
2717908992 bytes / 4826062 usecs = 549973 KB/s
2717908992 bytes / 4864201 usecs = 545661 KB/s
2717908992 bytes / 4867015 usecs = 545346 KB/s
2717908992 bytes / 4915485 usecs = 539968 KB/s
397853 usecs to free memory
357820 usecs to free memory
421333 usecs to free memory
420454 usecs to free memory
/home/teawater/kernel/vm-scalability# cat /proc/pagetypeinfo
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone      DMA, type    Unmovable      1      1      1      0      2      1      1      0      1      0      0
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      1      3
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type    Unmovable      6      5      8      6      6      5      4      1      1      1      0
Node    0, zone    DMA32, type      Movable     25     20     20     19     22     15     14     11     11      5    767
Node    0, zone    DMA32, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type    Unmovable   4753   5588   5159   4613   3712   2520   1448    594    188     11      0
Node    0, zone   Normal, type      Movable     16      3    457   2648   2143   1435    860    459    223    224    296
Node    0, zone   Normal, type  Reclaimable      0      0     44     38     11      2      0      0      0      0      0
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
Node 0, zone      DMA            1            7            0            0            0            0
Node 0, zone    DMA32            4         1652            0            0            0            0
Node 0, zone   Normal          931         1485           15            0            0            0

With this commit:
~# echo lz4 > /sys/module/zswap/parameters/compressor
~# echo zsmalloc > /sys/module/zswap/parameters/zpool
~# echo 1 > /sys/module/zswap/parameters/enabled
~# swapon /swapfile
~# cd /home/teawater/kernel/vm-scalability/
/home/teawater/kernel/vm-scalability# export unit_size=$((9 * 1024 * 1024 * 1024))
/home/teawater/kernel/vm-scalability# ./case-anon-w-seq
2717908992 bytes / 4689240 usecs = 566020 KB/s
2717908992 bytes / 4760605 usecs = 557535 KB/s
2717908992 bytes / 4803621 usecs = 552543 KB/s
2717908992 bytes / 5069828 usecs = 523530 KB/s
431546 usecs to free memory
383397 usecs to free memory
456454 usecs to free memory
224487 usecs to free memory
/home/teawater/kernel/vm-scalability# cat /proc/pagetypeinfo
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone      DMA, type    Unmovable      1      1      1      0      2      1      1      0      1      0      0
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      1      3
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type    Unmovable     10      8     10      9     10      4      3      2      3      0      0
Node    0, zone    DMA32, type      Movable     18     12     14     16     16     11      9      5      5      6    775
Node    0, zone    DMA32, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      1
Node    0, zone    DMA32, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type    Unmovable   2669   1236    452    118     37     14      4      1      2      3      0
Node    0, zone   Normal, type      Movable   3850   6086   5274   4327   3510   2494   1520    934    438    220    470
Node    0, zone   Normal, type  Reclaimable     56     93    155    124     47     31     17      7      3      0      0
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
Node 0, zone      DMA            1            7            0            0            0            0
Node 0, zone    DMA32            4         1650            2            0            0            0
Node 0, zone   Normal           79         2326           26            0            0            0

You can see that the number of unmovable page blocks is decreased
when the kernel has this commit.

Link: http://lkml.kernel.org/r/20190605100630.13293-2-teawaterz@linux.alibaba.comSigned-off-by: NHui Zhu <teawaterz@linux.alibaba.com>
Reviewed-by: NShakeel Butt <shakeelb@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

0a943f50

zpool: add malloc_support_movable to zpool_driver · e874b6e5

由 Hui Zhu 提交于 9月 27, 2019

commit c165f25d23ecb2f9f121ced20435415b931219e2 upstream

As a zpool_driver, zsmalloc can allocate movable memory because it support
migate pages.  But zbud and z3fold cannot allocate movable memory.

Add malloc_support_movable to zpool_driver.  If a zpool_driver support
allocate movable memory, set it to true.  And add
zpool_malloc_support_movable check malloc_support_movable to make sure if
a zpool support allocate movable memory.

Link: http://lkml.kernel.org/r/20190605100630.13293-1-teawaterz@linux.alibaba.comSigned-off-by: NHui Zhu <teawaterz@linux.alibaba.com>
Reviewed-by: NShakeel Butt <shakeelb@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com>
Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>

e874b6e5

ip_sockglue: Fix missing-check bug in ip_ra_control() · c6a8ac6f

由 Gen Zhang 提交于 5月 24, 2019

commit 425aa0e1d01513437668fa3d4a971168bbaa8515 upstream.

In function ip_ra_control(), the pointer new_ra is allocated a memory
space via kmalloc(). And it is used in the following codes. However,
when  there is a memory allocation error, kmalloc() fails. Thus null
pointer dereference may happen. And it will cause the kernel to crash.
Therefore, we should check the return value and handle the error.
Signed-off-by: NGen Zhang <blackgod016574@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

c6a8ac6f

efi/x86/Add missing error handling to old_memmap 1:1 mapping code · b47d082f

由 Gen Zhang 提交于 5月 25, 2019

commit 4e78921ba4dd0aca1cc89168f45039add4183f8e upstream.

The old_memmap flow in efi_call_phys_prolog() performs numerous memory
allocations, and either does not check for failure at all, or it does
but fails to propagate it back to the caller, which may end up calling
into the firmware with an incomplete 1:1 mapping.

So let's fix this by returning NULL from efi_call_phys_prolog() on
memory allocation failures only, and by handling this condition in the
caller. Also, clean up any half baked sets of page tables that we may
have created before returning with a NULL return value.

Note that any failure at this level will trigger a panic() two levels
up, so none of this makes a huge difference, but it is a nice cleanup
nonetheless.

[ardb: update commit log, add efi_call_phys_epilog() call on error path]
Signed-off-by: NGen Zhang <blackgod016574@gmail.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Bradford <robert.bradford@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20190525112559.7917-2-ard.biesheuvel@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

b47d082f

ipv6_sockglue: Fix a missing-check bug in ip6_ra_control() · 1f3d34ec

由 Gen Zhang 提交于 5月 24, 2019

commit 95baa60a0da80a0143e3ddd4d3725758b4513825 upstream.

In function ip6_ra_control(), the pointer new_ra is allocated a memory
space via kmalloc(). And it is used in the following codes. However,
when there is a memory allocation error, kmalloc() fails. Thus null
pointer dereference may happen. And it will cause the kernel to crash.
Therefore, we should check the return value and handle the error.
Signed-off-by: NGen Zhang <blackgod016574@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

1f3d34ec

scsi: mpt3sas_ctl: fix double-fetch bug in _ctl_ioctl_main() · b9947abb

由 Gen Zhang 提交于 5月 30, 2019

commit f9e3ebeea4521652318af903cddeaf033527e93e upstream.

In _ctl_ioctl_main(), 'ioctl_header' is fetched the first time from
userspace. 'ioctl_header.ioc_number' is then checked. The legal result is
saved to 'ioc'. Then, in condition MPT3COMMAND, the whole struct is fetched
again from the userspace. Then _ctl_do_mpt_command() is called, 'ioc' and
'karg' as inputs.

However, a malicious user can change the 'ioc_number' between the two
fetches, which will cause a potential security issues.  Moreover, a
malicious user can provide a valid 'ioc_number' to pass the check in first
fetch, and then modify it in the second fetch.

To fix this, we need to recheck the 'ioc_number' in the second fetch.
Signed-off-by: NGen Zhang <blackgod016574@gmail.com>
Acked-by: NSuganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

b9947abb

clk-sunxi: fix a missing-check bug in sunxi_divs_clk_setup() · 6731e4fa

由 Gen Zhang 提交于 5月 28, 2019

commit fcdf445ff42f036d22178b49cf64e92d527c1330 upstream.

In sunxi_divs_clk_setup(), 'derived_name' is allocated by kstrndup().
It returns NULL when fails. 'derived_name' should be checked.
Signed-off-by: NGen Zhang <blackgod016574@gmail.com>
Signed-off-by: NMaxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

6731e4fa

powerpc/pseries/dlpar: Fix a missing check in dlpar_parse_cc_property() · 8625d444

由 Gen Zhang 提交于 5月 26, 2019

commit efa9ace68e487ddd29c2b4d6dd23242158f1f607 upstream.

In dlpar_parse_cc_property(), 'prop->name' is allocated by kstrdup().
kstrdup() may return NULL, so it should be checked and handle error.
And prop should be freed if 'prop->name' is NULL.
Signed-off-by: NGen Zhang <blackgod016574@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

8625d444

e1000e: increase pause and refresh time · 7373fb55

由 Miguel Bernal Marin 提交于 3月 27, 2017

commit f74dc880098b4a29f76d756b888fb31d81ad9a0c upstream.
Suggested-by: NTim Pepper <timothy.c.pepper@linux.intel.com>
Signed-off-by: NMiguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Signed-off-by: NPaul Menzel <pmenzel@molgen.mpg.de>
Acked-by: NSasha Neftin <sasha.neftin@intel.com>
Tested-by: NAaron Brown <aaron.f.brown@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

7373fb55

reduce e1000e boot time by tightening sleep ranges · 14898532

由 Arjan van de Ven 提交于 7月 25, 2016

commit ab6973aed6200510662856afce5e3d1e386b7b64 upstream.

The e1000e driver is a great user of the usleep_range() API,
and has any nice ranges that in principle help power management.

However the ranges that are used only during system startup are
very long (and can add easily 100 msec to the boot time) while
the power savings of such long ranges is irrelevant due to the
one-off, boot only, nature of these functions.

This patch shrinks some of the longest ranges to be shorter
(while still using a power friendly 1 msec range); this saves
100msec+ of boot time on my BDW NUCs
Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NPaul Menzel <pmenzel@molgen.mpg.de>
Tested-by: NAaron Brown <aaron.f.brown@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

14898532

psi: get poll_work to run when calling poll syscall next time · 0b57115b

由 Jason Xing 提交于 7月 30, 2019

Only when calling the poll syscall the first time can user
receive POLLPRI correctly. After that, user always fails to
acquire the event signal.

Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event triggered.
3. Kill and restart the process.

The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag.
Signed-off-by: NJason Xing <kerneljasonxing@linux.alibaba.com>
Reviewed-by: NCaspar Zhang <caspar@linux.alibaba.com>
Reviewed-by: NSuren Baghdasaryan <surenb@google.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>

0b57115b

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功