- 06 Jun 2018, 2 commits
-
-
Committed by Konrad Rzeszutek Wilk

The AMD document outlining the SSBD handling, 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf, mentions that if CPUID 8000_0008.EBX[24] is set we should be using the SPEC_CTRL MSR (0x48) over the VIRT_SPEC_CTRL MSR (0xC001_011f) for speculative store bypass disable. This in effect means we should clear the X86_FEATURE_VIRT_SSBD flag so that we would prefer the SPEC_CTRL MSR.

See the document titled 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf; a copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199889

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: kvm@vger.kernel.org
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: andrew.cooper3@citrix.com
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20180601145921.9500-3-konrad.wilk@oracle.com
-
Committed by Konrad Rzeszutek Wilk

The AMD document outlining the SSBD handling, 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf, mentions that CPUID 8000_0008.EBX[26] being set means that the speculative store bypass disable is no longer needed.

A copy of this document is available at: https://bugzilla.kernel.org/show_bug.cgi?id=199889

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: kvm@vger.kernel.org
Cc: andrew.cooper3@citrix.com
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/20180601145921.9500-2-konrad.wilk@oracle.com
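Both commits above key off individual bits in CPUID leaf 0x8000_0008 EBX. As a quick illustration, a small stand-alone user-space program can report which of these bits a given machine advertises; the bit positions below are the ones named in the two commit messages, and the program is only a sketch, not kernel code.

    #include <stdio.h>
    #include <cpuid.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* CPUID leaf 0x80000008; EBX holds the AMD speculation-control bits. */
        if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx)) {
            puts("CPUID leaf 0x80000008 not available");
            return 1;
        }

        /* Bit 24: SSBD is controlled through the architectural SPEC_CTRL MSR (0x48). */
        printf("SSBD via SPEC_CTRL MSR   : %u\n", (ebx >> 24) & 1);
        /* Bit 26: speculative store bypass disable is not needed at all. */
        printf("SSB mitigation not needed: %u\n", (ebx >> 26) & 1);
        return 0;
    }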
-
- 31 May 2018, 9 commits
-
-
Committed by Kan Liang

The counters in the client IMC uncore are free-running counters, not fixed counters. This should be corrected, and the new infrastructure for free-running counters should be applied.

Introduce a new type, SNB_PCI_UNCORE_IMC_DATA, for the client IMC free-running counters. Keep the customized event_init() function to stay compatible with the old event encoding, and clean up the other customized event_*() functions.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-8-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Kan Liang

Some uncores have a customized PMU. A customized PMU does not need to customize everything; for example, the client IMC uncore only needs to customize its init() function, while other functions like add()/del()/start()/stop()/read() can use the generic code.

Expose the uncore_pmu_event_add/del/start/stop() functions.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-7-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Kan Liang

As of Skylake Server, there are a number of free-running counters in each IIO box that collect counts of per-box IO clocks and per-port input/output bandwidth and utilization. The free-running counters cannot be part of the existing IIO box because, quoting Peter Zijlstra: "This will result in some (probably) unexpected scheduling artifacts. Probably the only way to really cure that is to have the free running counters in their own PMU and not share with the GP counters of this box."

So let's add a new PMU for the free-running counters, as suggested. A free-running counter is read-only and always active; counting is suspended only when the IIO box is powered down.

There are three types of IIO free-running counters on Skylake Server: the IO CLOCKS counter, the BANDWIDTH counters and the UTILIZATION counters. The IO CLOCKS counter is a clock of the IIO box. The BANDWIDTH counters count inbound (PCIe->CPU) and outbound (CPU->PCIe) bandwidth. The UTILIZATION counters count input/output utilization. The bit width of the free-running counters is 36 bits.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-6-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Kan Liang

There are a number of free-running counters introduced for uncore which provide highly valuable information to a wide array of customers. However, the generic uncore code doesn't support them yet.

The free-running counters will be specially handled based on their unique attributes:
- They are read-only. They cannot be enabled/disabled.
- The event and the counter are always 1:1 mapped. They do not need to be assigned or tracked by event_list.
- They are always active, so there is no need to check their availability.
- They have different bit widths.

Also, use inline helpers to replace the checks for fixed counters and free-running counters.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-5-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Kan Liang

There are a number of free-running counters introduced for uncore which provide highly valuable information to a wide array of customers. For example, Skylake Server has IIO free-running counters to collect input/output bandwidth and utilization. There is NO event available on the general purpose counters that does exactly the same thing as the free-running counters. The generic uncore code needs to be enhanced to support the new counters.

In the uncore document there is no event-code assigned to free-running counters, so some events need to be defined to indicate the free-running counters. The events are encoded as event-code + umask-code.

The event-code for all free-running counters is 0xff, which is the same as for the fixed counters:
- It has not been decided what code will be used for common events on future platforms. 0xff is the only one which will definitely not be used as any common event-code.
- The current events on the general purpose counters cannot be re-used, because there is NO event available that does exactly the same thing as the free-running counters.
- Even in the existing code, the fixed counters for the core, which share the same event-code, may count different things. Hence, it should not surprise users if the free-running counters that share the same event-code also count different things.

The umask is used to distinguish the counters. The umask-code distinguishes a fixed counter from a free-running counter, and different types of free-running counters from each other.

For fixed counters, the umask-code is 0x0X, where X indicates the index of the fixed counter, starting from 0:
- This is compatible with the old event encoding.
- Currently, there is only one fixed counter. There are still 15 reserved spaces for extension.

For free-running counters, the umask-code uses the rest of the space and follows the format 0xXY:
- X stands for the type of free-running counter, starting from 1.
- Y stands for the index of a free-running counter of the same type, starting from 0.
- The free-running counters do different things. They can be categorized into several types according to MSR location, bit width and definition. E.g. there are three types of IIO free-running counters on Skylake Server to monitor IO CLOCKS, BANDWIDTH and UTILIZATION on different ports. This makes it easy to locate the free-running counter of a specific type.
- So far, there are at most 8 counters of each type. There are still 8 reserved spaces for extension.

Introduce a new index to indicate the free-running counters. Only one index is needed for all free-running counters: because the free-running counters are always active and the event and the free-running counter are always 1:1 mapped, no extra index is needed to indicate the assigned counter.

Introduce a new data structure to store free-running-counter related information for each type. It includes the number of counters, bit width, base address, offset between counters and offset between boxes.

Introduce several inline helpers to check the index for fixed and free-running counters, validate free-running counter events, and retrieve the free-running counter information according to box and event.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-4-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
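Because the 0xff / 0xXY layout above is easiest to see in code, here is a small stand-alone sketch of decoding such an encoding. The macro and function names are illustrative rather than the kernel's own; only the layout itself (event-code 0xff, high umask nibble = counter type starting at 1, low nibble = index starting at 0) is taken from the commit message, and the decoded example umasks are hypothetical.

    #include <stdio.h>

    #define UNCORE_FIXED_EVENT   0xff          /* event-code shared by fixed and free-running */
    #define FREERUNNING_TYPE(u)  ((u) >> 4)    /* X: counter type, 0 = fixed, 1.. = free-running */
    #define FREERUNNING_IDX(u)   ((u) & 0xf)   /* Y: index within the type, starting at 0 */

    static void decode(unsigned int event, unsigned int umask)
    {
        if (event != UNCORE_FIXED_EVENT)
            printf("event 0x%02x umask 0x%02x: general purpose event\n", event, umask);
        else if (FREERUNNING_TYPE(umask) == 0)
            printf("event 0xff umask 0x%02x: fixed counter %u\n", umask, FREERUNNING_IDX(umask));
        else
            printf("event 0xff umask 0x%02x: free-running type %u, counter %u\n",
                   umask, FREERUNNING_TYPE(umask), FREERUNNING_IDX(umask));
    }

    int main(void)
    {
        decode(0xff, 0x00);   /* the existing fixed counter 0 */
        decode(0xff, 0x10);   /* hypothetical: free-running type 1, counter 0 */
        decode(0xff, 0x21);   /* hypothetical: free-running type 2, counter 1 */
        return 0;
    }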
-
Committed by Kan Liang

There is no index which is bigger than UNCORE_PMC_IDX_FIXED. The only exception is the client IMC uncore, which is specially handled. For the generic code, it is not correct to use >= to check for the fixed counter. This code quality issue will cause problems when a new counter index is introduced.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Kan Liang

For Nehalem and Westmere, there is only one fixed counter for the W-Box. There is no index which is bigger than UNCORE_PMC_IDX_FIXED, so it is not correct to use >= to check for the fixed counter. This code quality issue will cause problems when a new counter index is introduced.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
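The two commits above tighten the same pattern; the toy program below (with made-up index values) shows why a '>=' test misclassifies any index added above the fixed one, which is exactly the situation the free-running counter work introduces.

    #include <stdio.h>

    /* Simplified index space mirroring the situation described above. */
    enum {
        UNCORE_PMC_IDX_FIXED = 4,        /* the single fixed counter */
        UNCORE_PMC_IDX_FREERUNNING,      /* a later, larger index */
    };

    static const char *classify_old(int idx)   /* old check: >= */
    {
        return idx >= UNCORE_PMC_IDX_FIXED ? "fixed" : "generic";
    }

    static const char *classify_new(int idx)   /* corrected check: == */
    {
        if (idx == UNCORE_PMC_IDX_FIXED)
            return "fixed";
        return idx == UNCORE_PMC_IDX_FREERUNNING ? "free-running" : "generic";
    }

    int main(void)
    {
        int idx = UNCORE_PMC_IDX_FREERUNNING;

        /* the old test silently treats the new index as the fixed counter */
        printf(">= check: %s, == check: %s\n", classify_old(idx), classify_new(idx));
        return 0;
    }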
-
Committed by Kan Liang

There are two free-running counters for the client IMC uncore. The customized event_init() function hard-codes their index to 'UNCORE_PMC_IDX_FIXED' and 'UNCORE_PMC_IDX_FIXED + 1'. To support the index 'UNCORE_PMC_IDX_FIXED + 1', the generic uncore_perf_event_update() is obscurely hacked. This code quality issue will cause problems when a new counter index is introduced into the generic code, for example a new index for free-running counters.

Introduce a customized event_read() function for the client IMC uncore. The customized function is copied from the previous generic uncore_pmu_event_read(). The index 'UNCORE_PMC_IDX_FIXED + 1' is thereby isolated to the client IMC uncore only.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Finn Thain
This avoids a WARNING splat when loading the macsonic or macmace driver. Please see commit 205e1b7f ("dma-mapping: warn when there is no coherent_dma_mask").

This implementation of arch_setup_pdev_archdata() differs from the powerpc one, in that this one avoids clobbering a device dma mask which has already been initialized.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
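A rough sketch of what such an arch_setup_pdev_archdata() can look like follows; it is reconstructed from the description above (only set a default mask when nothing has initialized one yet) and is not a verbatim copy of the m68k patch.

    void arch_setup_pdev_archdata(struct platform_device *pdev)
    {
        /* Give platform devices a sane default DMA setup, but do not
         * clobber masks that a board file already initialized. */
        if (pdev->dev.coherent_dma_mask == DMA_MASK_NONE &&
            pdev->dev.dma_mask == NULL) {
            pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
            pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
        }
    }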
-
- 29 May 2018, 4 commits
-
-
Committed by Eric W. Biederman

Geert Uytterhoeven <geert@linux-m68k.org> reported:

>   HOSTLD  scripts/mod/modpost
>   CC      arch/sh/kernel/traps_32.o
> arch/sh/kernel/traps_32.c: In function 'do_divide_error':
> arch/sh/kernel/traps_32.c:606:17: error: 'code' may be used uninitialized in this function [-Werror=uninitialized]
> cc1: all warnings being treated as errors

It is clear from inspection that do_divide_error is only called with TRAP_DIVZERO_ERROR or TRAP_DIVOVF_ERROR, as that is the way set_exception_table_vec is called. So let gcc know the other cases should not be considered, by returning in all other cases. This removes the warning and lets the code continue to build.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: c65626c0 ("signal/sh: Use force_sig_fault where appropriate")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
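The shape of the fix reads roughly like the sketch below (simplified and reconstructed from the description, not the literal diff): an exhaustive switch leaves gcc no path on which 'code' could be read uninitialized.

    switch (vector) {
    case TRAP_DIVZERO_ERROR:
        code = FPE_INTDIV;
        break;
    case TRAP_DIVOVF_ERROR:
        code = FPE_INTOVF;
        break;
    default:
        /* Never reached: set_exception_table_vec() only registers the two
         * vectors above. Returning here tells gcc that 'code' is always
         * initialized when it is finally used. */
        return;
    }
    /* ... then raise SIGFPE with 'code' via force_sig_fault() as before ... */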
-
Committed by Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greentime Hu <greentime@andestech.com>
Tested-by: Greentime Hu <greentime@andestech.com>
-
Committed by Christoph Hellwig

This matches the implementation of the more commonly used unmap_single routines and the sync_sg_for_cpu method, which should provide equivalent cache maintenance.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greentime Hu <greentime@andestech.com>
Tested-by: Greentime Hu <greentime@andestech.com>
-
Committed by Christoph Hellwig

Make sure all other DMA methods call nds32_dma_sync_single_for_{device,cpu} to perform cache maintenance, and remove the consistent_sync helper that implemented both with entirely separate code based off an argument. Also make sure these helpers handle highmem properly, for which code is copied from mips.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greentime Hu <greentime@andestech.com>
Tested-by: Greentime Hu <greentime@andestech.com>
-
- 28 May 2018, 3 commits
-
-
Committed by Christoph Hellwig

Instead of globally disabling > 32-bit DMA using the arch_dma_supported hook, walk the PCI bus under the actually affected bridge and mark every device with the dma_32bit_limit flag. This also gets rid of the arch_dma_supported hook entirely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
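A hedged sketch of the per-device approach described above follows; pci_walk_bus() and the dma_32bit_limit flag are named in the commit message, while the quirk and callback names here are made up for illustration.

    /* Flag every device below the affected bridge instead of rejecting
     * >32-bit DMA masks machine-wide. */
    static int limit_32bit_dma_cb(struct pci_dev *pdev, void *data)
    {
        pdev->dev.dma_32bit_limit = true;
        return 0;
    }

    static void quirk_limit_32bit_dma(struct pci_dev *bridge)
    {
        if (bridge->subordinate)
            pci_walk_bus(bridge->subordinate, limit_32bit_dma_cb, NULL);
    }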
-
Committed by Christoph Hellwig
This is something drivers should decide (modulo chipset quirks like for VIA), which as far as I can tell is how things have been handled for the last 15 years. Note that we keep the usedac option for now, as it is used in the wild to override the too generic VIA quirk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Christoph Hellwig
Limiting the dma mask to avoid PCI (pre-PCIe) DAC cycles while paying the huge overhead of an IOMMU is rather pointless, and this seriously gets in the way of dma mapping work.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
- 27 May 2018, 2 commits
-
-
Committed by Linus Walleij
I used bad names in my clumsiness when rewriting many board files to use GPIO descriptors instead of platform data. A few had the platform_device ID set to -1 which would indeed give the device name "i2c-gpio". But several had it set to >=0 which gives the names "i2c-gpio.0", "i2c-gpio.1" ... Fix the offending instances in the ARM tree. Sorry for the mess.

Fixes: b2e63555 ("i2c: gpio: Convert to use descriptors")
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Simon Guinot <simon.guinot@sequanux.org>
Reported-by: Simon Guinot <simon.guinot@sequanux.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
-
Committed by John Stultz
This patch is a partial revert of commit abd7d097 ("arm64: dts: hikey: Enable HS200 mode on eMMC") which has been causing eMMC corruption on my HiKey board.

Symptoms usually looked like:

  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  ...
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  mmc0: new HS200 MMC card at address 0001
  ...
  dwmmc_k3 f723d000.dwmmc0: Unexpected command timeout, state 3
  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  print_req_error: I/O error, dev mmcblk0, sector 8810504
  Aborting journal on device mmcblk0p10-8.
  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
  mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0)
  EXT4-fs error (device mmcblk0p10): ext4_journal_check_start:61: Detected aborted journal
  EXT4-fs (mmcblk0p10): Remounting filesystem read-only

And quite often this would result in a disk that wouldn't properly boot even with older kernels. It seems the max-frequency property added by the above patch is causing the problem, so remove it.

Cc: Ryan Grachek <ryan@edited.us>
Cc: Wei Xu <xuwei5@hisilicon.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: YongQin Liu <yongqin.liu@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Wei Xu <xuwei04@gmail.com>
-
- 26 May 2018, 1 commit
-
-
Committed by Radim Krčmář
If the hypercall was called from userspace or real mode, KVM injects #UD and then advances RIP, so it looks like #UD was caused by the following instruction. This probably won't cause more than confusion, but could give an unexpected access to guest OS' instruction emulator.

Also, refactor the code to count hv hypercalls that were handled by the virt userspace.

Fixes: 6356ee0c ("x86: Delay skip of emulated hypercall instruction")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
- 25 May 2018, 5 commits
-
-
Committed by Huaisheng Ye
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Alexey Budankov
Store the user space frame-pointer value (BP register) into the perf trace on a sample for a process, so the value becomes available when unwinding call stacks for functions gaining event samples.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/311d4a34-f81b-5535-3385-01427ac73b41@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Song Liu
As Miklos reported and suggested:

  "This pattern repeats two times in trace_uprobe.c and in kernel/events/core.c as well:

      ret = kern_path(filename, LOOKUP_FOLLOW, &path);
      if (ret)
          goto fail_address_parse;

      inode = igrab(d_inode(path.dentry));
      path_put(&path);

  And it's wrong. You can only hold a reference to the inode if you have an active ref to the superblock as well (which is normally through path.mnt) or holding s_umount.

  This way unmounting the containing filesystem while the tracepoint is active will give you the "VFS: Busy inodes after unmount..." message and a crash when the inode is finally put.

  Solution: store path instead of inode."

This patch fixes the issue in kernel/events/core.c.

Reviewed-and-tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 375637bc ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20180418062907.3210386-2-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
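A minimal sketch of the safer pattern Miklos suggests (the 'filter' structure and field names are illustrative): hold on to the whole struct path, which pins the mount and thus the superblock, instead of grabbing the inode and dropping the path immediately.

    struct path path;
    int ret;

    ret = kern_path(filename, LOOKUP_FOLLOW, &path);
    if (ret)
        goto fail_address_parse;

    filter->path = path;   /* keep the reference on the vfsmount and dentry */
    /* ... use d_inode(filter->path.dentry) while the filter is active ... */
    /* ... and only path_put(&filter->path) when the filter is freed ... */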
-
Committed by Joonsoo Kim

This reverts the following commits that change the CMA design in MM:

3d2054ad ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")
1d47a3ec ("mm/cma: remove ALLOC_CMA")
bad8c6c0 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")

Ville reported the following error on i386:

  Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
  microcode: microcode updated early to revision 0x4, date = 2013-06-28
  Initializing CPU#0
  Initializing HighMem for node 0 (000377fe:00118000)
  Initializing Movable for node 0 (00000001:00118000)
  BUG: Bad page state in process swapper pfn:377fe
  page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
  flags: 0x80000000()
  raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
  page dumped because: nonzero mapcount
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
  Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
  Call Trace:
   dump_stack+0x60/0x96
   bad_page+0x9a/0x100
   free_pages_check_bad+0x3f/0x60
   free_pcppages_bulk+0x29d/0x5b0
   free_unref_page_commit+0x84/0xb0
   free_unref_page+0x3e/0x70
   __free_pages+0x1d/0x20
   free_highmem_page+0x19/0x40
   add_highpages_with_active_regions+0xab/0xeb
   set_highmem_pages_init+0x66/0x73
   mem_init+0x1b/0x1d7
   start_kernel+0x17a/0x363
   i386_start_kernel+0x95/0x99
   startup_32_smp+0x164/0x168

The reason for this error is that the span of MOVABLE_ZONE is extended to the whole node span for future CMA initialization, and normal memory is wrongly freed here. I submitted a fix and it seems to work, but then another problem happened. It is too late in the cycle to fix that later problem, so I decided to revert the series.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Jim Mattson
If there is a possibility that a VM may migrate to a Skylake host, then the hypervisor should report IA32_ARCH_CAPABILITIES.RSBA[bit 2] as being set (future work, of course). This implies that CPUID.(EAX=7,ECX=0):EDX.ARCH_CAPABILITIES[bit 29] should be set. Therefore, kvm should report this CPUID bit as being supported whether or not the host supports it. Userspace is still free to clear the bit if it chooses.

For more information on RSBA, see Intel's white paper, "Retpoline: A Branch Target Injection Mitigation" (Document Number 337131-001), currently available at https://bugzilla.kernel.org/show_bug.cgi?id=199511.

Since the IA32_ARCH_CAPABILITIES MSR is emulated in kvm, there is no dependency on hardware support for this feature.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fixes: 28c1c9fa ("KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES")
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
- 24 May 2018, 8 commits
-
-
Committed by Wei Huang

The CPUID bits of OSXSAVE (function=0x1) and OSPKE (func=0x7, leaf=0x0) allow user apps to detect whether the OS has set CR4.OSXSAVE or CR4.PKE. KVM is supposed to update these CPUID bits when CR4 is updated, but the current KVM code doesn't handle some special cases when updates come from the emulator. Here is one example:

Step 1: guest boots
Step 2: guest OS enables XSAVE ==> CR4.OSXSAVE=1 and CPUID.OSXSAVE=1
Step 3: guest hot reboot ==> QEMU resets CR4 to 0, but CPUID.OSXSAVE==1
Step 4: guest OS checks CPUID.OSXSAVE, detects 1, then executes xgetbv

Step 4 above will cause an #UD and a guest crash because the guest OS hasn't turned on OSXSAVE yet. This patch solves the problem by comparing old_cr4 with cr4: if the related bits have been changed, kvm_update_cpuid() needs to be called.

Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Bandan Das <bsd@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
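A minimal sketch of the comparison described above (the helper name and the exact set of bits are assumptions based on the commit text):

    /* Recompute guest-visible CPUID bits only when one of the CR4 bits
     * they mirror has actually changed value. */
    if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
        kvm_update_cpuid(vcpu);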
-
Committed by David Vrabel

Since 4.10, commit 8003c9ae ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support"), guests using periodic LAPIC timers (such as FreeBSD 8.4) would see their timers drift significantly over time.

Differences in the underlying clocks and numerical errors mean the periods of the two timers (hv and sw) are not the same. This difference will accumulate with every expiry, resulting in a large error between the hv and sw timer. This means the sw timer may be running slow when compared to the hv timer.

When the timer is switched from hv to sw, the now active sw timer will expire late. The guest VCPU is reentered and it switches to using the hv timer. This timer catches up, injecting multiple IRQs into the guest (of which the guest only sees one as it does not get to run until the hv timer has caught up), and thus the guest's timer rate is low (and becomes increasingly slower over time as the sw timer lags further and further behind).

I believe a similar problem would occur if the hv timer is the slower one, but I have not observed this.

Fix this by synchronizing the deadlines for both timers to the same time source on every tick. This prevents the errors from accumulating.

Fixes: 8003c9ae
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Committed by Maciej W. Rozycki
Use 64-bit accesses for 64-bit floating-point general registers with PTRACE_PEEKUSR, removing the truncation of their upper halves in the FR=1 mode, caused by commit bbd426f5 ("MIPS: Simplify FP context access"), which inadvertently switched them to using 32-bit accesses. The PTRACE_POKEUSR side is fine as it's never been broken and continues using 64-bit accesses.

Fixes: bbd426f5 ("MIPS: Simplify FP context access")
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Patchwork: https://patchwork.linux-mips.org/patch/19334/
Signed-off-by: James Hogan <jhogan@kernel.org>
-
Committed by Maciej W. Rozycki

Having PR_FP_MODE_FRE (i.e. Config5.FRE) set without PR_FP_MODE_FR (i.e. Status.FR) is not supported, as the lone purpose of Config5.FRE is to emulate Status.FR=0 handling on FPU hardware that has Status.FR=1 hardwired[1][2]. Also we do not handle this case elsewhere, and assume throughout our code that TIF_HYBRID_FPREGS and TIF_32BIT_FPREGS cannot both be set at once for a task, leading to inconsistent behaviour if this does happen.

Return unsuccessfully then from prctl(2) PR_SET_FP_MODE calls requesting PR_FP_MODE_FRE to be set with PR_FP_MODE_FR clear. This corresponds to modes allowed by `mips_set_personality_fp'.

References:

[1] "MIPS Architecture For Programmers, Vol. III: MIPS32 / microMIPS32 Privileged Resource Architecture", Imagination Technologies, Document Number: MD00090, Revision 6.02, July 10, 2015, Table 9.69 "Config5 Register Field Descriptions", p. 262

[2] "MIPS Architecture For Programmers, Volume III: MIPS64 / microMIPS64 Privileged Resource Architecture", Imagination Technologies, Document Number: MD00091, Revision 6.03, December 22, 2015, Table 9.72 "Config5 Register Field Descriptions", p. 288

Fixes: 9791554b ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS")
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.0+
Patchwork: https://patchwork.linux-mips.org/patch/19327/
Signed-off-by: James Hogan <jhogan@kernel.org>
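A minimal sketch of the validation described above; where exactly it sits in the prctl() handler and the precise error code are assumptions.

    /* Config5.FRE only exists to emulate Status.FR=0 on FR=1-only FPUs,
     * so an FRE request without FR makes no sense: reject it. */
    if ((value & PR_FP_MODE_FRE) && !(value & PR_FP_MODE_FR))
        return -EOPNOTSUPP;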
-
Committed by Ludovic Barre

This patch adds support of external interrupts for gpio[a..k] and gpioz.

Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Committed by Ludovic Barre
This patch adds external interrupt (exti) support on the stm32mp157c SoC.

Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Committed by Laura Abbott

Commit 15122ee2 ("arm64: Enforce BBM for huge IO/VMAP mappings") disallowed block mappings for ioremap since that code does not honor break-before-make. The same APIs are also used for permission updating though, and the extra checks prevent the permission updates from happening even though this should be permitted. This results in read-only permissions not being fully applied. Visibly, this can occasionally be seen as a failure on the built-in rodata test when the test data ends up in a section, or as an odd RW gap on the page table dump.

Fix this by using pgattr_change_is_safe instead of p*d_present for determining if the change is permitted.

Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Reported-by: Peter Robinson <pbrobinson@gmail.com>
Fixes: 15122ee2 ("arm64: Enforce BBM for huge IO/VMAP mappings")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Michael Schmitz

If 020/030 support is enabled, get_io_area() leaves an IO_SIZE gap between mappings which is added to the vm_struct representing the mapping. __ioremap() uses the actual requested size (after alignment), while __iounmap() is passed the size from the vm_struct.

On 020/030, early termination descriptors are used to set up mappings of extent 'size', which are validated on unmapping. The unmapped gap of size IO_SIZE defeats the sanity check of the pmd tables, causing __iounmap() to loop forever on 030. On 040/060, unmapping of page table entries does not check for a valid mapping, so the unmapping loop always completes there.

Adjust the size to be unmapped by the gap that had been added to the vm_struct earlier.

This fixes the hang in atari_platform_init() reported a long time ago, and a similar one reported by Finn recently (addressed by removing ioremap() use from the SWIM driver).

Tested on my Falcon in 030 mode - untested but should work the same on 040/060 (the extra page tables cleared there would never have been set up anyway).

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
[geert: Minor commit description improvements]
[geert: This was fixed in 2.4.23, but not in 2.5.x]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
-
- 23 May 2018, 6 commits
-
-
Committed by Dominik Brodowski

Only CPUs which speculate can speculate. Therefore, it seems prudent to test for cpu_no_speculation first and only then determine whether a specific speculating CPU is susceptible to store bypass speculation. This is underlined by the fact that all CPUs currently listed in cpu_no_speculation were present in cpu_no_spec_store_bypass as well.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: konrad.wilk@oracle.com
Link: https://lkml.kernel.org/r/20180522090539.GA24668@light.dominikbrodowski.net
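A hedged sketch of the ordering described above; the table and helper names follow the commit message, but the surrounding logic is heavily simplified.

    static void __init set_speculation_bug_bits(struct cpuinfo_x86 *c)
    {
        /* CPUs that do not speculate at all cannot be affected by
         * speculative store bypass either, so bail out first. */
        if (x86_match_cpu(cpu_no_speculation))
            return;

        if (!x86_match_cpu(cpu_no_spec_store_bypass))
            setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);

        /* ... other speculation-related bug bits ... */
    }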
-
Committed by Konrad Rzeszutek Wilk

X86_FEATURE_SSBD is a synthetic CPU feature - that is, its bit location has no relevance to the real CPUID 0x7.EDX[31] bit position. For that we need the new CPU feature name.

Fixes: 52817587 ("x86/cpufeatures: Disentangle SSBD enumeration")
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20180521215449.26423-2-konrad.wilk@oracle.com
-
Committed by Vincent Chen

A compiler warning, -Wstringop-overflow, is triggered in arch/nds32/kernel/vdso.c when the kernel is built with gcc-8. Declare vdso_start and vdso_end as pointers to fix this compiler warning.

Signed-off-by: Vincent Chen <vincentc@andestech.com>
Reviewed-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
-
Committed by Vincent Chen

In order to ensure that all data in the source page has been written back to memory before copy_page, the local irq shall be disabled before calling cpu_dcache_wb_page(). In addition, remove the unneeded page invalidation for the 'to' page.

Signed-off-by: Vincent Chen <vincentc@andestech.com>
Reviewed-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
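A minimal sketch of the ordering described above (simplified; the real copy routine also deals with kmap and cache aliasing):

    unsigned long flags;

    /* Write back the source page with interrupts off so no new dirty
     * lines can appear between the write-back and the copy. */
    local_irq_save(flags);
    cpu_dcache_wb_page((unsigned long)vfrom);
    copy_page(vto, vfrom);
    local_irq_restore(flags);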
-
Committed by Vincent Chen
According to Documentation/cachetlb.txt, the cache of the page at vmaddr shall be flushed in flush_anon_page instead of the cache of the page at page_address(page).

Signed-off-by: Vincent Chen <vincentc@andestech.com>
Reviewed-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
-
Committed by Vincent Chen

1. Disable the local irq before the d-cache write-back and invalidate.

The cpu_dcache_wbinval_page function is composed of a d-cache write-back and an invalidate. If the local irq is enabled when calling cpu_dcache_wbinval_page, the content of the d-cache may be updated between the write-back and the invalidate; in that case the updated data will be dropped by the following d-cache invalidation. Therefore, disable the local irq before calling cpu_dcache_wbinval_page.

2. Correct the data write-back for the page aliasing case.

Only a page whose (page->index << PAGE_SHIFT) is located at the same page color as page_address(page) needs to execute the data write-back in the flush_dcache_page function.

Signed-off-by: Vincent Chen <vincentc@andestech.com>
Reviewed-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
-