提交 · c1f90eb019714c51f191f2c1abbfda7ebfaf7c35 · openanolis / cloud-kernel

10 1月, 2019 40 次提交

btrfs: dev-replace: go back to suspend state if another EXCL_OP is running · c1f90eb0

由 Anand Jain 提交于 11月 11, 2018

commit 05c49e6bc1e8866ecfd674ebeeb58cdbff9145c2 upstream.

In a secnario where balance and replace co-exists as below,

  - start balance
  - pause balance
  - start replace
  - reboot

and when system restarts, balance resumes first. Then the replace is
attempted to restart but will fail as the EXCL_OP lock is already held
by the balance. If so place the replace state back to
BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state.

Fixes: 010a47bd ("btrfs: add proper safety check before resuming dev-replace")
CC: stable@vger.kernel.org # 4.18+
Signed-off-by: NAnand Jain <anand.jain@oracle.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c1f90eb0

btrfs: dev-replace: go back to suspended state if target device is missing · 28867a52

由 Anand Jain 提交于 11月 11, 2018

commit 0d228ece59a35a9b9e8ff0d40653234a6d90f61e upstream.

At the time of forced unmount we place the running replace to
BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state, so when the system comes
back and expect the target device is missing.

Then let the replace state continue to be in
BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state instead of
BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED as there isn't any matching scrub
running as part of replace.

Fixes: e93c89c1 ("Btrfs: add new sources for device replace code")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: NAnand Jain <anand.jain@oracle.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

28867a52

cdc-acm: fix abnormal DATA RX issue for Mediatek Preloader. · 326ca6bd

由 Macpaul Lin 提交于 12月 19, 2018

commit eafb27fa5283599ce6c5492ea18cf636a28222bb upstream.

Mediatek Preloader is a proprietary embedded boot loader for loading
Little Kernel and Linux into device DRAM.

This boot loader also handle firmware update. Mediatek Preloader will be
enumerated as a virtual COM port when the device is connected to Windows
or Linux OS via CDC-ACM class driver. When the USB enumeration has been
done, Mediatek Preloader will send out handshake command "READY" to PC
actively instead of waiting command from the download tool.

Since Linux 4.12, the commit "tty: reset termios state on device
registration" (93857edd) causes Mediatek
Preloader receiving some abnoraml command like "READYXX" as it sent.
This will be recognized as an incorrect response. The behavior change
also causes the download handshake fail. This change only affects
subsequent connects if the reconnected device happens to get the same minor
number.

By disabling the ECHO termios flag could avoid this problem. However, it
cannot be done by user space configuration when download tool open
/dev/ttyACM0. This is because the device running Mediatek Preloader will
send handshake command "READY" immediately once the CDC-ACM driver is
ready.

This patch wants to fix above problem by introducing "DISABLE_ECHO"
property in driver_info. When Mediatek Preloader is connected, the
CDC-ACM driver could disable ECHO flag in termios to avoid the problem.
Signed-off-by: NMacpaul Lin <macpaul.lin@mediatek.com>
Cc: stable@vger.kernel.org
Reviewed-by: NJohan Hovold <johan@kernel.org>
Acked-by: NOliver Neukum <oneukum@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

326ca6bd

cgroup: fix CSS_TASK_ITER_PROCS · 8a2fbdd5

由 Tejun Heo 提交于 11月 08, 2018

commit e9d81a1bc2c48ea9782e3e8b53875f419766ef47 upstream.

CSS_TASK_ITER_PROCS implements process-only iteration by making
css_task_iter_advance() skip tasks which aren't threadgroup leaders;
however, when an iteration is started css_task_iter_start() calls the
inner helper function css_task_iter_advance_css_set() instead of
css_task_iter_advance().  As the helper doesn't have the skip logic,
when the first task to visit is a non-leader thread, it doesn't get
skipped correctly as shown in the following example.

  # ps -L 2030
    PID   LWP TTY      STAT   TIME COMMAND
   2030  2030 pts/0    Sl+    0:00 ./test-thread
   2030  2031 pts/0    Sl+    0:00 ./test-thread
  # mkdir -p /sys/fs/cgroup/x/a/b
  # echo threaded > /sys/fs/cgroup/x/a/cgroup.type
  # echo threaded > /sys/fs/cgroup/x/a/b/cgroup.type
  # echo 2030 > /sys/fs/cgroup/x/a/cgroup.procs
  # cat /sys/fs/cgroup/x/a/cgroup.threads
  2030
  2031
  # cat /sys/fs/cgroup/x/cgroup.procs
  2030
  # echo 2030 > /sys/fs/cgroup/x/a/b/cgroup.threads
  # cat /sys/fs/cgroup/x/cgroup.procs
  2031
  2030

The last read of cgroup.procs is incorrectly showing non-leader 2031
in cgroup.procs output.

This can be fixed by updating css_task_iter_advance() to handle the
first advance and css_task_iters_tart() to call
css_task_iter_advance() instead of the inner helper.  After the fix,
the same commands result in the following (correct) result:

  # ps -L 2062
    PID   LWP TTY      STAT   TIME COMMAND
   2062  2062 pts/0    Sl+    0:00 ./test-thread
   2062  2063 pts/0    Sl+    0:00 ./test-thread
  # mkdir -p /sys/fs/cgroup/x/a/b
  # echo threaded > /sys/fs/cgroup/x/a/cgroup.type
  # echo threaded > /sys/fs/cgroup/x/a/b/cgroup.type
  # echo 2062 > /sys/fs/cgroup/x/a/cgroup.procs
  # cat /sys/fs/cgroup/x/a/cgroup.threads
  2062
  2063
  # cat /sys/fs/cgroup/x/cgroup.procs
  2062
  # echo 2062 > /sys/fs/cgroup/x/a/b/cgroup.threads
  # cat /sys/fs/cgroup/x/cgroup.procs
  2062
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: N"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Fixes: 8cfd8147 ("cgroup: implement cgroup v2 thread support")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

8a2fbdd5

crypto: cfb - fix decryption · 99dcd45f

由 Dmitry Eremin-Solenikov 提交于 10月 20, 2018

commit fa4600734b74f74d9169c3015946d4722f8bcf79 upstream.

crypto_cfb_decrypt_segment() incorrectly XOR'ed generated keystream with
IV, rather than with data stream, resulting in incorrect decryption.
Test vectors will be added in the next patch.
Signed-off-by: NDmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

99dcd45f

crypto: testmgr - add AES-CFB tests · d8e4b24f

由 Dmitry Eremin-Solenikov 提交于 10月 20, 2018

commit 7da66670775d201f633577f5b15a4bbeebaaa2b0 upstream.

Add AES128/192/256-CFB testvectors from NIST SP800-38A.
Signed-off-by: NDmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: NDmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

d8e4b24f

crypto: chcr - small packet Tx stalls the queue · cc43a8af

由 Atul Gupta 提交于 11月 30, 2018

commit c35828ea906a7c76632a0211e59c392903cd4615 upstream.

Immediate packets sent to hardware should include the work
request length in calculating the flits. WR occupy one flit and
if not accounted result in invalid request which stalls the HW
queue.

Cc: stable@vger.kernel.org
Signed-off-by: NAtul Gupta <atul.gupta@chelsio.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

cc43a8af

crypto: cavium/nitrox - fix a DMA pool free failure · 0fa6bead

由 Wenwen Wang 提交于 10月 18, 2018

commit 7172122be6a4712d699da4d261f92aa5ab3a78b8 upstream.

In crypto_alloc_context(), a DMA pool is allocated through dma_pool_alloc()
to hold the crypto context. The meta data of the DMA pool, including the
pool used for the allocation 'ndev->ctx_pool' and the base address of the
DMA pool used by the device 'dma', are then stored to the beginning of the
pool. These meta data are eventually used in crypto_free_context() to free
the DMA pool through dma_pool_free(). However, given that the DMA pool can
also be accessed by the device, a malicious device can modify these meta
data, especially when the device is controlled to deploy an attack. This
can cause an unexpected DMA pool free failure.

To avoid the above issue, this patch introduces a new structure
crypto_ctx_hdr and a new field chdr in the structure nitrox_crypto_ctx hold
the meta data information of the DMA pool after the allocation. Note that
the original structure ctx_hdr is not changed to ensure the compatibility.

Cc: <stable@vger.kernel.org>
Signed-off-by: NWenwen Wang <wang6495@umn.edu>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0fa6bead

clk: sunxi-ng: Use u64 for calculation of NM rate · d095e1ba

由 Jernej Skrabec 提交于 11月 04, 2018

commit 65b6657672388b72822e0367f06d41c1e3ffb5bb upstream.

Allwinner H6 SoC has multiplier N range between 1 and 254. Since parent
rate is 24MHz, intermediate result when calculating final rate easily
overflows 32 bit variable.

Because of that, introduce function for calculating clock rate which
uses 64 bit variable for intermediate result.

Fixes: 6174a1e2 ("clk: sunxi-ng: Add N-M-factor clock support")
Fixes: ee28648c ("clk: sunxi-ng: Remove the use of rational computations")

CC: <stable@vger.kernel.org>
Signed-off-by: NJernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: NMaxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

d095e1ba

clk: rockchip: fix typo in rk3188 spdif_frac parent · 36ef9d14

由 Johan Jonker 提交于 11月 03, 2018

commit 8b19faf6fae2867e2c177212c541e8ae36aa4d32 upstream.

Fix typo in common_clk_branches.
Make spdif_pre parent of spdif_frac.

Fixes: 66746420 ("clk: rockchip: include downstream muxes into fractional dividers")
Cc: stable@vger.kernel.org
Signed-off-by: NJohan Jonker <jbx9999@hotmail.com>
Acked-by: NElaine Zhang <zhangqing@rock-chips.com>
Signed-off-by: NHeiko Stuebner <heiko@sntech.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

36ef9d14

spi: bcm2835: Avoid finishing transfer prematurely in IRQ mode · 9e9c6698

由 Lukas Wunner 提交于 11月 08, 2018

commit 56c1723426d3cfd4723bfbfce531d7b38bae6266 upstream.

The IRQ handler bcm2835_spi_interrupt() first reads as much as possible
from the RX FIFO, then writes as much as possible to the TX FIFO.
Afterwards it decides whether the transfer is finished by checking if
the TX FIFO is empty.

If very few bytes were written to the TX FIFO, they may already have
been transmitted by the time the FIFO's emptiness is checked.  As a
result, the transfer will be declared finished and the chip will be
reset without reading the corresponding received bytes from the RX FIFO.

The odds of this happening increase with a high clock frequency (such
that the TX FIFO drains quickly) and either passing "threadirqs" on the
command line or enabling CONFIG_PREEMPT_RT_BASE (such that the IRQ
handler may be preempted between filling the TX FIFO and checking its
emptiness).

Fix by instead checking whether rx_len has reached zero, which means
that the transfer has been received in full.  This is also more
efficient as it avoids one bus read access per interrupt.  Note that
bcm2835_spi_transfer_one_poll() likewise uses rx_len to determine
whether the transfer has finished.
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Fixes: e34ff011 ("spi: bcm2835: move to the transfer_one driver model")
Cc: stable@vger.kernel.org # v4.1+
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

9e9c6698

spi: bcm2835: Fix book-keeping of DMA termination · cc8b83ff

由 Lukas Wunner 提交于 11月 08, 2018

commit dbc944115eed48af110646992893dc43321368d8 upstream.

If submission of a DMA TX transfer succeeds but submission of the
corresponding RX transfer does not, the BCM2835 SPI driver terminates
the TX transfer but neglects to reset the dma_pending flag to false.

Thus, if the next transfer uses interrupt mode (because it is shorter
than BCM2835_SPI_DMA_MIN_LENGTH) and runs into a timeout,
dmaengine_terminate_all() will be called both for TX (once more) and
for RX (which was never started in the first place).  Fix it.
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Fixes: 3ecd37ed ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
Cc: stable@vger.kernel.org # v4.2+
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

cc8b83ff

spi: bcm2835: Fix race on DMA termination · 63f97d30

由 Lukas Wunner 提交于 11月 08, 2018

commit e82b0b3828451c1cd331d9f304c6078fcd43b62e upstream.

If a DMA transfer finishes orderly right when spi_transfer_one_message()
determines that it has timed out, the callbacks bcm2835_spi_dma_done()
and bcm2835_spi_handle_err() race to call dmaengine_terminate_all(),
potentially leading to double termination.

Prevent by atomically changing the dma_pending flag before calling
dmaengine_terminate_all().
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Fixes: 3ecd37ed ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
Cc: stable@vger.kernel.org # v4.2+
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

63f97d30

ext4: check for shutdown and r/o file system in ext4_write_inode() · 0cb4f655

由 Theodore Ts'o 提交于 12月 19, 2018

commit 18f2c4fcebf2582f96cbd5f2238f4f354a0e4847 upstream.

If the file system has been shut down or is read-only, then
ext4_write_inode() needs to bail out early.

Also use jbd2_complete_transaction() instead of ext4_force_commit() so
we only force a commit if it is needed.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0cb4f655

ext4: force inode writes when nfsd calls commit_metadata() · bf2fd1f9

由 Theodore Ts'o 提交于 12月 19, 2018

commit fde872682e175743e0c3ef939c89e3c6008a1529 upstream.

Some time back, nfsd switched from calling vfs_fsync() to using a new
commit_metadata() hook in export_operations().  If the file system did
not provide a commit_metadata() hook, it fell back to using
sync_inode_metadata().  Unfortunately doesn't work on all file
systems.  In particular, it doesn't work on ext4 due to how the inode
gets journalled --- the VFS writeback code will not always call
ext4_write_inode().

So we need to provide our own ext4_nfs_commit_metdata() method which
calls ext4_write_inode() directly.

Google-Bug-Id: 121195940
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

bf2fd1f9

ext4: avoid declaring fs inconsistent due to invalid file handles · 26366388

由 Theodore Ts'o 提交于 12月 19, 2018

commit 8a363970d1dc38c4ec4ad575c862f776f468d057 upstream.

If we receive a file handle, either from NFS or open_by_handle_at(2),
and it points at an inode which has not been initialized, and the file
system has metadata checksums enabled, we shouldn't try to get the
inode, discover the checksum is invalid, and then declare the file
system as being inconsistent.

This can be reproduced by creating a test file system via "mke2fs -t
ext4 -O metadata_csum /tmp/foo.img 8M", mounting it, cd'ing into that
directory, and then running the following program.

#define _GNU_SOURCE
#include <fcntl.h>

struct handle {
	struct file_handle fh;
	unsigned char fid[MAX_HANDLE_SZ];
};

int main(int argc, char **argv)
{
	struct handle h = {{8, 1 }, { 12, }};

	open_by_handle_at(AT_FDCWD, &h.fh, O_RDONLY);
	return 0;
}

Google-Bug-Id: 120690101
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

26366388

ext4: include terminating u32 in size of xattr entries when expanding inodes · 6633fcb2

由 Theodore Ts'o 提交于 12月 19, 2018

commit a805622a757b6d7f65def4141d29317d8e37b8a1 upstream.

In ext4_expand_extra_isize_ea(), we calculate the total size of the
xattr header, plus the xattr entries so we know how much of the
beginning part of the xattrs to move when expanding the inode extra
size.  We need to include the terminating u32 at the end of the xattr
entries, or else if there is uninitialized, non-zero bytes after the
xattr entries and before the xattr values, the list of xattr entries
won't be properly terminated.
Reported-by: NSteve Graham <stgraham2000@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6633fcb2

ext4: fix EXT4_IOC_GROUP_ADD ioctl · 11bb168b

由 ruippan (潘睿) 提交于 12月 04, 2018

commit e647e29196b7f802f8242c39ecb7cc937f5ef217 upstream.

Commit e2b911c5 ("ext4: clean up feature test macros with
predicate functions") broke the EXT4_IOC_GROUP_ADD ioctl.  This was
not noticed since only very old versions of resize2fs (before
e2fsprogs 1.42) use this ioctl.  However, using a new kernel with an
enterprise Linux userspace will cause attempts to use online resize to
fail with "No reserved GDT blocks".

Fixes: e2b911c5 ("ext4: clean up feature test macros with predicate...")
Cc: stable@kernel.org # v4.4
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: Nruippan (潘睿) <ruippan@tencent.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

11bb168b

ext4: missing unlock/put_page() in ext4_try_to_write_inline_data() · 0d078853

由 Maurizio Lombardi 提交于 12月 04, 2018

commit 132d00becb31e88469334e1e62751c81345280e0 upstream.

In case of error, ext4_try_to_write_inline_data() should unlock
and release the page it holds.

Fixes: f19d5870 ("ext4: add normal write support for inline data")
Cc: stable@kernel.org # 3.8
Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0d078853

ext4: fix possible use after free in ext4_quota_enable · 0a1c177d

由 Pan Bian 提交于 12月 03, 2018

commit 61157b24e60fb3cd1f85f2c76a7b1d628f970144 upstream.

The function frees qf_inode via iput but then pass qf_inode to
lockdep_set_quota_inode on the failure path. This may result in a
use-after-free bug. The patch frees df_inode only when it is never used.

Fixes: daf647d2 ("ext4: add lockdep annotations for i_data_sem")
Cc: stable@kernel.org # 4.6
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NPan Bian <bianpan2016@163.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0a1c177d

ext4: add ext4_sb_bread() to disambiguate ENOMEM cases · b878c8a7

由 Theodore Ts'o 提交于 11月 25, 2018

commit fb265c9cb49e2074ddcdd4de99728aefdd3b3592 upstream.

Today, when sb_bread() returns NULL, this can either be because of an
I/O error or because the system failed to allocate the buffer.  Since
it's an old interface, changing would require changing many call
sites.

So instead we create our own ext4_sb_bread(), which also allows us to
set the REQ_META flag.

Also fixed a problem in the xattr code where a NULL return in a
function could also mean that the xattr was not found, which could
lead to the wrong error getting returned to userspace.

Fixes: ac27a0ec ("ext4: initial copy of files from ext3")
Cc: stable@kernel.org # 2.6.19
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

b878c8a7

ocxl: Fix endiannes bug in read_afu_name() · 6665481e

由 Greg Kurz 提交于 12月 11, 2018

commit 2f07229f02d4c55affccd11a61af4fd4b94dc436 upstream.

The AFU Descriptor Template in the PCI config space has a Name Space
field which is a 24 Byte ASCII character string of descriptive name
space for the AFU. The OCXL driver read the string four characters at
a time with pci_read_config_dword().

This optimization is valid on a little-endian system since this is PCI,
but a big-endian system ends up with each subset of four characters in
reverse order.

This could be fixed by switching to read characters one by one. Another
option is to swap the bytes if we're big-endian.

Go for the latter with le32_to_cpu().

Cc: stable@vger.kernel.org      # v4.16
Signed-off-by: NGreg Kurz <groug@kaod.org>
Acked-by: NFrederic Barrat <fbarrat@linux.ibm.com>
Acked-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6665481e

ocxl: Fix endiannes bug in ocxl_link_update_pe() · 3fbf78b2

由 Greg Kurz 提交于 12月 16, 2018

commit e1e71e201703500f708bdeaf64660a2a178cb6a0 upstream.

All fields in the PE are big-endian. Use cpu_to_be32() like everywhere
else something is written to the PE. Otherwise a wrong TID will be used
by the NPU. If this TID happens to point to an existing thread sharing
the same mm, it could be woken up by error. This is highly improbable
though. The likely outcome of this is the NPU not finding the target
thread and forcing the AFU into sending an interrupt, which userspace
is supposed to handle anyway.

Fixes: e948e06f ("ocxl: Expose the thread_id needed for wait on POWER9")
Cc: stable@vger.kernel.org      # v4.18
Signed-off-by: NGreg Kurz <groug@kaod.org>
Acked-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

3fbf78b2

perf env: Also consider env->arch == NULL as local operation · 65e4e67d

由 Arnaldo Carvalho de Melo 提交于 11月 27, 2018

commit 804234f27180dcf9a25cb98a88d5212f65b7f3fd upstream.

We'll set a new machine field based on env->arch, which for live mode,
like with 'perf top' means we need to use uname() to figure the name of
the arch, fix perf_env__arch() to consider both (env == NULL) and
(env->arch == NULL) as local operation.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # 4.19
Link: https://lkml.kernel.org/n/tip-vcz4ufzdon7cwy8dm2ua53xk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

65e4e67d

perf pmu: Suppress potential format-truncation warning · d124dd5c

由 Ben Hutchings 提交于 11月 11, 2018

commit 11a64a05dc649815670b1be9fe63d205cb076401 upstream.

Depending on which functions are inlined in util/pmu.c, the snprintf()
calls in perf_pmu__parse_{scale,unit,per_pkg,snapshot}() might trigger a
warning:

  util/pmu.c: In function 'pmu_aliases':
  util/pmu.c:178:31: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
    snprintf(path, PATH_MAX, "%s/%s.unit", dir, name);
                               ^~

I found this when trying to build perf from Linux 3.16 with gcc 8.
However I can reproduce the problem in mainline if I force
__perf_pmu__new_alias() to be inlined.

Suppress this by using scnprintf() as has been done elsewhere in perf.
Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20181111184524.fux4taownc6ndbx6@decadent.org.ukSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

d124dd5c

perf script: Use fallbacks for branch stacks · 307dbd38

由 Adrian Hunter 提交于 11月 06, 2018

commit 692d0e63324d2954a0c63a812a8588e97023a295 upstream.

Branch stacks do not necessarily have the same cpumode as the 'ip'. Use
the fallback functions in those cases.

This patch depends on patch "perf tools: Add fallback functions for cases
where cpumode is insufficient".
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181106210712.12098-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

307dbd38

perf tools: Use fallback for sample_addr_correlates_sym() cases · 39dad822

由 Adrian Hunter 提交于 11月 06, 2018

commit 225f99e0c811e23836c4911a2ff147e167dd1fe8 upstream.

thread__resolve() is used in the sample_addr_correlates_sym() cases
where 'addr' is a destination of a branch which does not necessarily
have the same cpumode as the 'ip'. Use the fallback function in that
case.

This patch depends on patch "perf tools: Add fallback functions for
cases where cpumode is insufficient".
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181106210712.12098-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

39dad822

perf thread: Add fallback functions for cases where cpumode is insufficient · 0ada27a7

由 Adrian Hunter 提交于 11月 06, 2018

commit 8e80ad9983caeee09c3a0a1a37e05bff93becce4 upstream.

For branch stacks or branch samples, the sample cpumode might not be
correct because it applies only to the sample 'ip' and not necessary to
'addr' or branch stack addresses. Add fallback functions that can be
used to deal with those cases
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

0ada27a7

perf machine: Record if a arch has a single user/kernel address space · 62977a9b

由 Adrian Hunter 提交于 11月 06, 2018

commit ec1891afae740be581ecf5abc8bda74c4549203f upstream.

Some architectures have a single address space for kernel and user
addresses, which makes it possible to determine if an address is in
kernel space or user space. Some don't, e.g.: sparc.

Cache that info in perf_env so that, for instance, code needing to
fallback failed symbol lookups at the kernel space in single address
space arches can lookup at userspace.
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com
[ split from a larger patch ]
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

62977a9b

clocksource/drivers/arc_timer: Utilize generic sched_clock · bf75d938

由 Alexey Brodkin 提交于 11月 19, 2018

commit bf287607c80f24387fedb431a346dc67f25be12c upstream.

It turned out we used to use default implementation of sched_clock()
from kernel/sched/clock.c which was as precise as 1/HZ, i.e.
by default we had 10 msec granularity of time measurement.

Now given ARC built-in timers are clocked with the same frequency as
CPU cores we may get much higher precision of time tracking.

Thus we switch to generic sched_clock which really reads ARC hardware
counters.

This is especially helpful for measuring short events.
That's what we used to have:
------------------------------>8------------------------
$ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null

 Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello':

         10.000000      task-clock (msec)         #    2.832 CPUs utilized
                 1      context-switches          #    0.100 K/sec
                 1      cpu-migrations            #    0.100 K/sec
                63      page-faults               #    0.006 M/sec
           3049480      cycles                    #    0.305 GHz
           1091259      instructions              #    0.36  insn per cycle
            256828      branches                  #   25.683 M/sec
             27026      branch-misses             #   10.52% of all branches

       0.003530687 seconds time elapsed

       0.000000000 seconds user
       0.010000000 seconds sys
------------------------------>8------------------------

And now we'll see:
------------------------------>8------------------------
$ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null

 Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello':

          3.004322      task-clock (msec)         #    0.865 CPUs utilized
                 1      context-switches          #    0.333 K/sec
                 1      cpu-migrations            #    0.333 K/sec
                63      page-faults               #    0.021 M/sec
           2986734      cycles                    #    0.994 GHz
           1087466      instructions              #    0.36  insn per cycle
            255209      branches                  #   84.947 M/sec
             26002      branch-misses             #   10.19% of all branches

       0.003474829 seconds time elapsed

       0.003519000 seconds user
       0.000000000 seconds sys
------------------------------>8------------------------

Note how much more meaningful is the second output - time spent for
execution pretty much matches number of cycles spent (we're runnign
@ 1GHz here).
Signed-off-by: NAlexey Brodkin <abrodkin@synopsys.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Acked-by: NVineet Gupta <vgupta@synopsys.com>
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

bf75d938

DRM: UDL: get rid of useless vblank initialization · ca3a6fd2

由 Eugeniy Paltsev 提交于 9月 28, 2018

commit 32e932e37e6b6e13b66add307192c7ddd40a781d upstream.

UDL doesn't support vblank functionality so we don't need to
initialize vblank here (we are able to send page flip
completion events even without vblank initialization)

Moreover current drm_vblank_init call with num_crtcs > 0 causes
sending DRM_EVENT_FLIP_COMPLETE event with zero timestamp every
time. This breaks userspace apps (for example weston) which
relies on timestamp value.

Cc: stable@vger.kernel.org
Signed-off-by: NEugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180928144126.21598-1-Eugeniy.Paltsev@synopsys.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

ca3a6fd2

drm/v3d: Skip debugfs dumping GCA on platforms without GCA. · 29ac2218

由 Eric Anholt 提交于 9月 28, 2018

commit 2f20fa8d12e859a03f68bdd81d75830141bc9ac9 upstream.

Fixes an oops reading this debugfs entry on BCM7278.
Signed-off-by: NEric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180928232126.4332-4-eric@anholt.net
Fixes: 57692c94 ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Cc: <stable@vger.kernel.org>
Reviewed-by: NBoris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

29ac2218

platform-msi: Free descriptors in platform_msi_domain_free() · 6c56e89e

由 Miquel Raynal 提交于 10月 11, 2018

commit 81b1e6e6a8590a19257e37a1633bec098d499c57 upstream.

Since the addition of platform MSI support, there were two helpers
supposed to allocate/free IRQs for a device:

    platform_msi_domain_alloc_irqs()
    platform_msi_domain_free_irqs()

In these helpers, IRQ descriptors are allocated in the "alloc" routine
while they are freed in the "free" one.

Later, two other helpers have been added to handle IRQ domains on top
of MSI domains:

    platform_msi_domain_alloc()
    platform_msi_domain_free()

Seen from the outside, the logic is pretty close with the former
helpers and people used it with the same logic as before: a
platform_msi_domain_alloc() call should be balanced with a
platform_msi_domain_free() call. While this is probably what was
intended to do, the platform_msi_domain_free() does not remove/free
the IRQ descriptor(s) created/inserted in
platform_msi_domain_alloc().

One effect of such situation is that removing a module that requested
an IRQ will let one orphaned IRQ descriptor (with an allocated MSI
entry) in the device descriptors list. Next time the module will be
inserted back, one will observe that the allocation will happen twice
in the MSI domain, one time for the remaining descriptor, one time for
the new one. It also has the side effect to quickly overshoot the
maximum number of allocated MSI and then prevent any module requesting
an interrupt in the same domain to be inserted anymore.

This situation has been met with loops of insertion/removal of the
mvpp2.ko module (requesting 15 MSIs each time).

Fixes: 552c494a ("platform-msi: Allow creation of a MSI-based stacked irq domain")
Cc: stable@vger.kernel.org
Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6c56e89e

KVM: nVMX: Free the VMREAD/VMWRITE bitmaps if alloc_kvm_area() fails · c9dae887

由 Sean Christopherson 提交于 12月 03, 2018

commit 1b3ab5ad1b8ad99bae76ec583809c5f5a31c707c upstream.

Fixes: 34a1cd60 ("kvm: x86: vmx: move some vmx setting from vmx_init() to hardware_setup()")
Cc: stable@vger.kernel.org
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c9dae887

arm64: KVM: Make VHE Stage-2 TLB invalidation operations non-interruptible · 07cbcfc3

由 Marc Zyngier 提交于 12月 06, 2018

commit c987876a80e7bcb98a839f10dca9ce7fda4feced upstream.

Contrary to the non-VHE version of the TLB invalidation helpers, the VHE
code  has interrupts enabled, meaning that we can take an interrupt in
the middle of such a sequence, and start running something else with
HCR_EL2.TGE cleared.

That's really not a good idea.

Take the heavy-handed option and disable interrupts in
__tlb_switch_to_guest_vhe, restoring them in __tlb_switch_to_host_vhe.
The latter also gain an ISB in order to make sure that TGE really has
taken effect.

Cc: stable@vger.kernel.org
Acked-by: NChristoffer Dall <christoffer.dall@arm.com>
Reviewed-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

07cbcfc3

KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup · edcf33b1

由 Sean Christopherson 提交于 12月 20, 2018

commit e81434995081fd7efb755fd75576b35dbb0850b1 upstream.

____kvm_handle_fault_on_reboot() provides a generic exception fixup
handler that is used to cleanly handle faults on VMX/SVM instructions
during reboot (or at least try to).  If there isn't a reboot in
progress, ____kvm_handle_fault_on_reboot() treats any exception as
fatal to KVM and invokes kvm_spurious_fault(), which in turn generates
a BUG() to get a stack trace and die.

When it was originally added by commit 4ecac3fd ("KVM: Handle
virtualization instruction #UD faults during reboot"), the "call" to
kvm_spurious_fault() was handcoded as PUSH+JMP, where the PUSH'd value
is the RIP of the faulting instructing.

The PUSH+JMP trickery is necessary because the exception fixup handler
code lies outside of its associated function, e.g. right after the
function.  An actual CALL from the .fixup code would show a slightly
bogus stack trace, e.g. an extra "random" function would be inserted
into the trace, as the return RIP on the stack would point to no known
function (and the unwinder will likely try to guess who owns the RIP).

Unfortunately, the JMP was replaced with a CALL when the macro was
reworked to not spin indefinitely during reboot (commit b7c4145b
"KVM: Don't spin on virt instruction faults during reboot").  This
causes the aforementioned behavior where a bogus function is inserted
into the stack trace, e.g. my builds like to blame free_kvm_area().

Revert the CALL back to a JMP.  The changelog for commit b7c4145b
("KVM: Don't spin on virt instruction faults during reboot") contains
nothing that indicates the switch to CALL was deliberate.  This is
backed up by the fact that the PUSH <insn RIP> was left intact.

Note that an alternative to the PUSH+JMP magic would be to JMP back
to the "real" code and CALL from there, but that would require adding
a JMP in the non-faulting path to avoid calling kvm_spurious_fault()
and would add no value, i.e. the stack trace would be the same.

Using CALL:

------------[ cut here ]------------
kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356!
invalid opcode: 0000 [#1] SMP
CPU: 4 PID: 1057 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm]
Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41
RSP: 0018:ffffc900004bbcc8 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888273fd8000 R08: 00000000000003e8 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000371fb0
R13: 0000000000000000 R14: 000000026d763cf4 R15: ffff888273fd8000
FS:  00007f3d69691700(0000) GS:ffff888277800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f89bc56fe0 CR3: 0000000271a5a001 CR4: 0000000000362ee0
Call Trace:
 free_kvm_area+0x1044/0x43ea [kvm_intel]
 ? vmx_vcpu_run+0x156/0x630 [kvm_intel]
 ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm]
 ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm]
 ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm]
 ? __set_task_blocked+0x38/0x90
 ? __set_current_blocked+0x50/0x60
 ? __fpu__restore_sig+0x97/0x490
 ? do_vfs_ioctl+0xa1/0x620
 ? __x64_sys_futex+0x89/0x180
 ? ksys_ioctl+0x66/0x70
 ? __x64_sys_ioctl+0x16/0x20
 ? do_syscall_64+0x4f/0x100
 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc
---[ end trace 9775b14b123b1713 ]---

Using JMP:

------------[ cut here ]------------
kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356!
invalid opcode: 0000 [#1] SMP
CPU: 6 PID: 1067 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm]
Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41
RSP: 0018:ffffc90000497cd0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88827058bd40 R08: 00000000000003e8 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000369fb0
R13: 0000000000000000 R14: 00000003c8fc6642 R15: ffff88827058bd40
FS:  00007f3d7219e700(0000) GS:ffff888277900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3d64001000 CR3: 0000000271c6b004 CR4: 0000000000362ee0
Call Trace:
 vmx_vcpu_run+0x156/0x630 [kvm_intel]
 ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm]
 ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm]
 ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm]
 ? __set_task_blocked+0x38/0x90
 ? __set_current_blocked+0x50/0x60
 ? __fpu__restore_sig+0x97/0x490
 ? do_vfs_ioctl+0xa1/0x620
 ? __x64_sys_futex+0x89/0x180
 ? ksys_ioctl+0x66/0x70
 ? __x64_sys_ioctl+0x16/0x20
 ? do_syscall_64+0x4f/0x100
 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc
---[ end trace f9daedb85ab3ddba ]---

Fixes: b7c4145b ("KVM: Don't spin on virt instruction faults during reboot")
Cc: stable@vger.kernel.org
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

edcf33b1

x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init() · 49102719

由 Dan Williams 提交于 12月 04, 2018

commit ba6f508d0ec4adb09f0a939af6d5e19cdfa8667d upstream.

Commit:

  f77084d96355 "x86/mm/pat: Disable preemption around __flush_tlb_all()"

addressed a case where __flush_tlb_all() is called without preemption
being disabled. It also left a warning to catch other cases where
preemption is not disabled.

That warning triggers for the memory hotplug path which is also used for
persistent memory enabling:

 WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
 RIP: 0010:__flush_tlb_all+0x1b/0x3a
 [..]
 Call Trace:
  phys_pud_init+0x29c/0x2bb
  kernel_physical_mapping_init+0xfc/0x219
  init_memory_mapping+0x1a5/0x3b0
  arch_add_memory+0x2c/0x50
  devm_memremap_pages+0x3aa/0x610
  pmem_attach_disk+0x585/0x700 [nd_pmem]

Andy wondered why a path that can sleep was using __flush_tlb_all() [1]
and Dave confirmed the expectation for TLB flush is for modifying /
invalidating existing PTE entries, but not initial population [2]. Drop
the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the
expectation that this path is only ever populating empty entries for the
linear map. Note, at linear map teardown time there is a call to the
all-cpu flush_tlb_all() to invalidate the removed mappings.

[1]: https://lkml.kernel.org/r/9DFD717D-857D-493D-A606-B635D72BAC21@amacapital.net
[2]: https://lkml.kernel.org/r/749919a4-cdb1-48a3-adb4-adb81a5fa0b5@intel.com

[ mingo: Minor readability edits. ]
Suggested-by: NDave Hansen <dave.hansen@linux.intel.com>
Reported-by: NAndy Lutomirski <luto@kernel.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@intel.com
Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()")
Link: http://lkml.kernel.org/r/154395944713.32119.15611079023837132638.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

49102719

x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off · 86ba6f66

由 Michal Hocko 提交于 11月 13, 2018

commit 5b5e4d623ec8a34689df98e42d038a3b594d2ff9 upstream.

Swap storage is restricted to max_swapfile_size (~16TB on x86_64) whenever
the system is deemed affected by L1TF vulnerability. Even though the limit
is quite high for most deployments it seems to be too restrictive for
deployments which are willing to live with the mitigation disabled.

We have a customer to deploy 8x 6,4TB PCIe/NVMe SSD swap devices which is
clearly out of the limit.

Drop the swap restriction when l1tf=off is specified. It also doesn't make
much sense to warn about too much memory for the l1tf mitigation when it is
forcefully disabled by the administrator.

[ tglx: Folded the documentation delta change ]

Fixes: 377eeaa8 ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Signed-off-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NPavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NJiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: <linux-mm@kvack.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181113184910.26697-1-mhocko@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

86ba6f66

Input: elan_i2c - add ACPI ID for touchpad in ASUS Aspire F5-573G · aeb5e534

由 Patrick Dreyer 提交于 12月 23, 2018

commit 7db54c89f0b30a101584e09d3729144e6170059d upstream.

This adds ELAN0501 to the ACPI table to support Elan touchpad found in ASUS
Aspire F5-573G.
Signed-off-by: NPatrick Dreyer <Patrick.Dreyer@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

aeb5e534

Input: atmel_mxt_ts - don't try to free unallocated kernel memory · f1680565

由 Sanjeev Chugh 提交于 12月 28, 2018

commit 1e3c336ad8f40f88a8961c434640920fe35cc08b upstream.

If the user attempts to update Atmel device with an invalid configuration
cfg file, error handling code is trying to free cfg file memory which is
not allocated yet hence results into kernel crash.

This patch fixes the order of memory free operations.
Signed-off-by: NSanjeev Chugh <sanjeev_chugh@mentor.com>
Fixes: a4891f10 ("Input: atmel_mxt_ts - zero terminate config firmware file")
Cc: stable@vger.kernel.org
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

f1680565

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功