- 27 December 2019, 40 commits
-
Submitted by Andrea Parri
commit f381c6a4bd0ae0fde2d6340f1b9bb0f58d915de6 upstream. This barrier (smp_mb__before_atomic()) only applies to the read-modify-write operations; in particular, it does not apply to the atomic_set() primitive. Replace the barrier with an smp_mb(). Fixes: dac56212 ("bio: skip atomic inc/dec of ->bi_cnt for most use cases") Cc: stable@vger.kernel.org Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <ming.lei@redhat.com> Cc: linux-block@vger.kernel.org Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
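For context, a minimal sketch of the ordering pattern the fix targets (illustrative only, not the exact patch hunk; the __bi_cnt field name and BIO_REFFED flag are assumptions based on the Fixes tag): smp_mb__before_atomic() only orders against atomic read-modify-write operations such as atomic_inc(), while atomic_set() is a plain store, so a full smp_mb() is required.

```c
/* Illustrative sketch, not the exact upstream hunk. */

/* Broken: smp_mb__before_atomic() pairs only with RMW atomics
 * (atomic_inc(), atomic_dec_and_test(), ...), not with atomic_set(),
 * which is a plain store. */
bio->bi_flags |= (1 << BIO_REFFED);
smp_mb__before_atomic();
atomic_set(&bio->__bi_cnt, count);

/* Fixed: a full barrier orders the flag update against the plain store. */
bio->bi_flags |= (1 << BIO_REFFED);
smp_mb();
atomic_set(&bio->__bi_cnt, count);
```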
-
Submitted by Daniel Borkmann
commit c6110222c6f49ea68169f353565eb865488a8619 upstream. Add a callback map_lookup_elem_sys_only() that map implementations could use over map_lookup_elem() from the system call side in case the map implementation needs to handle the latter differently than from the BPF data path. If map_lookup_elem_sys_only() is set, it will be the preferred pick for map lookups out of user space. This hook is used in a follow-up fix for the LRU map, but once the development window opens, we can convert other map types from map_lookup_elem() (here, the one called upon the BPF_MAP_LOOKUP_ELEM cmd is meant) over to use the callback to simplify and clean up the latter. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Conflicts: kernel/bpf/syscall.c [yyl: adjust context] Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
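A rough sketch of how the new hook is meant to be consumed on the syscall side, per the description above (simplified; the dispatch helper name is made up):

```c
/* Hypothetical helper on the BPF_MAP_LOOKUP_ELEM syscall path. */
static void *map_lookup_for_syscall(struct bpf_map *map, void *key)
{
	/* Prefer the system-call-only variant when the map provides one,
	 * otherwise fall back to the regular data-path lookup. */
	if (map->ops->map_lookup_elem_sys_only)
		return map->ops->map_lookup_elem_sys_only(map, key);

	return map->ops->map_lookup_elem(map, key);
}
```

An LRU map, for instance, could point map_lookup_elem_sys_only() at a variant that does not update LRU state, so a lookup from user space does not perturb eviction order.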
-
Submitted by Peter Zijlstra
[ Upstream commit 0edd6b64d1939e9e9168ff27947995bb7751db5d ] Unless the very next line is schedule(), or implies it, one must not use preempt_enable_no_resched(). It can cause a preemption to go missing and thereby cause arbitrary delays, breaking the PREEMPT=y invariant. Cc: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
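The rule can be summarised with a small sketch (illustrative; do_something() is a placeholder):

```c
/* Only legal use: schedule() (or something that implies it) follows
 * immediately, so the skipped preemption check cannot be lost. */
preempt_disable();
do_something();
preempt_enable_no_resched();
schedule();

/* Everywhere else the full variant must be used; it re-checks
 * TIF_NEED_RESCHED and preempts if required. */
preempt_disable();
do_something();
preempt_enable();
```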
-
Submitted by Stefan Mätje
commit 4ec73791a64bab25cabf16a6067ee478692e506d upstream. Due to an erratum in some Pericom PCIe-to-PCI bridges in reverse mode (conventional PCI on primary side, PCIe on downstream side), the Retrain Link bit needs to be cleared manually to allow the link training to complete successfully. If it is not cleared manually, the link training is continuously restarted and no devices below the PCI-to-PCIe bridge can be accessed. That means drivers for devices below the bridge will be loaded but won't work and may even crash because the driver is only reading 0xffff. See the Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf for details. Devices known as affected so far are: PI7C9X110, PI7C9X111SL, PI7C9X130. Add a new flag, clear_retrain_link, in struct pci_dev. Quirks for affected devices set this bit. Note that pcie_retrain_link() lives in aspm.c because that's currently the only place we use it, but this erratum is not specific to ASPM, and we may retrain links for other reasons in the future. Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu> [bhelgaas: apply regardless of CONFIG_PCIEASPM] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
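Roughly, the quirk's effect in the retrain path looks like this (a simplified sketch, not the exact patch hunk):

```c
/* Kick off link retraining as usual ... */
pcie_capability_set_word(pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_RL);

/* ... but on affected Pericom bridges the Retrain Link bit is not
 * self-clearing, so clear it by hand or training restarts forever. */
if (pdev->clear_retrain_link)
	pcie_capability_clear_word(pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_RL);
```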
-
Submitted by Phong Tran
commit 440868661f36071886ed360d91de83bd67c73b4f upstream. Make the loop explicit to avoid the clang warning: ./include/linux/of.h:238:37: warning: multiple unsequenced modifications to 'cell' [-Wunsequenced] r = (r << 32) | be32_to_cpu(*(cell++)); ^~ ./include/linux/byteorder/generic.h:95:21: note: expanded from macro 'be32_to_cpu' ^ ./include/uapi/linux/byteorder/little_endian.h:40:59: note: expanded from macro '__be32_to_cpu' ^ ./include/uapi/linux/swab.h:118:21: note: expanded from macro '__swab32' ___constant_swab32(x) : \ ^ ./include/uapi/linux/swab.h:18:12: note: expanded from macro '___constant_swab32' (((__u32)(x) & (__u32)0x000000ffUL) << 24) | \ ^ Signed-off-by: Phong Tran <tranmanphong@gmail.com> Reported-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/460 Suggested-by: David Laight <David.Laight@ACULAB.COM> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: stable@vger.kernel.org [robh: fix up whitespace] Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
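The reworked helper looks roughly like this (sketch): hoisting the increment out of the be32_to_cpu() argument removes the unsequenced modifications, because the macro may evaluate its argument more than once.

```c
/* include/linux/of.h: sketch of the fixed helper */
static inline u64 of_read_number(const __be32 *cell, int size)
{
	u64 r = 0;

	for (; size--; cell++)
		r = (r << 32) | be32_to_cpu(*cell);

	return r;
}
```
-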
Submitted by Al Viro
commit 5467a68cbf6884c9a9d91e2a89140afb1839c835 upstream. For lockless accesses to dentries we don't have pinned we rely (among other things) upon having an RCU delay between dropping the last reference and actually freeing the memory. On the other hand, for things like pipes and sockets we neither do that kind of lockless access, nor want to deal with the overhead of an RCU delay every time a socket gets closed. So the delay was made optional - setting DCACHE_RCUACCESS in ->d_flags made sure it would happen. We tried to avoid setting it unless we knew we needed it. Unfortunately, that had led to a recurring class of bugs, in which we missed the need to set it. We only really need it for dentries that are created by d_alloc_pseudo(), so let's not bother with trying to be smart - just make having an RCU delay the default. The ones that do *not* get it set the replacement flag (DCACHE_NORCU) and we'd better use that sparingly. d_alloc_pseudo() is the only such user right now. FWIW, the race that finally prompted that switch had been between __lock_parent() of an immediate subdirectory of what's currently the root of a disconnected tree (e.g. from open-by-handle in progress) racing with d_splice_alias() elsewhere picking another alias for the same inode, either on an outright corrupted fs image, or (in case of open-by-handle on NFS) that subdirectory having been just moved on the server. It's not easy to hit, so the sky is not falling, but that's not the first race on similar missed cases and the logic for setting DCACHE_RCUACCESS has gotten ridiculously convoluted. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Conflicts: fs/dcache.c include/linux/dcache.h [yyl: adjust context] Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
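The resulting freeing decision can be sketched like this (simplified, not the exact hunk): the RCU delay becomes the default, and only dentries explicitly marked DCACHE_NORCU, currently those from d_alloc_pseudo(), skip it.

```c
/* Sketch of the freeing decision after the change (simplified). */
static void dentry_free(struct dentry *dentry)
{
	if (dentry->d_flags & DCACHE_NORCU)
		__d_free(&dentry->d_u.d_rcu);		/* never seen by lockless walkers */
	else
		call_rcu(&dentry->d_u.d_rcu, __d_free);	/* default: RCU-delayed free */
}
```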
-
Submitted by Willem de Bruijn
[ Upstream commit 185ce5c38ea76f29b6bd9c7c8c7a5e5408834920 ] Zerocopy skbs without completion notification were added for packet sockets with PACKET_TX_RING user buffers. Those signal completion through the TP_STATUS_USER bit in the ring. Zerocopy annotation was added only to avoid premature notification after clone or orphan, by triggering a copy on these paths for these packets. The mechanism had to define a special "no-uarg" mode because packet sockets already use skb_uarg(skb) == skb_shinfo(skb)->destructor_arg for a different pointer. Before dereferencing skb_uarg(skb), verify that it is a real pointer. Fixes: 5cd8d46ea1562 ("packet: copy user buffers before orphan or clone") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Chengguang Xu
commit 0d52154bb0a700abb459a2cbce0a30fc2549b67e upstream. When creation of the jbd2_inode_cache cache fails, we will destroy the previously created jbd2_handle_cache cache twice. This patch fixes this by moving each cache's initialization and destruction into its own separate function. Signed-off-by: Chengguang Xu <cgxu519@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
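The shape of the fix can be sketched as follows (helper names and flags are illustrative, not the exact patch): each cache gets its own destroy function that also clears the pointer, so a second call is harmless.

```c
/* Illustrative sketch of per-cache init/destroy helpers. */
static struct kmem_cache *jbd2_handle_cache;

static int jbd2_journal_init_handle_cache(void)
{
	jbd2_handle_cache = KMEM_CACHE(jbd2_journal_handle, SLAB_TEMPORARY);
	return jbd2_handle_cache ? 0 : -ENOMEM;
}

static void jbd2_journal_destroy_handle_cache(void)
{
	kmem_cache_destroy(jbd2_handle_cache);	/* NULL-safe */
	jbd2_handle_cache = NULL;		/* calling this twice is now a no-op */
}
```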
-
Submitted by Dmitry Osipenko
commit ea611d1cc180fbb56982c83cd5142a2b34881f5c upstream. The FPS_PERIOD_MAX_US definitions are swapped for MAX20024 and MAX77620, fix it. Cc: stable <stable@vger.kernel.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Steve Twiss
commit 6b4814a9451add06d457e198be418bf6a3e6a990 upstream. There is a mismatch between what is found in the datasheets for DA9063 and DA9063L provided by Dialog Semiconductor, and the register names provided in the MFD registers file. The changes are for the OTP (one-time-programming) control registers. The two naming errors are OPT instead of OTP, and COUNT instead of CONT (i.e. control). Cc: Stable <stable@vger.kernel.org> Signed-off-by: Steve Twiss <stwiss.opensource@diasemi.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Mike Kravetz
commit 1b426bac66e6cc83c9f2d92b96e4e72acf43419a upstream. hugetlb uses a fault mutex hash table to prevent page faults of the same pages concurrently. The key for shared and private mappings is different. Shared mappings key off the address_space and file index; private mappings key off the mm and virtual address. Consider a private mapping of a populated hugetlbfs file. A fault will map the page from the file and if needed do a COW to map a writable page. Hugetlbfs hole punch uses the fault mutex to prevent mappings of file pages. It uses the address_space file index key. However, private mappings will use a different key and could race with this code to map the file page. This causes problems (BUG) for the page cache remove code as it expects the page to be unmapped. A sample stack is: page dumped because: VM_BUG_ON_PAGE(page_mapped(page)) kernel BUG at mm/filemap.c:169! ... RIP: 0010:unaccount_page_cache_page+0x1b8/0x200 ... Call Trace: __delete_from_page_cache+0x39/0x220 delete_from_page_cache+0x45/0x70 remove_inode_hugepages+0x13c/0x380 ? __add_to_page_cache_locked+0x162/0x380 hugetlbfs_fallocate+0x403/0x540 ? _cond_resched+0x15/0x30 ? __inode_security_revalidate+0x5d/0x70 ? selinux_file_permission+0x100/0x130 vfs_fallocate+0x13f/0x270 ksys_fallocate+0x3c/0x80 __x64_sys_fallocate+0x1a/0x20 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 There seems to be another potential COW issue/race with this approach of different private and shared keys as noted in commit 8382d914 ("mm, hugetlb: improve page-fault scalability"). Since every hugetlb mapping (even anon and private) is actually a file mapping, just use the address_space index key for all mappings. This results in potentially more hash collisions. However, this should not be the common case. Link: http://lkml.kernel.org/r/20190328234704.27083-3-mike.kravetz@oracle.com Link: http://lkml.kernel.org/r/20190412165235.t4sscoujczfhuiyt@linux-r8p5 Fixes: b5cec28d ("hugetlbfs: truncate_hugepages() takes a range of pages") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Dan Williams
commit fce86ff5802bac3a7b19db171aa1949ef9caac31 upstream. Starting with c6f3c5ee40c1 ("mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()") vmf_insert_pfn_pmd() internally calls pmdp_set_access_flags(). That helper enforces a pmd-aligned @address argument via a VM_BUG_ON() assertion. Update the implementation to take a 'struct vm_fault' argument directly and apply the address alignment fixup internally to fix crash signatures like: kernel BUG at arch/x86/mm/pgtable.c:515! invalid opcode: 0000 [#1] SMP NOPTI CPU: 51 PID: 43713 Comm: java Tainted: G OE 4.19.35 #1 [..] RIP: 0010:pmdp_set_access_flags+0x48/0x50 [..] Call Trace: vmf_insert_pfn_pmd+0x198/0x350 dax_iomap_fault+0xe82/0x1190 ext4_dax_huge_fault+0x103/0x1f0 ? __switch_to_asm+0x40/0x70 __handle_mm_fault+0x3f6/0x1370 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 handle_mm_fault+0xda/0x200 __do_page_fault+0x249/0x4f0 do_page_fault+0x32/0x110 ? page_fault+0x8/0x30 page_fault+0x1e/0x30 Link: http://lkml.kernel.org/r/155741946350.372037.11148198430068238140.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: c6f3c5ee40c1 ("mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Piotr Balcer <piotr.balcer@intel.com> Tested-by: Yan Ma <yan.ma@intel.com> Tested-by: Pankaj Gupta <pagupta@redhat.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Chandan Rajendra <chandan@linux.ibm.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
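The interface change can be sketched as follows (simplified; the internal helper call is abbreviated and its real signature differs): the exported helper now receives the struct vm_fault and applies the PMD alignment itself, so pmdp_set_access_flags() never sees an unaligned address.

```c
/* Sketch of the new entry point (simplified). */
vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
{
	unsigned long addr = vmf->address & PMD_MASK;	/* alignment fixup */

	/* sanity checks elided */
	return insert_pfn_pmd(vmf->vma, addr, vmf->pmd, pfn,
			      vmf->vma->vm_page_prot, write);
}
```
-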
Submitted by Qian Cai
mainline inclusion from mainline-5.0-rc5 commit b13bc35193d9e7a8c050a24928ca5c9e7c9a009b category: bugfix bugzilla: NA CVE: NA Fixes: ccbe1e4d ("mm, compaction: skip over holes in __reset_isolation_suitable") ------------------------------------------------- On an arm64 ThunderX2 server, the first kmemleak scan would crash [1] with CONFIG_DEBUG_VM_PGFLAGS=y because page_to_nid() found a pfn that is not directly mapped (MEMBLOCK_NOMAP); hence, page->flags is uninitialized. This is because commit 9f1eb38e0e11 ("mm, kmemleak: little optimization while scanning") started to use pfn_to_online_page() instead of pfn_valid(). However, in the CONFIG_MEMORY_HOTPLUG=y case, pfn_to_online_page() does not call memblock_is_map_memory() while pfn_valid() does. Historically, commit 68709f45 ("arm64: only consider memblocks with NOMAP cleared for linear mapping") causes pages marked as nomap to no longer be reassigned to the new zone in memmap_init_zone() by calling __init_single_page(). Commit 2d070eab ("mm: consider zone which is not fully populated to have holes") introduced pfn_to_online_page(), which was designed to return a valid pfn only, but it is clearly broken on arm64. Therefore, let pfn_to_online_page() call pfn_valid_within(), so it can handle nomap thanks to commit f52bb98f ("arm64: mm: always enable CONFIG_HOLES_IN_ZONE"), while it will be optimized away on architectures that have no HOLES_IN_ZONE. [1] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006 Mem abort info: ESR = 0x96000005 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000005 CM = 0, WnR = 0 Internal error: Oops: 96000005 [#1] SMP CPU: 60 PID: 1408 Comm: kmemleak Not tainted 5.0.0-rc2+ #8 pstate: 60400009 (nZCv daif +PAN -UAO) pc : page_mapping+0x24/0x144 lr : __dump_page+0x34/0x3dc sp : ffff00003a5cfd10 x29: ffff00003a5cfd10 x28: 000000000000802f x27: 0000000000000000 x26: 0000000000277d00 x25: ffff000010791f56 x24: ffff7fe000000000 x23: ffff000010772f8b x22: ffff00001125f670 x21: ffff000011311000 x20: ffff000010772f8b x19: fffffffffffffffe x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: ffff802698b19600 x13: ffff802698b1a200 x12: ffff802698b16f00 x11: ffff802698b1a400 x10: 0000000000001400 x9 : 0000000000000001 x8 : ffff00001121a000 x7 : 0000000000000000 x6 : ffff0000102c53b8 x5 : 0000000000000000 x4 : 0000000000000003 x3 : 0000000000000100 x2 : 0000000000000000 x1 : ffff000010772f8b x0 : ffffffffffffffff Process kmemleak (pid: 1408, stack limit = 0x(____ptrval____)) Call trace: page_mapping+0x24/0x144 __dump_page+0x34/0x3dc dump_page+0x28/0x4c kmemleak_scan+0x4ac/0x680 kmemleak_scan_thread+0xb4/0xdc kthread+0x12c/0x13c ret_from_fork+0x10/0x18 Code: d503201f f9400660 36000040 d1000413 (f9400661) ---[ end trace 4d4bd7f573490c8e ]--- Kernel panic - not syncing: Fatal exception SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x002,20000c38 Memory Limit: none ---[ end Kernel panic - not syncing: Fatal exception ]--- Link: http://lkml.kernel.org/r/20190122132916.28360-1-cai@lca.pw Fixes: 9f1eb38e0e11 ("mm, kmemleak: little optimization while scanning") Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: YunFeng Ye <yeyunfeng@huawei.com> Reviewed-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
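The fix amounts to one extra check in the helper, roughly as sketched below: pfn_valid_within() rejects NOMAP holes inside an otherwise online section on arm64, and compiles to a constant true on architectures without CONFIG_HOLES_IN_ZONE.

```c
/* include/linux/memory_hotplug.h: sketch of the fixed helper */
static inline struct page *pfn_to_online_page(unsigned long pfn)
{
	unsigned long nr = pfn_to_section_nr(pfn);

	if (nr < NR_MEM_SECTIONS && online_section_nr(nr) &&
	    pfn_valid_within(pfn))
		return pfn_to_page(pfn);

	return NULL;
}
```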
-
Submitted by Sebastian Andrzej Siewior
mainline inclusion from mainline-5.2-rc1 commit d4645d30b50d category: bugfix bugzilla: 14410 CVE: NA ------------------------------------------------- The test robot reported a wrong assignment of a per-CPU variable which it detected by using sparse and sent a report. The assignment itself is correct. The annotation for sparse was wrong and hence the report. The first pointer is a "normal" pointer and points to the per-CPU memory area. That means that the __percpu annotation has to be moved. Move the __percpu annotation to the pointer which points to the per-CPU area. This change affects only the sparse tool (and is ignored by the compiler). Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: f97f8f06 ("smpboot: Provide infrastructure for percpu hotplug threads") Link: http://lkml.kernel.org/r/20190424085253.12178-1-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
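The annotation move is easiest to see side by side (sketch; the 'store' member name is assumed for illustration):

```c
/* Before: sparse attributes __percpu to the wrong pointer level. */
struct task_struct __percpu **store;

/* After: the outer pointer is a plain pointer into the per-CPU area,
 * so the annotation belongs on the inner pointer. */
struct task_struct * __percpu *store;
```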
-
Submitted by Geert Uytterhoeven
mainline inclusion from mainline-5.2-rc1 commit d3062d1d7415 category: bugfix bugzilla: 15366 CVE: NA ------------------------------------------------- Printing "mktime64(1900, 1, 1, 0, 0, 0)" gives -2208988800. Fixes: 83bbc5ac ("rtc: Add useful timestamp definitions") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
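As a quick sanity check of that value (worked out here, not taken from the patch; the macro name is assumed from the timestamp definitions the Fixes tag refers to): 1900-01-01 lies 70 years before the Unix epoch, and 17 of those years are leap years (1900 itself is not), giving 25567 days.

```c
/* 70 * 365 + 17 leap days = 25567 days; 25567 * 86400 s = 2208988800 s */
#define RTC_TIMESTAMP_BEGIN_1900	(-2208988800LL)	/* 1900-01-01 00:00:00 */
```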
-
Submitted by Zhou Guanghui
hulk inclusion category: bugfix bugzilla: 16081 CVE: NA ------------------- 1. Use the CONFIG_IOMMU_SVA macro to isolate iommu-sva related code in IORT. 2. Move struct page_response_msg, enum page_response_code and iopf_queue_flush_t out of the CONFIG_IOMMU_API guard, since these are always used by the IOMMU API. Fixes: 676917d8930e ("iommu/sva: Track mm changes with an MMU notifier") Fixes: 615753b31ae4 ("ACPI/IORT: Check ATS capability in root complex nodes") Fixes: e741c308b09f ("iort: Read ACPI configure to get streamid.") Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- The PASID ECN to the PCIe spec added a bit in the PRI status register that allows a Function to declare whether a PRG Response should contain the PASID prefix or not. Move the helper that accesses it from amd_iommu into the PCI subsystem, renaming it to be consistent with the current PCI Express specification (PRPR - PRG Response PASID Required). Cc: bhelgaas@google.com Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- The root complex node in IORT has a bit telling whether it supports ATS or not. Store this bit in the IOMMU fwspec when setting up a device, so it can be accessed later by an IOMMU driver. Use the negative version (NO_ATS) at the moment because it's not clear if/how the bit needs to be integrated in other firmware descriptions. The SMMU has a feature bit telling if it supports ATS, which might be sufficient in most systems for deciding whether or not we should enable the ATS capability in endpoints. Cc: lorenzo.pieralisi@arm.com Cc: hanjun.guo@linaro.org Cc: sudeep.holla@arm.com Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Add stall and pasid properties to iommu_fwspec, and fill them when the dma-can-stall and pasid-bits properties are present in the device tree. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Let users call iommu_sva_device_init() with the IOMMU_SVA_FEAT_IOPF flag, which enables the I/O Page Fault queue. The IOMMU driver checks whether the device supports a form of page fault, in which case it adds the device to a fault queue. If the device doesn't support page faults, the IOMMU driver aborts iommu_sva_device_init(). The fault queue must be flushed before any io_mm is freed, to make sure that its PASID isn't used in any fault queue, and can be reallocated. Add iopf_queue_flush() calls in a few strategic locations. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Some systems allow devices to handle I/O Page Faults in the core mm. For example systems implementing the PCI PRI extension or the Arm SMMU stall model. Infrastructure for reporting these recoverable page faults was recently added to the IOMMU core for SVA virtualisation. Add a page fault handler for host SVA. The IOMMU driver can now instantiate several fault workqueues and link them to IOPF-capable devices. Drivers can choose between a single global workqueue, one per IOMMU device, one per low-level fault queue, one per domain, etc. When it receives a fault event, supposedly in an IRQ handler, the IOMMU driver reports the fault using iommu_report_device_fault(), which calls the registered handler. The page fault handler then calls the mm fault handler, and reports either success or failure with iommu_page_response(). When the handler succeeds, the IOMMU retries the access. The iopf_param pointer could be embedded into iommu_fault_param. But putting iopf_param into the iommu_param structure allows us not to care about ordering between calls to iopf_queue_add_device() and iommu_register_device_fault_handler(). Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- The fault handler will need to find an mm given its PASID. This is the reason we have an IDR for storing address spaces, so hook it up. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- When creating an io_mm structure, register an MMU notifier that informs us when the virtual address space changes and disappears. Add a new operation to the IOMMU driver, mm_invalidate, called when a range of addresses is unmapped to let the IOMMU driver send ATC invalidations. mm_invalidate cannot sleep. Adding the notifier complicates io_mm release. In one case device drivers free the io_mm explicitly by calling unbind (or detaching the device from its domain). In the other case the process could crash before unbind, in which case the release notifier has to do all the work. Allowing the device driver's mm_exit() handler to sleep adds another complication, but it will greatly simplify things for users. For example VFIO can take the IOMMU mutex and remove any trace of io_mm, instead of introducing complex synchronization to delicately handle this race. But relaxing the user side does force unbind() to sleep and wait for all pending mm_exit() calls to finish. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
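A rough sketch of the registration (callback names and the io_mm member are illustrative, not the patch's actual identifiers): the notifier's invalidation hook feeds the new mm_invalidate IOMMU operation, and release covers the process-exits-before-unbind case.

```c
static const struct mmu_notifier_ops io_mm_notifier_ops = {
	.invalidate_range	= io_mm_invalidate_range,	/* -> ops->mm_invalidate() */
	.release		= io_mm_release,		/* mm gone before unbind() */
};

static int io_mm_attach_notifier(struct io_mm *io_mm, struct mm_struct *mm)
{
	io_mm->notifier.ops = &io_mm_notifier_ops;
	return mmu_notifier_register(&io_mm->notifier, mm);
}
```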
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- When an mm exits, devices that were bound to it must stop performing DMA on its PASID. Let device drivers register a callback to be notified on mm exit. Add the callback to the sva_param structure attached to struct device. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Allocate IOMMU mm structures and bind them to devices. Four operations are added to IOMMU drivers: * mm_alloc(): to create an io_mm structure and perform architecture-specific operations required to grab the process (for instance on ARM, pin down the CPU ASID so that the process doesn't get assigned a new ASID on rollover). There is a single valid io_mm structure per Linux mm. Future extensions may also use io_mm for kernel-managed address spaces, populated with map()/unmap() calls instead of bound to process address spaces. This patch focuses on "shared" io_mm. * mm_attach(): attach an mm to a device. The IOMMU driver checks that the device is capable of sharing an address space, and writes the PASID table entry to install the pgd. Some IOMMU drivers will have a single PASID table per domain, for convenience. Others can implement it differently but to help these drivers, mm_attach and mm_detach take 'attach_domain' and 'detach_domain' parameters, that tell whether they need to set and clear the PASID entry or only send the required TLB invalidations. * mm_detach(): detach an mm from a device. The IOMMU driver removes the PASID table entry and invalidates the IOTLBs. * mm_free(): free a structure allocated by mm_alloc(), and let the arch release the process. mm_attach and mm_detach operations are serialized with a spinlock. When trying to optimize this code, we should at least prevent concurrent attach()/detach() on the same domain (so multi-level PASID table code can allocate tables lazily). mm_alloc() can sleep, but mm_free must not (because we'll have to call it from call_srcu later on). At the moment we use an IDR for allocating PASIDs and retrieving contexts. We also use a single spinlock. These can be refined and optimized later (a custom allocator will be needed for top-down PASID allocation). Keeping track of address spaces requires the use of MMU notifiers. Handling process exit with regard to unbind() is tricky, so it is left for another patch and we explicitly fail mm_alloc() for the moment. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Add bind() and unbind() operations to the IOMMU API. Bind() returns a PASID that drivers can program in hardware, to let their devices access an mm. This patch only adds skeletons for the device driver API; most of the implementation is still missing. IOMMU groups with more than one device aren't supported for SVA at the moment. There may be P2P traffic between devices within a group, which cannot be seen by an IOMMU (note that supporting PASID doesn't add any form of isolation with regard to P2P). Supporting groups would require calling bind() for all bound processes every time a device is added to a group, to perform sanity checks (e.g. ensure that new devices support PASIDs at least as big as those already allocated in the group). It also means making sure that reserved ranges (IOMMU_RESV_*) of all devices are carved out of processes. This is already tricky with single devices, but becomes very difficult with groups. Since SVA-capable devices are expected to be cleanly isolated, and since we don't have any way to test groups or hot-plug, we only allow singular groups for now. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jean-Philippe Brucker
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Shared Virtual Addressing (SVA) provides a way for device drivers to bind process address spaces to devices. This requires the IOMMU to support page table formats and features compatible with the CPUs, and usually requires the system to support I/O Page Faults (IOPF) and Process Address Space ID (PASID). When all of these are available, DMA can access virtual addresses of a process. A PASID is allocated for each process, and the device driver programs it into the device in an implementation-specific way. Add a new API for sharing process page tables with devices. Introduce two IOMMU operations, sva_device_init() and sva_device_shutdown(), that prepare the IOMMU driver for SVA. For example allocate PASID tables and fault queues. Subsequent patches will implement the bind() and unbind() operations. Support for I/O Page Faults will be added in a later patch using a new feature bit (IOMMU_SVA_FEAT_IOPF). With the current API users must pin down all shared mappings. Other feature bits that may be added in the future are IOMMU_SVA_FEAT_PRIVATE, to support private PASID address spaces, and IOMMU_SVA_FEAT_NO_PASID, to bind the whole device address space to a process. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jacob Pan
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- When IO page faults are reported outside the IOMMU subsystem, the page request handler may fail for various reasons. E.g. a guest received page requests but did not have a chance to run for a long time. The unresponsive behavior could hold off limited resources on the pending device. There can be hardware or credit based software solutions as suggested in the PCI ATS Ch-4. To provide a basic safety net this patch introduces a per-device deferrable timer which monitors the longest pending page fault that requires a response. Proper action such as sending a failure response code could be taken when the timer expires, but that is not included in this patch. We need to consider the life cycle of page group IDs to prevent confusion with a group ID reused by a device. For now, a warning message provides a clue of such a failure. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
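A sketch of the safety net (all names, members, and the timeout value are illustrative; the real patch may arm and tear the timer down differently): a deferrable timer is attached to the per-device fault parameters and, for now, only warns on expiry.

```c
static void iommu_prq_timeout(struct timer_list *t)
{
	struct iommu_fault_param *param = from_timer(param, t, fault_timer);

	/* Sending a proper failure response is left to a later change;
	 * for now just leave a clue about the unanswered request group. */
	dev_warn(param->dev, "page request pending for too long\n");
}

static void iommu_init_prq_timer(struct iommu_fault_param *param)
{
	timer_setup(&param->fault_timer, iommu_prq_timeout, TIMER_DEFERRABLE);
}

/* Armed whenever a page request that needs a response is queued. */
static void iommu_arm_prq_timer(struct iommu_fault_param *param)
{
	mod_timer(&param->fault_timer, jiffies + msecs_to_jiffies(10 * 1000));
}
```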
-
Submitted by Jacob Pan
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- IO page faults can be handled outside the IOMMU subsystem. For example, when nested translation is turned on and the guest owns the first level page tables, device page requests can be forwarded to the guest for handling faults. As the page response is returned by the guest, the IOMMU driver on the host needs to process the response, which informs the device and completes the page request transaction. This patch introduces a generic API function for page response passing from the guest or other in-kernel users. The definitions of the generic data are based on the PCI ATS specification and are not limited to any vendor. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lkml.org/lkml/2017/12/7/1725 Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jacob Pan
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Traditionally, device specific faults are detected and handled within their own device drivers. When an IOMMU is enabled, faults such as those in DMA-related transactions are detected by the IOMMU. There is no generic reporting mechanism to report faults back to the in-kernel device driver or the guest OS in case of assigned devices. Faults detected by the IOMMU are keyed by the transaction's source ID and can be reported on a per-device basis, regardless of whether the device is a PCI device or not. The fault types include recoverable (e.g. page request) and unrecoverable faults (e.g. access error). In most cases, faults can be handled by IOMMU drivers internally. The primary use cases are as follows: 1. A page request fault originated from an SVM capable device that is assigned to a guest via vIOMMU. In this case, the first level page tables are owned by the guest. The page request must be propagated to the guest to let the guest OS fault in the pages and then send the page response. In this mechanism, the direct receiver of the IOMMU fault notification is VFIO, which can relay notification events to QEMU or other user space software. 2. Faults that need more subtle handling by device drivers. Other than simply invoking a reset function, there is a need to let the device driver handle the fault with a smaller impact. This patchset is intended to create a generic fault report API such that it can scale as follows: - all IOMMU types - PCI and non-PCI devices - recoverable and unrecoverable faults - VFIO and other in-kernel users - DMA & IRQ remapping (TBD) The original idea was brought up by David Woodhouse and discussions summarized at https://lwn.net/Articles/608914/. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
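From a consumer's point of view (VFIO or a device driver), usage might look like the sketch below, assuming the registration API described above; the handler signature, its return convention, and the my_* names are illustrative rather than the patchset's exact definitions.

```c
static int my_iommu_fault_handler(struct iommu_fault_event *evt, void *data)
{
	struct my_driver *drv = data;

	/* Recoverable faults (page requests) could be relayed to a guest via
	 * VFIO/QEMU; unrecoverable ones handled with something gentler than
	 * a full device reset. */
	return my_handle_fault(drv, evt);
}

static int my_enable_fault_reporting(struct device *dev, struct my_driver *drv)
{
	return iommu_register_device_fault_handler(dev, my_iommu_fault_handler, drv);
}
```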
-
Submitted by Jacob Pan
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- DMA faults can be detected by the IOMMU at device level. Adding a pointer to struct device allows the IOMMU subsystem to report relevant faults back to the device driver for further handling. For a direct assigned device (or user space drivers), the guest OS holds responsibility to handle and respond per device IOMMU fault. Therefore we need a fault reporting mechanism to propagate faults beyond the IOMMU subsystem. There are two other IOMMU data pointers under struct device today; here we introduce iommu_param as a parent pointer such that all device IOMMU data can be consolidated here. The idea was suggested here by Greg KH and Joerg. The name iommu_param is chosen here since iommu_data has been used. Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Link: https://lkml.org/lkml/2017/10/6/81 Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jacob Pan
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- Device faults detected by the IOMMU can be reported outside the IOMMU subsystem for further processing. This patch intends to provide generic device fault data such that device drivers can be informed of IOMMU faults without model-specific knowledge. The proposed format is the result of discussion at: https://lkml.org/lkml/2017/11/10/291 Part of the code is based on Jean-Philippe Brucker's patchset (https://patchwork.kernel.org/patch/9989315/). The assumption is that a model-specific IOMMU driver can filter and handle most of the internal faults if the cause is within IOMMU driver control. Therefore, the fault reasons that can be reported are grouped and generalized based on common specifications such as PCI ATS. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Liu, Yi L
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- When an SVM capable device is assigned to a guest, the first level page tables are owned by the guest and the guest PASID table pointer is linked to the device context entry of the physical IOMMU. The host IOMMU driver has no knowledge of caching structure updates unless the guest invalidation activities are passed down to the host. The primary usage is derived from an emulated IOMMU in the guest, where QEMU can trap invalidation activities before passing them down to the host/physical IOMMU. Since the invalidation data are obtained from user space and will be written into the physical IOMMU, we must allow security checks at various layers. Therefore, a generic invalidation data format is proposed here; model-specific IOMMU drivers need to convert it into their own format. Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jacob Pan
hulk inclusion category: feature bugzilla: 14369 CVE: NA ------------------- A virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use in the guest: https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html As part of the proposed architecture, when an SVM capable PCI device is assigned to a guest, nested mode is turned on. The guest owns the first level page tables (request with PASID), which perform GVA->GPA translation. Second level page tables are owned by the host for GPA->HPA translation for both requests with and without PASID. A new IOMMU driver interface is therefore needed to perform tasks as follows: * Enable nested translation and appropriate translation type * Assign guest PASID table pointer (in GPA) and size to host IOMMU This patch introduces new API functions to perform bind/unbind of guest PASID tables. Based on common data, model-specific IOMMU drivers can be extended to perform the specific steps for binding the PASID table of assigned devices. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> [Backported to 4.19 -add SPDX-License-Identifier] Signed-off-by: Fang Lijun <fanglijun3@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Paolo Bonzini
[ Upstream commit 1d487e9bf8ba66a7174c56a0029c54b1eca8f99c ] These were found with smatch, and then generalized when applicable. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Jian-Hong Pan
[ Upstream commit 0082517fa4bce073e7cf542633439f26538a14cc ] Upon reboot, the Acer TravelMate X514-51T laptop appears to complete the shutdown process, but then it hangs in BIOS POST with a black screen. The problem is intermittent - at some points it has appeared related to Secure Boot settings or different kernel builds, but ultimately we have not been able to identify the exact conditions that trigger the issue to come and go. Besides, the EFI mode cannot be disabled in the BIOS of this model. However, after extensive testing, we observe that using the EFI reboot method reliably avoids the issue in all cases. So add a boot time quirk to use EFI reboot on such systems. Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=203119 Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: Daniel Drake <drake@endlessm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Cc: linux@endlessm.com Link: http://lkml.kernel.org/r/20190412080152.3718-1-jian-hong@endlessm.com [ Fix !CONFIG_EFI build failure, clarify the code and the changelog a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
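The quirk itself is a conventional DMI match, roughly as sketched below (simplified, condensed from the description above):

```c
static int __init set_efi_reboot(const struct dmi_system_id *d)
{
	if (reboot_type != BOOT_EFI && !efi_runtime_disabled()) {
		reboot_type = BOOT_EFI;
		pr_info("%s: selecting EFI reboot method\n", d->ident);
	}
	return 0;
}

static const struct dmi_system_id reboot_dmi_table[] __initconst = {
	{	/* Hangs in BIOS POST after a normal ACPI reboot */
		.callback = set_efi_reboot,
		.ident    = "Acer TravelMate X514-51T",
		.matches  = {
			DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
			DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate X514-51T"),
		},
	},
	{ }
};
```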
-
Submitted by Jens Axboe
commit 77f1e0a52d26242b6c2dba019f6ebebfb9ff701e upstream A previous commit moved the shallow depth and BFQ depth map calculations to be done at init time, moving it outside of the hotter IO path. This potentially causes hangs if the user changes the depth of the scheduler map, by writing to the 'nr_requests' sysfs file for that device. Add a blk-mq-sched hook that allows blk-mq to inform the scheduler if the depth changes, so that the scheduler can update its internal state. Signed-off-by: Eric Wheeler <bfq@linux.ewheeler.net> Tested-by: Kai Krakow <kai@kaishome.de> Reported-by: Paolo Valente <paolo.valente@linaro.org> Fixes: f0635b8a ("bfq: calculate shallow depths at init time") Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Josh Poimboeuf
commit 98af8452945c55652de68536afdde3b520fec429 upstream Keeping track of the number of mitigations for all the CPU speculation bugs has become overwhelming for many users. It's getting more and more complicated to decide which mitigations are needed for a given architecture. Complicating matters is the fact that each arch tends to have its own custom way to mitigate the same vulnerability. Most users fall into a few basic categories: a) they want all mitigations off; b) they want all reasonable mitigations on, with SMT enabled even if it's vulnerable; or c) they want all reasonable mitigations on, with SMT disabled if vulnerable. Define a set of curated, arch-independent options, each of which is an aggregation of existing options: - mitigations=off: Disable all mitigations. - mitigations=auto: [default] Enable all the default mitigations, but leave SMT enabled, even if it's vulnerable. - mitigations=auto,nosmt: Enable all the default mitigations, disabling SMT if needed by a mitigation. Currently, these options are placeholders which don't actually do anything. They will be fleshed out in upcoming patches. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/b07a8ef9b7c5055c3a4637c87d07c296d5016fe0.1555085500.git.jpoimboe@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
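At this stage the new options reduce to a single arch-independent switch that later patches consult, roughly as sketched here (simplified):

```c
enum cpu_mitigations {
	CPU_MITIGATIONS_OFF,
	CPU_MITIGATIONS_AUTO,
	CPU_MITIGATIONS_AUTO_NOSMT,
};

static enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;

static int __init mitigations_parse_cmdline(char *arg)
{
	if (!strcmp(arg, "off"))
		cpu_mitigations = CPU_MITIGATIONS_OFF;
	else if (!strcmp(arg, "auto"))
		cpu_mitigations = CPU_MITIGATIONS_AUTO;
	else if (!strcmp(arg, "auto,nosmt"))
		cpu_mitigations = CPU_MITIGATIONS_AUTO_NOSMT;

	return 0;
}
early_param("mitigations", mitigations_parse_cmdline);
```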
-
Submitted by Thomas Gleixner
commit 8a4b06d391b0a42a373808979b5028f5c84d9c6a upstream Add the sysfs reporting file for MDS. It exposes the vulnerability and mitigation state similar to the existing files for the other speculative hardware vulnerabilities. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-
Submitted by Lijun Fang
euler inclusion category: feature bugzilla: 11082 CVE: NA ----------------- CDM nodes should not be part of mems_allowed. However, allocating from a CDM node must be allowed when mpol->mode is MPOL_BIND. Signed-off-by: Lijun Fang <fanglijun3@huawei.com> Reviewed-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
-