提交 · 5f77d6ca5ca74e4b4a5e2e010f7ff50c45dea326 · openeuler / Kernel

24 7月, 2020 1 次提交

iommu/vt-d: Enforce PASID devTLB field mask · 5f77d6ca

由 Liu Yi L 提交于 7月 24, 2020

Set proper masks to avoid invalid input spillover to reserved bits.
Signed-off-by: NLiu Yi L <yi.l.liu@intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200724014925.15523-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

5f77d6ca

28 6月, 2020 2 次提交

smp, irq_work: Continue smp_call_function*() and irq_work*() integration · 8c4890d1

由 Peter Zijlstra 提交于 6月 22, 2020

Instead of relying on BUG_ON() to ensure the various data structures
line up, use a bunch of horrible unions to make it all automatic.

Much of the union magic is to ensure irq_work and smp_call_function do
not (yet) see the members of their respective data structures change
name.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Reviewed-by: NFrederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/20200622100825.844455025@infradead.org

8c4890d1

sched/core: Fix CONFIG_GCC_PLUGIN_RANDSTRUCT build fail · 4f311afc

由 Peter Zijlstra 提交于 6月 10, 2020

As a temporary build fix, the proper cleanup needs more work.
Reported-by: NGuenter Roeck <linux@roeck-us.net>
Reported-by: NEric Biggers <ebiggers@kernel.org>
Suggested-by: NEric Biggers <ebiggers@kernel.org>
Suggested-by: NKees Cook <keescook@chromium.org>
Fixes: a1488664 ("sched: Replace rq::wake_list")
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4f311afc

26 6月, 2020 3 次提交

kdb: Switch to use safer dbg_io_ops over console APIs · 5946d1f5

由 Sumit Garg 提交于 6月 04, 2020

In kgdb context, calling console handlers aren't safe due to locks used
in those handlers which could in turn lead to a deadlock. Although, using
oops_in_progress increases the chance to bypass locks in most console
handlers but it might not be sufficient enough in case a console uses
more locks (VT/TTY is good example).

Currently when a driver provides both polling I/O and a console then kdb
will output using the console. We can increase robustness by using the
currently active polling I/O driver (which should be lockless) instead
of the corresponding console. For several common cases (e.g. an
embedded system with a single serial port that is used both for console
output and debugger I/O) this will result in no console handler being
used.

In order to achieve this we need to reverse the order of preference to
use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
store "struct console" that represents debugger I/O in dbg_io_ops and
while emitting kdb messages, skip console that matches dbg_io_ops
console in order to avoid duplicate messages. After this change,
"is_console" param becomes redundant and hence removed.
Suggested-by: NDaniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: NSumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/1591264879-25920-5-git-send-email-sumit.garg@linaro.orgReviewed-by: NDouglas Anderson <dianders@chromium.org>
Reviewed-by: NPetr Mladek <pmladek@suse.com>
Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>

5946d1f5

mm: remove vmalloc_exec · 7a0e27b2

由 Christoph Hellwig 提交于 6月 25, 2020

Merge vmalloc_exec into its only caller.  Note that for !CONFIG_MMU
__vmalloc_node_range maps to __vmalloc, which directly clears the
__GFP_HIGHMEM added by the vmalloc_exec stub anyway.

Link: http://lkml.kernel.org/r/20200618064307.32739-4-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7a0e27b2

mm: workingset: age nonresident information alongside anonymous pages · 31d8fcac

由 Johannes Weiner 提交于 6月 25, 2020

Patch series "fix for "mm: balance LRU lists based on relative
thrashing" patchset"

This patchset fixes some problems of the patchset, "mm: balance LRU
lists based on relative thrashing", which is now merged on the mainline.

Patch "mm: workingset: let cache workingset challenge anon fix" is the
result of discussion with Johannes.  See following link.

  http://lkml.kernel.org/r/20200520232525.798933-6-hannes@cmpxchg.org

And, the other two are minor things which are found when I try to rebase
my patchset.

This patch (of 3):

After ("mm: workingset: let cache workingset challenge anon fix"), we
compare refault distances to active_file + anon.  But age of the
non-resident information is only driven by the file LRU.  As a result,
we may overestimate the recency of any incoming refaults and activate
them too eagerly, causing unnecessary LRU churn in certain situations.

Make anon aging drive nonresident age as well to address that.

Link: http://lkml.kernel.org/r/1592288204-27734-1-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1592288204-27734-2-git-send-email-iamjoonsoo.kim@lge.com
Fixes: 34e58cac ("mm: workingset: let cache workingset challenge anon")
Reported-by: NJoonsoo Kim <js1304@gmail.com>
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

31d8fcac

25 6月, 2020 5 次提交

rcu: Fixup noinstr warnings · b58e733f

由 Peter Zijlstra 提交于 6月 15, 2020

A KCSAN build revealed we have explicit annoations through atomic_*()
usage, switch to arch_atomic_*() for the respective functions.

vmlinux.o: warning: objtool: rcu_nmi_exit()+0x4d: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_enter()+0x25: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_nmi_enter()+0x4f: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_exit()+0x2a: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: __rcu_is_watching()+0x25: call to __kcsan_check_access() leaves .noinstr.text section

Additionally, without the NOP in instrumentation_begin(), objtool would
not detect the lack of the 'else instrumentation_begin();' branch in
rcu_nmi_enter().
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NPaul E. McKenney <paulmck@kernel.org>

b58e733f

locking/atomics: Provide the arch_atomic_ interface to generic code · 5faafd56

由 Peter Zijlstra 提交于 6月 25, 2020

Architectures with instrumented (KASAN/KCSAN) atomic operations
natively provide arch_atomic_ variants that are not instrumented.

It turns out that some generic code also requires arch_atomic_ in
order to avoid instrumentation, so provide the arch_atomic_ interface
as a direct map into the regular atomic_ interface for
non-instrumented architectures.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NPaul E. McKenney <paulmck@kernel.org>

5faafd56

netfilter: ip6tables: Split ip6t_unregister_table() into pre_exit and exit helpers. · 57ea5f18

由 David Wilder 提交于 6月 22, 2020

The pre_exit will un-register the underlying hook and .exit will do
the table freeing. The netns core does an unconditional synchronize_rcu
after the pre_exit hooks insuring no packets are in flight that have
picked up the pointer before completing the un-register.

Fixes: b9e69e12 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: NDavid Wilder <dwilder@us.ibm.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

57ea5f18

netfilter: iptables: Split ipt_unregister_table() into pre_exit and exit helpers. · 1cbf9098

由 David Wilder 提交于 6月 22, 2020

The pre_exit will un-register the underlying hook and .exit will do the
table freeing. The netns core does an unconditional synchronize_rcu after
the pre_exit hooks insuring no packets are in flight that have picked up
the pointer before completing the un-register.

Fixes: b9e69e12 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: NDavid Wilder <dwilder@us.ibm.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

1cbf9098

net: phy: make phy_disable_interrupts() non-static · 3dd4ef1b

由 Jisheng Zhang 提交于 6月 24, 2020

We face an issue with rtl8211f, a pin is shared between INTB and PMEB,
and the PHY Register Accessible Interrupt is enabled by default, so
the INTB/PMEB pin is always active in polling mode case.

As Heiner pointed out "I was thinking about calling
phy_disable_interrupts() in phy_init_hw(), to have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state."

Make phy_disable_interrupts() non-static so that it could be used in
phy_init_hw() to have a defined init state.
Suggested-by: NHeiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: NJisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3dd4ef1b

24 6月, 2020 2 次提交

scsi: libata: Fix the ata_scsi_dma_need_drain stub · aad4b4d1

由 Christoph Hellwig 提交于 6月 20, 2020

We not only need the stub when libata is disabled, but also if it is
modular and there are built-in SAS drivers (which can happen when
SCSI_SAS_ATA is disabled).

Link: https://lore.kernel.org/r/20200620071302.462974-2-hch@lst.de
Fixes: b8f1d1e0 ("scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers")
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

aad4b4d1

net: qed: fix left elements count calculation · 97dd1abd

由 Alexander Lobakin 提交于 6月 23, 2020

qed_chain_get_element_left{,_u32} returned 0 when the difference
between producer and consumer page count was equal to the total
page count.
Fix this by conditional expanding of producer value (vs
unconditional). This allowed to eliminate normalizaton against
total page count, which was the cause of this bug.

Misc: replace open-coded constants with common defines.

Fixes: a91eb52a ("qed: Revisit chain implementation")
Signed-off-by: NAlexander Lobakin <alobakin@marvell.com>
Signed-off-by: NIgor Russkikh <irusskikh@marvell.com>
Signed-off-by: NMichal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97dd1abd

23 6月, 2020 1 次提交

iommu/vt-d: Set U/S bit in first level page table by default · 16ecf10e

由 Lu Baolu 提交于 6月 23, 2020

When using first-level translation for IOVA, currently the U/S bit in the
page table is cleared which implies DMA requests with user privilege are
blocked. As the result, following error messages might be observed when
passing through a device to user level:

DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Read] Request device [41:00.0] PASID 1 fault addr 7ecdcd000
        [fault reason 129] SM: U/S set 0 for first-level translation
        with user privilege

This fixes it by setting U/S bit in the first level page table and makes
IOVA over first level compatible with previous second-level translation.

Fixes: b802d070 ("iommu/vt-d: Use iova over first level")
Reported-by: NXin Zeng <xin.zeng@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-3-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

16ecf10e

20 6月, 2020 1 次提交

mm: Allow arches to provide ptep_get() · 481e980a

由 Christophe Leroy 提交于 6月 15, 2020

Since commit 9e343b46 ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses") it is not possible anymore to
use READ_ONCE() to access complex page table entries like the one
defined for powerpc 8xx with 16k size pages.

Define a ptep_get() helper that architectures can override instead
of performing a READ_ONCE() on the page table entry pointer.

Fixes: 9e343b46 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: NWill Deacon <will@kernel.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/087fa12b6e920e32315136b998aa834f99242695.1592225558.git.christophe.leroy@csgroup.eu

481e980a

19 6月, 2020 5 次提交

i2c: remove deprecated i2c_new_device API · 390fd047

由 Wolfram Sang 提交于 6月 15, 2020

All in-tree users have been converted to the new i2c_new_client_device
function, so remove this deprecated one.
Signed-off-by: NWolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: NWolfram Sang <wsa@kernel.org>

390fd047

net: core: reduce recursion limit value · fb7861d1

由 Taehee Yoo 提交于 6月 16, 2020

In the current code, ->ndo_start_xmit() can be executed recursively only
10 times because of stack memory.
But, in the case of the vxlan, 10 recursion limit value results in
a stack overflow.
In the current code, the nested interface is limited by 8 depth.
There is no critical reason that the recursion limitation value should
be 10.
So, it would be good to be the same value with the limitation value of
nesting interface depth.

Test commands:
    ip link add vxlan10 type vxlan vni 10 dstport 4789 srcport 4789 4789
    ip link set vxlan10 up
    ip a a 192.168.10.1/24 dev vxlan10
    ip n a 192.168.10.2 dev vxlan10 lladdr fc:22:33:44:55:66 nud permanent

    for i in {9..0}
    do
        let A=$i+1
	ip link add vxlan$i type vxlan vni $i dstport 4789 srcport 4789 4789
	ip link set vxlan$i up
	ip a a 192.168.$i.1/24 dev vxlan$i
	ip n a 192.168.$i.2 dev vxlan$i lladdr fc:22:33:44:55:66 nud permanent
	bridge fdb add fc:22:33:44:55:66 dev vxlan$A dst 192.168.$i.2 self
    done
    hping3 192.168.10.2 -2 -d 60000

Splat looks like:
[  103.814237][ T1127] =============================================================================
[  103.871955][ T1127] BUG kmalloc-2k (Tainted: G    B            ): Padding overwritten. 0x00000000897a2e4f-0x000
[  103.873187][ T1127] -----------------------------------------------------------------------------
[  103.873187][ T1127]
[  103.874252][ T1127] INFO: Slab 0x000000005cccc724 objects=5 used=5 fp=0x0000000000000000 flags=0x10000000001020
[  103.881323][ T1127] CPU: 3 PID: 1127 Comm: hping3 Tainted: G    B             5.7.0+ #575
[  103.882131][ T1127] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  103.883006][ T1127] Call Trace:
[  103.883324][ T1127]  dump_stack+0x96/0xdb
[  103.883716][ T1127]  slab_err+0xad/0xd0
[  103.884106][ T1127]  ? _raw_spin_unlock+0x1f/0x30
[  103.884620][ T1127]  ? get_partial_node.isra.78+0x140/0x360
[  103.885214][ T1127]  slab_pad_check.part.53+0xf7/0x160
[  103.885769][ T1127]  ? pskb_expand_head+0x110/0xe10
[  103.886316][ T1127]  check_slab+0x97/0xb0
[  103.886763][ T1127]  alloc_debug_processing+0x84/0x1a0
[  103.887308][ T1127]  ___slab_alloc+0x5a5/0x630
[  103.887765][ T1127]  ? pskb_expand_head+0x110/0xe10
[  103.888265][ T1127]  ? lock_downgrade+0x730/0x730
[  103.888762][ T1127]  ? pskb_expand_head+0x110/0xe10
[  103.889244][ T1127]  ? __slab_alloc+0x3e/0x80
[  103.889675][ T1127]  __slab_alloc+0x3e/0x80
[  103.890108][ T1127]  __kmalloc_node_track_caller+0xc7/0x420
[ ... ]

Fixes: 11a766ce ("net: Increase xmit RECURSION_LIMIT to 10.")
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fb7861d1

maccess: make get_kernel_nofault() check for minimal type compatibility · 0c389d89

由 Linus Torvalds 提交于 6月 18, 2020

Now that we've renamed probe_kernel_address() to get_kernel_nofault()
and made it look and behave more in line with get_user(), some of the
subtle type behavior differences end up being more obvious and possibly
dangerous.

When you do

        get_user(val, user_ptr);

the type of the access comes from the "user_ptr" part, and the above
basically acts as

        val = *user_ptr;

by design (except, of course, for the fact that the actual dereference
is done with a user access).

Note how in the above case, the type of the end result comes from the
pointer argument, and then the value is cast to the type of 'val' as
part of the assignment.

So the type of the pointer is ultimately the more important type both
for the access itself.

But 'get_kernel_nofault()' may now _look_ similar, but it behaves very
differently.  When you do

        get_kernel_nofault(val, kernel_ptr);

it behaves like

        val = *(typeof(val) *)kernel_ptr;

except, of course, for the fact that the actual dereference is done with
exception handling so that a faulting access is suppressed and returned
as the error code.

But note how different the casting behavior of the two superficially
similar accesses are: one does the actual access in the size of the type
the pointer points to, while the other does the access in the size of
the target, and ignores the pointer type entirely.

Actually changing get_kernel_nofault() to act like get_user() is almost
certainly the right thing to do eventually, but in the meantime this
patch adds logit to at least verify that the pointer type is compatible
with the type of the result.

In many cases, this involves just casting the pointer to 'void *' to
make it obvious that the type of the pointer is not the important part.
It's not how 'get_user()' acts, but at least the behavioral difference
is now obvious and explicit.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0c389d89

maccess: rename probe_kernel_address to get_kernel_nofault · 25f12ae4

由 Christoph Hellwig 提交于 6月 17, 2020

Better describe what this helper does, and match the naming of
copy_from_kernel_nofault.

Also switch the argument order around, so that it acts and looks
like get_user().
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

25f12ae4

sparse: use identifiers to define address spaces · 670d0a4b

由 Luc Van Oostenryck 提交于 6月 18, 2020

Currently, address spaces in warnings are displayed as '<asn:X>' with
'X' being the address space's arbitrary number.

But since sparse v0.6.0-rc1 (late December 2018), sparse allows you to
define the address spaces using an identifier instead of a number.  This
identifier is then directly used in the warnings.

So, use the identifiers '__user', '__iomem', '__percpu' & '__rcu' for
the corresponding address spaces.  The default address space, __kernel,
being not displayed in warnings, stays defined as '0'.

With this change, warnings that used to be displayed as:

	cast removes address space '<asn:1>' of expression
	... void [noderef] <asn:2> *

will now be displayed as:

	cast removes address space '__user' of expression
	... void [noderef] __iomem *

This also moves the __kernel annotation to be the first one, since it is
quite different from the others because it's the default one, and so:

 - it's never displayed

 - it's normally not needed, nor in type annotations, nor in cast
   between address spaces. The only time it's needed is when it's
   combined with a typeof to express "the same type as this one but
   without the address space"

 - it can't be defined with a name, '0' must be used.

So, it seemed strange to me to have it in the middle of the other
ones.
Signed-off-by: NLuc Van Oostenryck <luc.vanoostenryck@gmail.com>
Acked-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

670d0a4b

18 6月, 2020 6 次提交

block: make function 'kill_bdev' static · 3373a346

由 Zheng Bin 提交于 6月 18, 2020

kill_bdev does not have any external user, so make it static.
Signed-off-by: NZheng Bin <zhengbin13@huawei.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <bvanassche@acm.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3373a346

libata: Use per port sync for detach · b5292111

由 Kai-Heng Feng 提交于 6月 03, 2020

Commit 130f4caf ("libata: Ensure ata_port probe has completed before
detach") may cause system freeze during suspend.

Using async_synchronize_full() in PM callbacks is wrong, since async
callbacks that are already scheduled may wait for not-yet-scheduled
callbacks, causes a circular dependency.

Instead of using big hammer like async_synchronize_full(), use async
cookie to make sure port probe are synced, without affecting other
scheduled PM callbacks.

Fixes: 130f4caf ("libata: Ensure ata_port probe has completed before detach")
Suggested-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NKai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: NJohn Garry <john.garry@huawei.com>
BugLink: https://bugs.launchpad.net/bugs/1867983Signed-off-by: NJens Axboe <axboe@kernel.dk>

b5292111

RDMA/mlx5: Add missed RST2INIT and INIT2INIT steps during ECE handshake · ab183d46

由 Leon Romanovsky 提交于 6月 16, 2020

Missed steps during ECE handshake left userspace application with less
options for the ECE handshake. Pass ECE options in the additional
transitions.

Fixes: 50aec2c3 ("RDMA/mlx5: Return ECE data after modify QP")
Link: https://lore.kernel.org/r/20200616104536.2426384-1-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

ab183d46

timekeeping: Fix kerneldoc system_device_crosststamp & al · f097eb38

由 Kurt Kanzenbach 提交于 6月 09, 2020

Make kernel doc comments actually work and fix the syncronized typo.

[ tglx: Added the missing /** and fixed up formatting ]
Signed-off-by: NKurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200609081726.5657-1-kurt@linutronix.de

f097eb38

maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault · c0ee37e8

由 Christoph Hellwig 提交于 6月 17, 2020

Better describe what these functions do.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c0ee37e8

maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault · fe557319

由 Christoph Hellwig 提交于 6月 17, 2020

Better describe what these functions do.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fe557319

17 6月, 2020 5 次提交

efi/libstub: arm: Print CPU boot mode and MMU state at boot · 2a55280a

由 Ard Biesheuvel 提交于 6月 07, 2020

On 32-bit ARM, we may boot at HYP mode, or with the MMU and caches off
(or both), even though the EFI spec does not actually support this.
While booting at HYP mode is something we might tolerate, fiddling
with the caches is a more serious issue, as disabling the caches is
tricky to do safely from C code, and running without the Dcache makes
it impossible to support unaligned memory accesses, which is another
explicit requirement imposed by the EFI spec.

So take note of the CPU mode and MMU state in the EFI stub diagnostic
output so that we can easily diagnose any issues that may arise from
this. E.g.,

  EFI stub: Entering in SVC mode with MMU enabled

Also, capture the CPSR and SCTLR system register values at EFI stub
entry, and after ExitBootServices() returns, and check whether the
MMU and Dcache were disabled at any point. If this is the case, a
diagnostic message like the following will be emitted:

  efi: [Firmware Bug]: EFI stub was entered with MMU and Dcache disabled, please fix your firmware!
  efi: CPSR at EFI stub entry        : 0x600001d3
  efi: SCTLR at EFI stub entry       : 0x00c51838
  efi: CPSR after ExitBootServices() : 0x600001d3
  efi: SCTLR after ExitBootServices(): 0x00c50838
Signed-off-by: NArd Biesheuvel <ardb@kernel.org>
Reviewed-by: NLeif Lindholm <leif@nuviainc.com>

2a55280a

C
dma-direct: mark __dma_direct_alloc_pages static · 26749b32
由 Christoph Hellwig 提交于 6月 15, 2020
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
```
26749b32

overflow.h: Add flex_array_size() helper · b19d57d0

由 Gustavo A. R. Silva 提交于 6月 08, 2020

Add flex_array_size() helper for the calculation of the size, in bytes,
of a flexible array member contained within an enclosing structure.

Example of usage:

struct something {
	size_t count;
	struct foo items[];
};

struct something *instance;

instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;
memcpy(instance->items, src, flex_array_size(instance, items, instance->count));

The helper returns SIZE_MAX on overflow instead of wrapping around.

Additionally replaces parameter "n" with "count" in struct_size() helper
for greater clarity and unification.
Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20200609012233.GA3371@embeddedorSigned-off-by: NKees Cook <keescook@chromium.org>

b19d57d0

kretprobe: Prevent triggering kretprobe from within kprobe_flush_task · 9b38cc70

由 Jiri Olsa 提交于 5月 12, 2020

Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
My test was also able to trigger lockdep output:

 ============================================
 WARNING: possible recursive locking detected
 5.6.0-rc6+ #6 Not tainted
 --------------------------------------------
 sched-messaging/2767 is trying to acquire lock:
 ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0

 but task is already holding lock:
 ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&(kretprobe_table_locks[i].lock));
   lock(&(kretprobe_table_locks[i].lock));

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 1 lock held by sched-messaging/2767:
  #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

 stack backtrace:
 CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
 Call Trace:
  dump_stack+0x96/0xe0
  __lock_acquire.cold.57+0x173/0x2b7
  ? native_queued_spin_lock_slowpath+0x42b/0x9e0
  ? lockdep_hardirqs_on+0x590/0x590
  ? __lock_acquire+0xf63/0x4030
  lock_acquire+0x15a/0x3d0
  ? kretprobe_hash_lock+0x52/0xa0
  _raw_spin_lock_irqsave+0x36/0x70
  ? kretprobe_hash_lock+0x52/0xa0
  kretprobe_hash_lock+0x52/0xa0
  trampoline_handler+0xf8/0x940
  ? kprobe_fault_handler+0x380/0x380
  ? find_held_lock+0x3a/0x1c0
  kretprobe_trampoline+0x25/0x50
  ? lock_acquired+0x392/0xbc0
  ? _raw_spin_lock_irqsave+0x50/0x70
  ? __get_valid_kprobe+0x1f0/0x1f0
  ? _raw_spin_unlock_irqrestore+0x3b/0x40
  ? finish_task_switch+0x4b9/0x6d0
  ? __switch_to_asm+0x34/0x70
  ? __switch_to_asm+0x40/0x70

The code within the kretprobe handler checks for probe reentrancy,
so we won't trigger any _raw_spin_lock_irqsave probe in there.

The problem is in outside kprobe_flush_task, where we call:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave

where _raw_spin_lock_irqsave triggers the kretprobe and installs
kretprobe_trampoline handler on _raw_spin_lock_irqsave return.

The kretprobe_trampoline handler is then executed with already
locked kretprobe_table_locks, and first thing it does is to
lock kretprobe_table_locks ;-) the whole lockup path like:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed

        ---> kretprobe_table_locks locked

        kretprobe_trampoline
          trampoline_handler
            kretprobe_hash_lock(current, &head, &flags);  <--- deadlock

Adding kprobe_busy_begin/end helpers that mark code with fake
probe installed to prevent triggering of another kprobe within
this code.

Using these helpers in kprobe_flush_task, so the probe recursion
protection check is hit and the probe is never set to prevent
above lockup.

Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2

Fixes: ef53d9c5 ("kprobes: improve kretprobe scalability with hashed locking")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Reported-by: N"Ziqian SUN (Zamir)" <zsun@redhat.com>
Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>

9b38cc70

gpu: host1x: Correct trivial kernel-doc inconsistencies · 2fd2bc7f

由 Colton Lewis 提交于 6月 14, 2020

Silence documentation build warnings by adding kernel-doc fields.

./include/linux/host1x.h:69: warning: Function parameter or member 'parent' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'usecount' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'lock' not described in 'host1x_client'
Signed-off-by: NColton Lewis <colton.w.lewis@protonmail.com>
Signed-off-by: NThierry Reding <treding@nvidia.com>

2fd2bc7f

16 6月, 2020 9 次提交

libceph: move away from global osd_req_flags · 22d2cfdf

由 Ilya Dryomov 提交于 6月 04, 2020

osd_req_flags is overly general and doesn't suit its only user
(read_from_replica option) well:

- applying osd_req_flags in account_request() affects all OSD
  requests, including linger (i.e. watch and notify).  However,
  linger requests should always go to the primary even though
  some of them are reads (e.g. notify has side effects but it
  is a read because it doesn't result in mutation on the OSDs).

- calls to class methods that are reads are allowed to go to
  the replica, but most such calls issued for "rbd map" and/or
  exclusive lock transitions are requested to be resent to the
  primary via EAGAIN, doubling the latency.

Get rid of global osd_req_flags and set read_from_replica flag
only on specific OSD requests instead.

Fixes: 8ad44d5e ("libceph: read_from_replica option")
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>

22d2cfdf

compiler_attributes.h: Support no_sanitize_undefined check with GCC 4 · 33aea07f

由 Marco Elver 提交于 6月 16, 2020

UBSAN is supported since GCC 4.9, which unfortunately did not yet have
__has_attribute(). To work around, the __GCC4_has_attribute workaround
requires defining which compiler version supports the given attribute.

In the case of no_sanitize_undefined, it is the first version that
supports UBSAN, which is GCC 4.9.
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NMarco Elver <elver@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Link: https://lkml.kernel.org/r/20200615231529.GA119644@google.com

33aea07f

tifm: Replace zero-length array with flexible-array · 5cab1634

由 Gustavo A. R. Silva 提交于 5月 28, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

5cab1634

sctp: Replace zero-length array with flexible-array · af6bb61c

由 Gustavo A. R. Silva 提交于 5月 28, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

af6bb61c

libata: Replace zero-length array with flexible-array · 9c5fbf05

由 Gustavo A. R. Silva 提交于 5月 28, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

9c5fbf05

kprobes: Replace zero-length array with flexible-array · 67a862a9

由 Gustavo A. R. Silva 提交于 5月 28, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

67a862a9

kexec: Replace zero-length array with flexible-array · 50b6951f

由 Gustavo A. R. Silva 提交于 5月 28, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

50b6951f

KVM: Replace zero-length array with flexible-array · 764e515f

由 Gustavo A. R. Silva 提交于 5月 28, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

764e515f

FS-Cache: Replace zero-length array with flexible-array · 67cd4624

由 Gustavo A. R. Silva 提交于 5月 28, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

67cd4624

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功