Commits · 2edc78b9a4b868d7bfee4f87ea29f2df19b6e955 · openeuler / Kernel

26 Feb, 2020 1 commit

icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n · a8e41f60

Authored by Jason A. Donenfeld on Feb 25, 2020

The icmpv6_send function has long had a static inline implementation
with an empty body for CONFIG_IPV6=n, so that code calling it doesn't
need to be ifdef'd. The new icmpv6_ndo_send function, which is intended
for drivers as a drop-in replacement with an identical function
signature, should follow the same pattern. Without this patch, drivers
that used to work with CONFIG_IPV6=n now result in a linker error.

Cc: Chen Zhou <chenzhou10@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 0b41713b ("icmp: introduce helper for nat'd source address in network device context")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8e41f60
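
The stub pattern described in the commit above can be sketched in plain C. This is only an illustration, not the kernel's header: DEMO_HAVE_IPV6, struct demo_skb, demo_icmpv6_ndo_send() and demo_driver_report_error() are hypothetical names standing in for CONFIG_IPV6, struct sk_buff, icmpv6_ndo_send() and a driver call site.

```c
/* Hypothetical feature switch standing in for CONFIG_IPV6; it is left
 * undefined here to model a CONFIG_IPV6=n build. */
/* #define DEMO_HAVE_IPV6 1 */

struct demo_skb { int len; };   /* stand-in for struct sk_buff */

#ifdef DEMO_HAVE_IPV6
/* Real implementation, provided by another translation unit. */
void demo_icmpv6_ndo_send(struct demo_skb *skb, unsigned char type,
                          unsigned char code, unsigned int info);
#else
/* Empty stub with the same signature: callers need no #ifdef and the
 * linker never has to resolve the missing symbol. */
static inline void demo_icmpv6_ndo_send(struct demo_skb *skb,
                                        unsigned char type,
                                        unsigned char code,
                                        unsigned int info)
{
    (void)skb; (void)type; (void)code; (void)info;
}
#endif

/* Driver-style caller that compiles and links in both configurations. */
static int demo_driver_report_error(struct demo_skb *skb)
{
    demo_icmpv6_ndo_send(skb, 1, 0, 0);
    return skb->len;
}
```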

25 Feb, 2020 1 commit

blktrace: Protect q->blk_trace with RCU · c780e86d

Authored by Jan Kara on Feb 06, 2020

KASAN is reporting that __blk_add_trace() has a use-after-free issue
when accessing q->blk_trace. Indeed the switching of block tracing (and
thus eventual freeing of q->blk_trace) is completely unsynchronized with
the currently running tracing and thus it can happen that the blk_trace
structure is being freed just while __blk_add_trace() works on it.
Protect accesses to q->blk_trace by RCU during tracing and make sure we
wait for the end of RCU grace period when shutting down tracing. Luckily
that is a rare enough event that we can afford that. Note that postponing
the freeing of blk_trace to an RCU callback is better avoided, as
it could have unexpected user-visible side effects: debugfs files
would still exist for a short while after block tracing has been shut
down.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711
CC: stable@vger.kernel.org
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Tristan Madani <tristmd@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c780e86d

22 Feb, 2020 6 commits

netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports · f66ee041

Authored by Jozsef Kadlecsik on Feb 11, 2020

In the case of huge hash:* types of sets, due to the single spinlock of
a set the processing of the whole set under spinlock protection could take
too long.

There were four places where the whole hash table of the set was processed
from bucket to bucket while holding the spinlock:

- During resizing a set, the original set was locked to exclude kernel side
  add/del element operations (userspace add/del is excluded by the
  nfnetlink mutex). The original set is actually just read during the
  resize, so the spinlocking is replaced with rcu locking of regions.
  However, this means there can be parallel kernel side add/del of entries.
  In order not to lose those operations a backlog is added and replayed
  after the successful resize.
- Garbage collection of timed out entries was also protected by the spinlock.
  In order not to lock too long, region locking is introduced and a single
  region is processed in one gc go. Also, the simple timer based gc running
  is replaced with a workqueue based solution. The internal book-keeping
  (number of elements, size of extensions) is moved to region level due to
  the region locking.
- Adding elements: when the max number of the elements is reached, the gc
  was called to evict the timed out entries. The new approach is that the gc
  is called just for the matching region, assuming that if the region
  (proportionally) seems to be full, then the whole set is. We could scan
  the other regions to check every entry under rcu locking, but for huge
  sets it'd mean a slowdown when adding elements.
- Listing the set header data: when the set was defined with timeout
  support, the garbage collector was called to clean up timed out entries
  to get the correct element numbers and set size values. Now the set is
  scanned to check non-timed out entries, without actually calling the gc
  for the whole set.

Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe ->
SOFTIRQ-unsafe lock order issues while working on the patch.

Reported-by: syzbot+4b0e9d4ff3cf117837e5@syzkaller.appspotmail.com
Reported-by: syzbot+c27b8d5010f45c666ed1@syzkaller.appspotmail.com
Reported-by: syzbot+68a806795ac89df3aa1c@syzkaller.appspotmail.com
Fixes: 23c42a40 ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>

f66ee041

include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for swap · 467d12f5

Authored by Christian Borntraeger on Feb 20, 2020

QEMU has a funny new build error message when I use the upstream kernel
headers:

      CC      block/file-posix.o
    In file included from /home/cborntra/REPOS/qemu/include/qemu/timer.h:4,
                     from /home/cborntra/REPOS/qemu/include/qemu/timed-average.h:29,
                     from /home/cborntra/REPOS/qemu/include/block/accounting.h:28,
                     from /home/cborntra/REPOS/qemu/include/block/block_int.h:27,
                     from /home/cborntra/REPOS/qemu/block/file-posix.c:30:
    /usr/include/linux/swab.h: In function `__swab':
    /home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:34: error: "sizeof" is not defined, evaluates to 0 [-Werror=undef]
       20 | #define BITS_PER_LONG           (sizeof (unsigned long) * BITS_PER_BYTE)
          |                                  ^~~~~~
    /home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:41: error: missing binary operator before token "("
       20 | #define BITS_PER_LONG           (sizeof (unsigned long) * BITS_PER_BYTE)
          |                                         ^
    cc1: all warnings being treated as errors
    make: *** [/home/cborntra/REPOS/qemu/rules.mak:69: block/file-posix.o] Error 1
    rm tests/qemu-iotests/socket_scm_helper.o

This was triggered by commit d5767057 ("uapi: rename ext2_swab() to
swab() and share globally in swab.h").  That patch is doing

  #include <asm/bitsperlong.h>

but it uses BITS_PER_LONG.

The kernel file asm/bitsperlong.h provides only __BITS_PER_LONG.

Let us use the __ variant in swab.h

Link: http://lkml.kernel.org/r/20200213142147.17604-1-borntraeger@de.ibm.com
Fixes: d5767057 ("uapi: rename ext2_swab() to swab() and share globally in swab.h")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Joe Perches <joe@perches.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Cc: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

467d12f5
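
The build error quoted in the commit above arises because a macro defined with sizeof cannot be evaluated inside #if, while a plain integer macro such as __BITS_PER_LONG can. A minimal userspace sketch of the idea follows; DEMO_BITS_PER_LONG and demo_swab() are hypothetical names, and deriving the width from ULONG_MAX is an assumption made so that the example does not depend on kernel headers.

```c
#include <limits.h>

/* Hypothetical stand-in for __BITS_PER_LONG: a plain integer constant
 * that the preprocessor can evaluate, derived here from <limits.h>. */
#if ULONG_MAX == 0xffffffffUL
#define DEMO_BITS_PER_LONG 32
#else
#define DEMO_BITS_PER_LONG 64
#endif

/* Byte-swap an unsigned long; the width is chosen at preprocessing
 * time, which would not work with a sizeof-based macro. */
static unsigned long demo_swab(unsigned long y)
{
#if DEMO_BITS_PER_LONG == 64
    const int nbytes = 8;
#else
    const int nbytes = 4;
#endif
    unsigned long r = 0;

    for (int i = 0; i < nbytes; i++) {
        r = (r << 8) | (y & 0xffUL);   /* move lowest byte to the top */
        y >>= 8;
    }
    return r;
}
```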

y2038: hide timeval/timespec/itimerval/itimerspec types · c766d147

Authored by Arnd Bergmann on Feb 20, 2020

There are no in-kernel users remaining, but there may still be users that
include linux/time.h instead of sys/time.h from user space, so leave the
types available to user space while hiding them from kernel space.

Only the __kernel_old_* versions of these types remain now.

Link: http://lkml.kernel.org/r/20200110154232.4104492-4-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c766d147

y2038: remove unused time32 interfaces · 412c53a6

Authored by Arnd Bergmann on Feb 20, 2020

No users remain, so kill these off before we grow new ones.

Link: http://lkml.kernel.org/r/20200110154232.4104492-3-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

412c53a6

y2038: remove ktime to/from timespec/timeval conversion · 595abbaf

Authored by Arnd Bergmann on Feb 20, 2020

A couple of helpers are now obsolete and can be removed, so drivers can no
longer start using them and instead use y2038-safe interfaces.

Link: http://lkml.kernel.org/r/20200110154232.4104492-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

595abbaf

ACPI: PM: s2idle: Check fixed wakeup events in acpi_s2idle_wake() · 63fb9623

Authored by Rafael J. Wysocki on Feb 21, 2020

Commit fdde0ff8 ("ACPI: PM: s2idle: Prevent spurious SCIs from
waking up the system") overlooked the fact that fixed events can wake
up the system too and broke RTC wakeup from suspend-to-idle as a
result.

Fix this issue by checking the fixed events in acpi_s2idle_wake() in
addition to checking wakeup GPEs, and break out of the suspend-to-idle
loop if the status bits of any enabled fixed events are set.

Fixes: fdde0ff8 ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system")
Reported-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

63fb9623

21 Feb, 2020 2 commits

genirq/irqdomain: Make sure all irq domain flags are distinct · 2546287c

Authored by Zenghui Yu on Feb 21, 2020

This was noticed when printing debugfs for MSIs on my ARM64 server.  The
new dstate IRQD_MSI_NOMASK_QUIRK came out surprisingly while it should only
be the x86 stuff for the time being...

The new MSI quirk flag uses the same bit as IRQ_DOMAIN_NAME_ALLOCATED which
is oddly defined as bit 6 for no good reason.

Switch it to the non used bit 1.

Fixes: 6f1a4891 ("x86/apic/msi: Plug non-maskable MSI affinity race")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200221020725.2038-1-yuzenghui@huawei.com

2546287c
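
The bug class fixed by the commit above, two flags silently sharing one bit, can be caught with a small check. The sketch below is illustrative only: the demo_domain_flags names and bit positions are hypothetical and do not reproduce the kernel's enum.

```c
/* Hypothetical flag set; names and bit positions are illustrative. */
enum demo_domain_flags {
    DEMO_FLAG_HIERARCHY      = 1u << 0,
    DEMO_FLAG_NAME_ALLOCATED = 1u << 1,
    DEMO_FLAG_IPI_PER_CPU    = 1u << 2,
    DEMO_FLAG_MSI_NOMASK     = 1u << 6,
};

/* Returns 1 when every entry has exactly one bit set and no two
 * entries share a bit, 0 otherwise. */
static int demo_flags_distinct(const unsigned int *flags, int n)
{
    unsigned int seen = 0;

    for (int i = 0; i < n; i++) {
        unsigned int f = flags[i];

        if (f == 0 || (f & (f - 1)) != 0)   /* not a single bit */
            return 0;
        if (seen & f)                       /* bit already taken */
            return 0;
        seen |= f;
    }
    return 1;
}
```

A check of this kind is cheap enough to run once at start-up or in a unit test whenever a new flag is added.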

bootconfig: Add bootconfig magic word for indicating bootconfig explicitly · 85c46b78

Authored by Masami Hiramatsu on Feb 20, 2020

Add bootconfig magic word to the end of bootconfig on initrd
image for indicating explicitly the bootconfig is there.
Also tools/bootconfig treats wrong size or wrong checksum or
parse error as an error, because if there is a bootconfig magic
word, there must be a bootconfig.

The bootconfig magic word is "#BOOTCONFIG\n", a 12-byte word.
Thus the block image of the initrd file with bootconfig is
as follows.

[Initrd][bootconfig][size][csum][#BOOTCONFIG\n]

Link: http://lkml.kernel.org/r/158220112263.26565.3944814205960612841.stgit@devnote2
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

85c46b78
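
The [Initrd][bootconfig][size][csum][#BOOTCONFIG\n] layout from the commit above can be parsed from the end of the image backwards. The following is a sketch under stated assumptions, not the kernel or tools/bootconfig code: it assumes that size and csum are 32-bit values in host byte order and that the checksum is a plain sum of the bootconfig bytes. All demo_ names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC     "#BOOTCONFIG\n"
#define DEMO_MAGIC_LEN 12

/* Assumed checksum: plain sum of all bytes. */
static uint32_t demo_checksum(const unsigned char *p, size_t len)
{
    uint32_t sum = 0;

    while (len--)
        sum += *p++;
    return sum;
}

/* Returns a pointer to the bootconfig data and stores its length in
 * *out_len, or NULL when the magic word is absent or the size or
 * checksum does not match. */
static const unsigned char *demo_find_bootconfig(const unsigned char *img,
                                                 size_t img_len,
                                                 size_t *out_len)
{
    const size_t footer = 2 * sizeof(uint32_t) + DEMO_MAGIC_LEN;
    const unsigned char *tail, *data;
    uint32_t size, csum;

    if (img_len < footer)
        return NULL;
    tail = img + img_len - DEMO_MAGIC_LEN;
    if (memcmp(tail, DEMO_MAGIC, DEMO_MAGIC_LEN) != 0)
        return NULL;                      /* no magic, no bootconfig */

    memcpy(&size, tail - 2 * sizeof(uint32_t), sizeof(size));
    memcpy(&csum, tail - sizeof(uint32_t), sizeof(csum));
    if (size > img_len - footer)
        return NULL;                      /* wrong size */

    data = tail - 2 * sizeof(uint32_t) - size;
    if (demo_checksum(data, size) != csum)
        return NULL;                      /* wrong checksum */

    *out_len = size;
    return data;
}
```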

19 Feb, 2020 1 commit

net/mlx5: DR, Handle reformat capability over sw-steering tables · 13a7e459

Authored by Erez Shitrit on Jan 14, 2020

On flow table creation, send the relevant flags according to what the FW
currently supports.
When FW doesn't support reformat option over SW-steering managed table,
the driver shouldn't pass this.

Fixes: 988fd6b3 ("net/mlx5: DR, Pass table flags at creation to lower layer")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

13a7e459

18 Feb, 2020 2 commits

bpf, uapi: Remove text about bpf_redirect_map() giving higher performance · f25975f4

Authored by Toke Høiland-Jørgensen on Feb 18, 2020

The performance of bpf_redirect() is now roughly the same as that of
bpf_redirect_map(). However, David Ahern pointed out that the header file
has not been updated to reflect this, and still says that a significant
performance increase is possible when using bpf_redirect_map(). Remove this
text from the bpf_redirect_map() description, and reword the description in
bpf_redirect() slightly. Also fix the 'Return' section of the
bpf_redirect_map() documentation.

Fixes: 1d233886 ("xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200218130334.29889-1-toke@redhat.com

f25975f4

net: sched: correct flower port blocking · 8a9093c7

Authored by Jason Baron on Feb 17, 2020

tc flower rules that are based on src or dst port blocking are sometimes
ineffective due to uninitialized stack data. __skb_flow_dissect() extracts
ports from the skb for tc flower to match against. However, the port
dissection is not done when the FLOW_DIS_IS_FRAGMENT bit is set in
key_control->flags. All callers of __skb_flow_dissect() zero out the
key_control field except for fl_classify() as used by the flower
classifier. Thus, the FLOW_DIS_IS_FRAGMENT bit may be set on entry to
__skb_flow_dissect(), since key_control is allocated on the stack
and may not be initialized.

Since key_basic and key_control are present for all flow keys, let's
make sure they are initialized.

Fixes: 62230715 ("flow_dissector: do not dissect l4 ports for fragments")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8a9093c7
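
The failure mode in the commit above is a flag read from an on-stack structure that was never initialized. A reduced sketch of the remedy, zeroing the key before it is dissected, is shown below. The demo_ types and functions are hypothetical stand-ins, not the flow dissector API.

```c
#include <string.h>

#define DEMO_IS_FRAGMENT 0x1u   /* stand-in for FLOW_DIS_IS_FRAGMENT */

struct demo_key_control { unsigned int flags; };

struct demo_flow_keys {
    struct demo_key_control control;
    unsigned short src_port, dst_port;
};

/* Dissector stand-in: it trusts control.flags on entry and skips port
 * extraction when the fragment bit is already set. */
static void demo_dissect(struct demo_flow_keys *keys,
                         unsigned short src, unsigned short dst)
{
    if (keys->control.flags & DEMO_IS_FRAGMENT)
        return;
    keys->src_port = src;
    keys->dst_port = dst;
}

/* Caller that zeroes the on-stack key first, so stale stack contents
 * can never be mistaken for the fragment flag. */
static unsigned short demo_classify(unsigned short src, unsigned short dst)
{
    struct demo_flow_keys keys;

    memset(&keys, 0, sizeof(keys));
    demo_dissect(&keys, src, dst);
    return keys.dst_port;
}
```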

17 Feb, 2020 5 commits

KVM: x86: fix missing prototypes · d970a325

Authored by Paolo Bonzini on Feb 13, 2020

Reported with "make W=1" due to -Wmissing-prototypes.
Reported-by: Qian Cai <cai@lca.pw>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d970a325

netfilter: conntrack: allow insertion of clashing entries · 6a757c07

Authored by Florian Westphal on Feb 03, 2020

This patch further relaxes the need to drop an skb due to a clash with
an existing conntrack entry.

Current clash resolution handles the case where the clash occurs between
two identical entries (distinct nf_conn objects with same tuples), i.e.:

          Original                        Reply
existing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.6:5353
clashing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.6:5353

... existing handling will discard the unconfirmed clashing entry and
make skb->_nfct point to the existing one.  The skb can then be
processed normally just as if the clash had not existed in the
first place.

For other clashes, the skb needs to be dropped.
This frequently happens with DNS resolvers that send A and AAAA queries
back-to-back when NAT rules are present that cause packets to get
different DNAT transformations applied, for example:

-m statistics --mode random ... -j DNAT --dnat-to 10.0.0.6:5353
-m statistics --mode random ... -j DNAT --dnat-to 10.0.0.7:5353

In this case the A or AAAA query is dropped which incurs a costly
delay during name resolution.

This patch also allows this collision type:
          Original                        Reply
existing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.6:5353
clashing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.7:5353

In this case, clash is in original direction -- the reply direction
is still unique.

The change makes it so that when the 2nd colliding packet is received,
the clashing conntrack is tagged with new IPS_NAT_CLASH_BIT, gets a fixed
1 second timeout and is inserted in the reply direction only.

The entry is hidden from 'conntrack -L', it will time out quickly
and it can be early dropped because it will never progress to the
ASSURED state.

To avoid special-casing the delete code path for
the ORIGINAL hlist_nulls node, a new helper, "hlist_nulls_add_fake", is
added so hlist_nulls_del() will work.

Example:

      CPU A:                               CPU B:
1.  10.2.3.4:42 -> 10.8.8.8:53 (A)
2.                                         10.2.3.4:42 -> 10.8.8.8:53 (AAAA)
3.  Apply DNAT, reply changed to 10.0.0.6
4.                                         10.2.3.4:42 -> 10.8.8.8:53 (AAAA)
5.                                         Apply DNAT, reply changed to 10.0.0.7
6. confirm/commit to conntrack table, no collisions
7.                                         commit clashing entry

Reply comes in:

10.2.3.4:42 <- 10.0.0.6:5353 (A)
 -> Finds a conntrack, DNAT is reversed & packet forwarded to 10.2.3.4:42
10.2.3.4:42 <- 10.0.0.7:5353 (AAAA)
 -> Finds a conntrack, DNAT is reversed & packet forwarded to 10.2.3.4:42
    The conntrack entry is deleted from table, as it has the NAT_CLASH
    bit set.

In case of a retransmit from ORIGINAL dir, all further packets will get
the DNAT transformation to 10.0.0.6.

I tried to come up with other solutions but they all have worse
problems.

Alternatives considered were:
1.  Confirm ct entries at allocation time, not in postrouting.
 a. will cause unnecessary work when the skb that creates the
    conntrack is dropped by ruleset.
 b. in case nat is applied, ct entry would need to be moved in
    the table, which requires another spinlock pair to be taken.
 c. breaks the 'unconfirmed entry is private to cpu' assumption:
    we would need to guard all nfct->ext allocation requests with
    ct->lock spinlock.

2. Make the unconfirmed list a hash table instead of a pcpu list.
   Shares drawback c) of the first alternative.

3. Document this is expected and force users to rearrange their
   ruleset (e.g. by using "-m cluster" instead of "-m statistics").
   nft has the 'jhash' expression which can be used instead of 'numgen'.

   Major drawback: doesn't fix what I consider a bug, not very realistic
   and I believe it's reasonable to have existing rulesets 'just
   work'.

4. Document this is expected and force users to steer problematic
   packets to the same CPU -- this would serialize the "allocate new
   conntrack entry/nat table evaluation/perform nat/confirm entry", so
   no race can occur.  Similar drawback to 3.

Another advantage of this patch compared to 1) and 2) is that there are
no changes to the hot path; things are handled in the udp tracker and
the clash resolution path.

Cc: rcu@vger.kernel.org
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

6a757c07

skbuff.h: fix all kernel-doc warnings · d2f273f0

Authored by Randy Dunlap on Feb 15, 2020

Fix all kernel-doc warnings in <linux/skbuff.h>.
Fixes these warnings:

../include/linux/skbuff.h:890: warning: Function parameter or member 'list' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'dev_scratch' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'ip_defrag_offset' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'skb_mstamp_ns' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member '__cloned_offset' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'head_frag' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member '__pkt_type_offset' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'encapsulation' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'encap_hdr_csum' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'csum_valid' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member '__pkt_vlan_present_offset' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'vlan_present' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'csum_complete_sw' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'csum_level' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'inner_protocol_type' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'remcsum_offload' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'sender_cpu' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'reserved_tailroom' not described in 'sk_buff'
../include/linux/skbuff.h:890: warning: Function parameter or member 'inner_ipproto' not described in 'sk_buff'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d2f273f0

net/sock.h: fix all kernel-doc warnings · 66256e0b

Authored by Randy Dunlap on Feb 15, 2020

Fix all kernel-doc warnings for <net/sock.h>.
Fixes these warnings:

../include/net/sock.h:232: warning: Function parameter or member 'skc_addrpair' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_portpair' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_ipv6only' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_net_refcnt' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_v6_daddr' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_v6_rcv_saddr' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_cookie' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_listener' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_tw_dr' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_rcv_wnd' not described in 'sock_common'
../include/net/sock.h:232: warning: Function parameter or member 'skc_tw_rcv_nxt' not described in 'sock_common'

../include/net/sock.h:498: warning: Function parameter or member 'sk_rx_skb_cache' not described in 'sock'
../include/net/sock.h:498: warning: Function parameter or member 'sk_wq_raw' not described in 'sock'
../include/net/sock.h:498: warning: Function parameter or member 'tcp_rtx_queue' not described in 'sock'
../include/net/sock.h:498: warning: Function parameter or member 'sk_tx_skb_cache' not described in 'sock'
../include/net/sock.h:498: warning: Function parameter or member 'sk_route_forced_caps' not described in 'sock'
../include/net/sock.h:498: warning: Function parameter or member 'sk_txtime_report_errors' not described in 'sock'
../include/net/sock.h:498: warning: Function parameter or member 'sk_validate_xmit_skb' not described in 'sock'
../include/net/sock.h:498: warning: Function parameter or member 'sk_bpf_storage' not described in 'sock'

../include/net/sock.h:2024: warning: No description found for return value of 'sk_wmem_alloc_get'
../include/net/sock.h:2035: warning: No description found for return value of 'sk_rmem_alloc_get'
../include/net/sock.h:2046: warning: No description found for return value of 'sk_has_allocations'
../include/net/sock.h:2082: warning: No description found for return value of 'skwq_has_sleeper'
../include/net/sock.h:2244: warning: No description found for return value of 'sk_page_frag'
../include/net/sock.h:2444: warning: Function parameter or member 'tcp_rx_skb_cache_key' not described in 'DECLARE_STATIC_KEY_FALSE'
../include/net/sock.h:2444: warning: Excess function parameter 'sk' description in 'DECLARE_STATIC_KEY_FALSE'
../include/net/sock.h:2444: warning: Excess function parameter 'skb' description in 'DECLARE_STATIC_KEY_FALSE'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

66256e0b

net: export netdev_next_lower_dev_rcu() · 7151affe

Authored by Taehee Yoo on Feb 15, 2020

netdev_next_lower_dev_rcu() will be used to implement a function
that walks all lower interfaces.
There are already functions that walk the lower interfaces
(netdev_walk_all_lower_dev_rcu() and netdev_walk_all_lower_dev()),
but there may be cases that those functions cannot cover,
so some modules need to implement their own function
to walk all lower interfaces.

In the next patch, netdev_next_lower_dev_rcu() will be used.
In addition, this patch removes two unused prototypes in netdevice.h.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7151affe

15 Feb, 2020 2 commits

scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session" · 807b9515

Authored by Bart Van Assche on Feb 12, 2020

Since commit e9d3009c introduced a regression and since the fix for
that regression was not perfect, revert this commit.

Link: https://marc.info/?l=target-devel&m=158157054906195
Cc: Rahul Kundu <rahul.kundu@chelsio.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
Fixes: e9d3009c ("scsi: target: iscsi: Wait for all commands to finish before freeing a session")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

807b9515

ASoC: dapm: remove snd_soc_dapm_put_enum_double_locked · 8f486296

Authored by Tzung-Bi Shih on Feb 14, 2020

Reverts commit 839284e7 ("ASoC: dapm: add
snd_soc_dapm_put_enum_double_locked").
Signed-off-by: Tzung-Bi Shih <tzungbi@google.com>
Link: https://lore.kernel.org/r/20200214105744.82258-3-tzungbi@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>

8f486296

14 Feb, 2020 5 commits

netdevice.h: fix all kernel-doc and Sphinx warnings · a1fa83bd

Authored by Randy Dunlap on Feb 12, 2020

Eliminate all kernel-doc and Sphinx warnings in
<linux/netdevice.h>.  Fixes these warnings:

../include/linux/netdevice.h:2100: warning: Function parameter or member 'gso_partial_features' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'l3mdev_ops' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xfrmdev_ops' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'tlsdev_ops' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'name_assign_type' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'ieee802154_ptr' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'mpls_ptr' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xdp_prog' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'gro_flush_timeout' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xdp_bulkq' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xps_cpus_map' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xps_rxqs_map' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'qdisc_hash' not described in 'net_device'
../include/linux/netdevice.h:3552: WARNING: Inline emphasis start-string without end-string.
../include/linux/netdevice.h:3552: WARNING: Inline emphasis start-string without end-string.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a1fa83bd

ALSA: rawmidi: Avoid bit fields for state flags · dfa9a5ef

Authored by Takashi Iwai on Feb 14, 2020

The rawmidi state flags (opened, append, active_sensing) are stored in
bit fields that can be potentially racy when concurrently accessed
without any locks. Although the current code should be fine, there is
also no real benefit in keeping bit fields for such a small number
of members.

This patch changes those bit-field flags to simple bool fields.
There should be no size increase of the snd_rawmidi_substream by this
change.

Reported-by: syzbot+576cc007eb9f2c968200@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20200214111316.26939-4-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>

dfa9a5ef

ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro · 1dade3a7

Authored by Mika Westerberg on Feb 12, 2020

Sometimes it is useful to find the access_width field value in bytes and
not in bits so add a helper that can be used for this purpose.
Suggested-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

1dade3a7
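
The bits-versus-bytes conversion mentioned in the commit above can be sketched as a pair of macros. This is an illustration, not the ACPICA definition: it assumes the ACPI Generic Address Structure encoding of the access size field (1 = byte, 2 = word, 3 = dword, 4 = qword, 0 = undefined), and the DEMO_ macros and demo_access_bytes() are hypothetical names.

```c
/* Assumed encoding: 1 = byte, 2 = word, 3 = dword, 4 = qword.
 * Valid only for sizes 1..4; 0 means "undefined". */
#define DEMO_ACCESS_BIT_WIDTH(size)   (1 << ((size) + 2))
#define DEMO_ACCESS_BYTE_WIDTH(size)  (1 << ((size) - 1))

/* Range-checked wrapper: returns the access width in bytes, or 0 when
 * the encoded size is outside 1..4. */
static int demo_access_bytes(unsigned int access_size)
{
    if (access_size < 1 || access_size > 4)
        return 0;
    return DEMO_ACCESS_BYTE_WIDTH(access_size);
}
```

The wrapper keeps the undefined value 0 away from the byte-width macro, where it would produce a negative shift count.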

icmp: introduce helper for nat'd source address in network device context · 0b41713b

Authored by Jason A. Donenfeld on Feb 11, 2020

This introduces a helper function to be called only by network drivers
that wraps calls to icmp[v6]_send in a conntrack transformation, in case
NAT has been used. We don't want to pollute the non-driver path, though,
so we introduce this as a helper to be called by places that actually
make use of this, as suggested by Florian.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

0b41713b

net/flow_dissector: remove unexist field description · 6ee2deb6

Authored by Hangbin Liu on Feb 11, 2020

@thoff has moved to struct flow_dissector_key_control.

Fixes: 42aecaa9 ("net: Get skb hash over flow_keys structure")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6ee2deb6

13 Feb, 2020 2 commits

linux/pipe_fs_i.h: fix kernel-doc warnings after @wait was split · 0bf999f9

Authored by Randy Dunlap on Feb 09, 2020

Fix kernel-doc warnings in struct pipe_inode_info after @wait was
split into @rd_wait and @wr_wait.

include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'rd_wait' not described in 'pipe_inode_info'
include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'wr_wait' not described in 'pipe_inode_info'

Fixes: 0ddad21d ("pipe: use exclusive waits when reading or writing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0bf999f9

NFSv4: Fix revalidation of dentries with delegations · efeda80d

Authored by Trond Myklebust on Feb 05, 2020

If a dentry was not initially looked up while we were holding a
delegation, then we do still need to revalidate that it still holds
the same name. If there are multiple hard links to the same file,
then all the hard links need validation.
Reported-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
[Anna: Put nfs_unset_verifier_delegated() under CONFIG_NFS_V4]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

efeda80d

12 2月, 2020 2 次提交

HID: core: increase HID report buffer size to 8KiB · 84a40626

Authored by Johan Korsnes on Jan 17, 2020

We have a HID touch device that reports its opens and shorts test
results in HID buffers of size 8184 bytes. The maximum size of the HID
buffer is currently set to 4096 bytes, causing probe of this device to
fail. With this patch we increase the maximum size of the HID buffer to
8192 bytes, making device probe and acquisition of said buffers succeed.

Signed-off-by: Johan Korsnes <jkorsnes@cisco.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Armando Visconti <armando.visconti@st.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

84a40626
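
The arithmetic behind this change can be sketched as a simple bounds check. The macro and function names below are hypothetical, and the `<=` comparison is an assumption about how such a limit is applied; only the three sizes come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Limits quoted in the commit message; the macro names are illustrative. */
#define SKETCH_OLD_MAX_BUFFER 4096u
#define SKETCH_NEW_MAX_BUFFER 8192u

/* Return true when a report of report_len bytes fits under the limit. */
bool report_fits(size_t report_len, size_t max_buffer)
{
	return report_len <= max_buffer;
}
```

With these definitions, `report_fits(8184, SKETCH_OLD_MAX_BUFFER)` is false and `report_fits(8184, SKETCH_NEW_MAX_BUFFER)` is true, which matches the probe failure and the fix described above.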

ACPICA: Introduce acpi_any_gpe_status_set() · ea128834

Authored by Rafael J. Wysocki on Feb 11, 2020

Introduce a new helper function, acpi_any_gpe_status_set(), for
checking the status bits of all enabled GPEs in one go.

It is needed to distinguish spurious SCIs from genuine ones when
deciding whether or not to wake up the system from suspend-to-idle.

Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ea128834

11 Feb, 2020 5 commits

ACPI: PM: s2idle: Avoid possible race related to the EC GPE · e3728b50

Authored by Rafael J. Wysocki on Feb 11, 2020

It is theoretically possible for the ACPI EC GPE to be set after the
s2idle_ops->wake() called from s2idle_loop() has returned and before
the subsequent pm_wakeup_pending() check is carried out.  If that
happens, the resulting wakeup event will cause the system to resume
even though it may be a spurious one.

To avoid that race, first make the ->wake() callback in struct
platform_s2idle_ops return a bool value indicating whether or not
to let the system resume and rearrange s2idle_loop() to use that
value instead of the direct pm_wakeup_pending() call if ->wake() is
present.

Next, rework acpi_s2idle_wake() to process EC events and check
pm_wakeup_pending() before re-arming the SCI for system wakeup
to prevent it from triggering prematurely and add comments to
that function to explain the rationale for the new code flow.

Fixes: 56b99184 ("PM: sleep: Simplify suspend-to-idle control flow")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

e3728b50
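
The first step of this fix, letting ->wake() decide whether to resume and falling back to the pending-wakeup check only when no callback is present, can be sketched in userspace C. All names below are hypothetical stand-ins; this is not the kernel's s2idle_loop().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the platform ops; only ->wake() is modeled. */
struct s2idle_ops_sketch {
	bool (*wake)(void);
};

/* Sample callbacks used to exercise the decision below. */
bool wake_says_spurious(void) { return false; }
bool wakeup_is_pending(void) { return true; }

/*
 * Decide whether to leave the idle loop. When a ->wake() callback
 * exists, its return value is used in place of the direct
 * pending-wakeup check; otherwise the check is done as before.
 */
bool should_resume(const struct s2idle_ops_sketch *ops,
		   bool (*wakeup_pending)(void))
{
	if (ops && ops->wake)
		return ops->wake();
	return wakeup_pending();
}
```

Routing the decision through a single callback means the platform code can process its own events and check for pending wakeups in one place, which is the ordering the second half of the commit message relies on.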

tracing: Consolidate trace() functions · 7276531d

Authored by Tom Zanussi on Feb 10, 2020

Move the checking, buffer reserve and buffer commit code in
synth_event_trace_start/end() into inline functions
__synth_event_trace_start/end() so they can also be used by
synth_event_trace() and synth_event_trace_array(), and then have all
those functions use them.

Also, change synth_event_trace_state.enabled to disabled so it only
needs to be set if the event is disabled, which is not normally the
case.

Link: http://lkml.kernel.org/r/b1f3108d0f450e58192955a300e31d0405ab4149.1581374549.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

7276531d

serdev: ttyport: restore client ops on deregistration · 0c5aae59

Authored by Johan Hovold on Feb 10, 2020

The serdev tty-port controller driver should reset the tty-port client
operations also on deregistration to avoid a NULL-pointer dereference in
case the port is later re-registered as a normal tty device.

Note that this can only happen with tty drivers such as 8250 which have
statically allocated port structures that can end up being reused and
where a later registration would not register a serdev controller (e.g.
due to registration errors or if the devicetree has been changed in
between).

Specifically, this can be an issue for any statically defined ports that
would be registered by 8250 core when an 8250 driver is being unbound.

Fixes: bed35c6d ("serdev: add a tty port controller driver")
Cc: stable <stable@vger.kernel.org> # 4.11
Reported-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20200210145730.22762-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0c5aae59

USB: core: add endpoint-blacklist quirk · 73f8bda9

Authored by Johan Hovold on Feb 03, 2020

Add a new device quirk that can be used to blacklist endpoints.

Since commit 3e4f8e21 ("USB: core: fix check for duplicate
endpoints") USB core ignores any duplicate endpoints found during
descriptor parsing.

In order to handle devices where the first interfaces with duplicate
endpoints are the ones that should have their endpoints ignored, we need
to add a blacklist.

Tested-by: edes <edes@gmx.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20200203153830.26394-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

73f8bda9
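
An endpoint blacklist of the kind described in this entry can be sketched as a small lookup table keyed by device, interface number and endpoint address. The struct, the function name and the IDs below are all made up for illustration; the kernel's own quirk table and helper may be organized differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one endpoint to ignore on one interface. */
struct ep_blacklist_entry {
	uint16_t vendor;
	uint16_t product;
	uint8_t interface;
	uint8_t ep_addr;
};

/* Made-up IDs for illustration only. */
static const struct ep_blacklist_entry ep_blacklist[] = {
	{ 0x1234, 0x5678, 0, 0x81 },
	{ 0x1234, 0x5678, 0, 0x02 },
};

/* Return true when descriptor parsing should skip this endpoint. */
bool endpoint_is_blacklisted_sketch(uint16_t vendor, uint16_t product,
				    uint8_t interface, uint8_t ep_addr)
{
	for (size_t i = 0; i < sizeof(ep_blacklist) / sizeof(ep_blacklist[0]); i++) {
		const struct ep_blacklist_entry *e = &ep_blacklist[i];

		if (e->vendor == vendor && e->product == product &&
		    e->interface == interface && e->ep_addr == ep_addr)
			return true;
	}
	return false;
}
```

Because the entries name the interface explicitly, the same endpoint address on a later interface is left alone, which is the case the duplicate-endpoint check by itself cannot express.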

usb: charger: assign specific number for enum value · ca4b43c1

Authored by Peter Chen on Feb 01, 2020

To work properly on every architecture and compiler, the enum values
need to be assigned specific numbers.

Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Link: https://lore.kernel.org/r/1580537624-10179-1-git-send-email-peter.chen@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ca4b43c1
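
A minimal sketch of what assigning specific numbers looks like, using hypothetical enumerator names rather than the ones in the patch:

```c
/* Hypothetical charger types; names and values are illustrative. */
enum charger_type_sketch {
	CHG_UNKNOWN = 0,
	CHG_SDP = 1,
	CHG_DCP = 2,
	CHG_CDP = 3,
};

/* Compile-time checks that the numbering is what the header states. */
_Static_assert(CHG_UNKNOWN == 0, "CHG_UNKNOWN must stay 0");
_Static_assert(CHG_CDP == 3, "CHG_CDP must stay 3");
```

Writing each value out makes the numbering visible in the header and keeps it from shifting if an entry is later inserted or reordered.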

10 Feb, 2020 1 commit

iommu/vt-d: Fix compile warning from intel-svm.h · e7598fac

Authored by Joerg Roedel on Feb 10, 2020

The intel_svm_is_pasid_valid() needs to be marked inline, otherwise it
causes the compile warning below:

  CC [M]  drivers/dma/idxd/cdev.o
In file included from drivers/dma/idxd/cdev.c:9:0:
./include/linux/intel-svm.h:125:12: warning: ‘intel_svm_is_pasid_valid’ defined but not used [-Wunused-function]
 static int intel_svm_is_pasid_valid(struct device *dev, int pasid)
            ^~~~~~~~~~~~~~~~~~~~~~~~
Reported-by: Borislav Petkov <bp@alien8.de>
Fixes: 15060aba ('iommu/vt-d: Helper function to query if a pasid has any active users')
Signed-off-by: Joerg Roedel <jroedel@suse.de>

e7598fac
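
The pattern behind this fix can be sketched as a header-style fallback stub. The struct and function names are hypothetical, and the `-EINVAL` return value is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>

struct device_sketch; /* opaque, hypothetical device type */

/*
 * Fallback for builds where the feature is compiled out. A plain
 * "static" function defined in a header is emitted into every file
 * that includes it, so files that never call it trigger
 * -Wunused-function. "static inline" does not trigger that warning.
 */
static inline int sketch_is_pasid_valid(struct device_sketch *dev, int pasid)
{
	(void)dev;
	(void)pasid;
	return -EINVAL; /* assumed error code for "not supported" */
}
```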

09 Feb, 2020 1 commit

pipe: use exclusive waits when reading or writing · 0ddad21d

Authored by Linus Torvalds on Dec 09, 2019

This makes the pipe code use separate wait-queues and exclusive waiting
for readers and writers, avoiding a nasty thundering herd problem when
there are lots of readers waiting for data on a pipe (or, less commonly,
lots of writers waiting for a pipe to have space).

While this isn't a common occurrence in the traditional "use a pipe as a
data transport" case, where you typically only have a single reader and
a single writer process, there is one common special case: using a pipe
as a source of "locking tokens" rather than for data communication.

In particular, the GNU make jobserver code ends up using a pipe as a way
to limit parallelism, where each job consumes a token by reading a byte
from the jobserver pipe, and releases the token by writing a byte back
to the pipe.

This pattern is fairly traditional on Unix, and works very well, but
will waste a lot of time waking up a lot of processes when only a single
reader needs to be woken up when a writer releases a new token.

A simplified test-case of just this pipe interaction is to create 64
processes, and then pass a single token around between them (this
test-case also intentionally passes another token that gets ignored to
test the "wake up next" logic too, in case anybody wonders about it):

    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int fd[2], counters[2];

        pipe(fd);
        counters[0] = 0;
        counters[1] = -1;
        write(fd[1], counters, sizeof(counters));

        /* 64 processes */
        fork(); fork(); fork(); fork(); fork(); fork();

        do {
                int i;
                read(fd[0], &i, sizeof(i));
                if (i < 0)
                        continue;
                counters[0] = i+1;
                write(fd[1], counters, (1+(i & 1)) *sizeof(int));
        } while (counters[0] < 1000000);
        return 0;
    }

and in a perfect world, passing that token around should only cause one
context switch per transfer, when the writer of a token causes a
directed wakeup of just a single reader.

But with the "writer wakes all readers" model we traditionally had, on
my test box the above case causes more than an order of magnitude more
scheduling: instead of the expected ~1M context switches, "perf stat"
shows

        231,852.37 msec task-clock                #   15.857 CPUs utilized
        11,250,961      context-switches          #    0.049 M/sec
           616,304      cpu-migrations            #    0.003 M/sec
             1,648      page-faults               #    0.007 K/sec
 1,097,903,998,514      cycles                    #    4.735 GHz
   120,781,778,352      instructions              #    0.11  insn per cycle
    27,997,056,043      branches                  #  120.754 M/sec
       283,581,233      branch-misses             #    1.01% of all branches

      14.621273891 seconds time elapsed

       0.018243000 seconds user
       3.611468000 seconds sys

before this commit.

After this commit, I get

          5,229.55 msec task-clock                #    3.072 CPUs utilized
         1,212,233      context-switches          #    0.232 M/sec
           103,951      cpu-migrations            #    0.020 M/sec
             1,328      page-faults               #    0.254 K/sec
    21,307,456,166      cycles                    #    4.074 GHz
    12,947,819,999      instructions              #    0.61  insn per cycle
     2,881,985,678      branches                  #  551.096 M/sec
        64,267,015      branch-misses             #    2.23% of all branches

       1.702148350 seconds time elapsed

       0.004868000 seconds user
       0.110786000 seconds sys

instead. Much better.

[ Note! This kernel improvement seems to be very good at triggering a
  race condition in the make jobserver (in GNU make 4.2.1) for me. It's
  a long known bug that was fixed back in June 2017 by GNU make commit
  b552b0525198 ("[SV 51159] Use a non-blocking read with pselect to
  avoid hangs.").

  But there wasn't a new release of GNU make until 4.3 on Jan 19 2020,
  so a number of distributions may still have the buggy version. Some
  have backported the fix to their 4.2.1 release, though, and even
  without the fix it's quite timing-dependent whether the bug actually
  is hit. ]

Josh Triplett says:
 "I've been hammering on your pipe fix patch (switching to exclusive
  wait queues) for a month or so, on several different systems, and I've
  run into no issues with it. The patch *substantially* improves
  parallel build times on large (~100 CPU) systems, both with parallel
  make and with other things that use make's pipe-based jobserver.

  All current distributions (including stable and long-term stable
  distributions) have versions of GNU make that no longer have the
  jobserver bug"

Tested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0ddad21d
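
The "locking tokens" use of a pipe described in this entry can be sketched with three small helpers. The function names are hypothetical and this is not GNU make's implementation; it only shows the read-a-byte / write-a-byte protocol, and it assumes the token count stays well below the pipe's capacity.

```c
#include <unistd.h>

/* Create the pipe and preload it with ntokens one-byte tokens. */
int jobserver_init(int fd[2], int ntokens)
{
	if (pipe(fd) < 0)
		return -1;
	for (int i = 0; i < ntokens; i++) {
		if (write(fd[1], "+", 1) != 1)
			return -1;
	}
	return 0;
}

/* Take a token: read() blocks until some other job releases one. */
int token_acquire(int rfd)
{
	char c;

	return read(rfd, &c, 1) == 1 ? 0 : -1;
}

/* Give the token back by writing a single byte to the pipe. */
int token_release(int wfd)
{
	return write(wfd, "+", 1) == 1 ? 0 : -1;
}
```

Each release makes exactly one byte available, so only one blocked reader can make progress. That is why waking every waiting reader on each write, as the old code did, is wasted work for this pattern.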

08 Feb, 2020 4 commits

irqchip/gic-v4.1: Set vpe_l1_base for all redistributors · 8b718d40

Authored by Zenghui Yu on Feb 06, 2020

Currently, we will not set vpe_l1_page for the current RD if we can
inherit the vPE configuration table from another RD (or ITS), which
results in an inconsistency between RDs within the same CommonLPIAff
group.

Let's rename it to vpe_l1_base to indicate the base address of the
vPE configuration table of this RD, and set it properly for *all*
v4.1 redistributors.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-3-yuzenghui@huawei.com

8b718d40

prefix-handling analogues of errorf() and friends · a3ff937b

Authored by Al Viro on Dec 21, 2019

called errorfc/infofc/warnfc/invalfc

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a3ff937b

turn fs_param_is_... into functions · 328de528

Authored by Al Viro on Dec 18, 2019

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

328de528

fs_parse: handle optional arguments sanely · 48ce73b1

Authored by Al Viro on Dec 17, 2019

Don't bother with "mixed" options that would allow both the
form with and without argument (i.e. both -o foo and -o foo=bar).
Rather than trying to shove both into a single fs_parameter_spec,
allow having with-argument and no-argument specs with the same
name and teach fs_parse to handle that.

There are very few options of that sort, and they are actually
easier to handle that way - callers end up with less postprocessing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

48ce73b1
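
The idea of two specs sharing one name, one for the bare form and one for the form with an argument, can be sketched with a small lookup. The struct, the option codes and the function below are hypothetical and much simpler than the kernel's fs_parameter_spec and fs_parse().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { OPT_UNKNOWN = -1, OPT_FOO_FLAG, OPT_FOO_VALUE };

/* Hypothetical spec: a name, whether it takes an argument, and a code. */
struct param_spec_sketch {
	const char *name;
	bool takes_arg;
	int opt;
};

/* Same name twice: "-o foo" and "-o foo=bar" map to different codes. */
static const struct param_spec_sketch specs[] = {
	{ "foo", false, OPT_FOO_FLAG },
	{ "foo", true,  OPT_FOO_VALUE },
};

/* Pick the spec whose name and argument form both match. */
int lookup_param(const char *name, bool has_value)
{
	for (size_t i = 0; i < sizeof(specs) / sizeof(specs[0]); i++) {
		if (strcmp(specs[i].name, name) == 0 &&
		    specs[i].takes_arg == has_value)
			return specs[i].opt;
	}
	return OPT_UNKNOWN;
}
```

The caller gets a distinct code for each form, so it does not have to inspect the parsed result afterwards to find out whether an argument was supplied.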

openeuler / Kernel synced successfully about 1 year ago
