1. 11 Sep 2019, 10 commits
  2. 10 Sep 2019, 2 commits
  3. 08 Sep 2019, 1 commit
    • nfp: flower: cmsg rtnl locks can timeout reify messages · 28abe579
      Committed by Fred Lotter
      Flower control message replies are handled in different locations. The truly
      high priority replies are handled in the BH (tasklet) context, while the
      remaining replies are handled in a predefined Linux work queue. The work
      queue handler orders replies into high and low priority groups, and always
      starts servicing the high priority replies within the received batch first.
      
      Reply Type:			Rtnl Lock:	Handler:
      
      CMSG_TYPE_PORT_MOD		no		BH tasklet (mtu)
      CMSG_TYPE_TUN_NEIGH		no		BH tasklet
      CMSG_TYPE_FLOW_STATS		no		BH tasklet
      CMSG_TYPE_PORT_REIFY		no		WQ high
      CMSG_TYPE_PORT_MOD		yes		WQ high (link/mtu)
      CMSG_TYPE_MERGE_HINT		yes		WQ low
      CMSG_TYPE_NO_NEIGH		no		WQ low
      CMSG_TYPE_ACTIVE_TUNS		no		WQ low
      CMSG_TYPE_QOS_STATS		no		WQ low
      CMSG_TYPE_LAG_CONFIG		no		WQ low
      
      A subset of control messages can block waiting for an rtnl lock (from both
      work queue priority groups). The rtnl lock is heavily contended for by
      external processes such as systemd-udevd, systemd-network and libvirtd,
      especially during netdev creation, such as when flower VFs and representors
      are instantiated.
      
      Kernel netlink instrumentation shows that external processes (such as
      systemd-udevd) often use successive rtnl_trylock() sequences, which can
      cause an rtnl_lock()-blocked control message to starve for long periods
      during rtnl lock contention, i.e. during netdev creation.
      
      In the current design a single blocked control message will block the entire
      work queue (both priorities), and introduce a latency which is
      nondeterministic and dependent on system wide rtnl lock usage.
      
      In some extreme cases, one blocked control message at exactly the wrong time,
      just before the maximum number of VFs are instantiated, can block the work
      queue for long enough to prevent VF representor REIFY replies from getting
      handled in time for the 40ms timeout.
      
      The firmware will deliver the total maximum number of REIFY message replies in
      around 300us.
      
      Only REIFY and MTU update messages require replies within a timeout period (of
      40ms). The MTU-only updates are already done directly in the BH (tasklet)
      handler.
      
      Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
      caused by a blocked work queue waiting on rtnl locks.
      Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
      Signed-off-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
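The dispatch policy described above, with REIFY moved into the BH tasklet, can be sketched as a plain classification function. This is a minimal illustrative model with hypothetical names, not the actual nfp driver code:

```c
#include <assert.h>

/* Hypothetical reply-type and context identifiers; a sketch of the
 * dispatch table in the commit message, not the real nfp driver. */
enum cmsg_type {
    CMSG_TYPE_PORT_MOD,
    CMSG_TYPE_TUN_NEIGH,
    CMSG_TYPE_FLOW_STATS,
    CMSG_TYPE_PORT_REIFY,
    CMSG_TYPE_MERGE_HINT,
    CMSG_TYPE_NO_NEIGH,
    CMSG_TYPE_ACTIVE_TUNS,
    CMSG_TYPE_QOS_STATS,
    CMSG_TYPE_LAG_CONFIG,
};

enum handler_ctx { CTX_BH_TASKLET, CTX_WQ_HIGH, CTX_WQ_LOW };

/* After the fix: REIFY joins the time-critical replies in the BH
 * tasklet, so a workqueue blocked on rtnl_lock() can no longer delay
 * it past the 40ms reply timeout. */
static enum handler_ctx dispatch_reply(enum cmsg_type t, int needs_rtnl)
{
    switch (t) {
    case CMSG_TYPE_TUN_NEIGH:
    case CMSG_TYPE_FLOW_STATS:
    case CMSG_TYPE_PORT_REIFY:      /* moved from WQ-high to the BH */
        return CTX_BH_TASKLET;
    case CMSG_TYPE_PORT_MOD:
        /* MTU-only updates run in the BH; link changes need rtnl. */
        return needs_rtnl ? CTX_WQ_HIGH : CTX_BH_TASKLET;
    default:
        return CTX_WQ_LOW;
    }
}
```

With this split, a workqueue stalled on rtnl_lock() can delay only the rtnl-dependent replies; REIFY replies are serviced from the tasklet well inside the 40ms window.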
  4. 07 Sep 2019, 7 commits
  5. 06 Sep 2019, 6 commits
    • Merge tag 'wireless-drivers-for-davem-2019-09-05' of... · 74346c43
      Committed by David S. Miller
      Merge tag 'wireless-drivers-for-davem-2019-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 5.3
      
      Fourth set of fixes for 5.3, and hopefully really the last one. Quite
      a few CVE fixes this time but at least to my knowledge none of them
      have a known exploit.
      
      mt76
      
      * workaround firmware hang by disabling hardware encryption on MT7630E
      
      * disable 5GHz band for MT7630E as it's not working properly
      
      mwifiex
      
      * fix IE parsing to avoid a heap buffer overflow
      
      iwlwifi
      
      * fix for QuZ device initialisation
      
      rt2x00
      
      * another fix for rekeying
      
      * revert a commit causing degradation in rx signal levels
      
      rsi
      
      * fix a double free
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: add myself as maintainer for xilinx axiethernet driver · b0a3caea
      Committed by Radhey Shyam Pandey
      I am maintaining the xilinx axiethernet driver in the xilinx tree and would
      like to maintain it in the mainline kernel as well, hence I am adding
      myself as a maintainer. Also, Anirudha and John have moved to new roles,
      so as requested I am removing them from the maintainer list.
      Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Acked-by: John Linn <john.linn@xilinx.com>
      Acked-by: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: fix reordering issues · b88dd52c
      Committed by Eric Dumazet
      Whenever MQ is not used on a multiqueue device, we experience
      serious reordering problems. Bisection found the cited
      commit.
      
      The issue can be described this way:
      
      - A single qdisc hierarchy is shared by all transmit queues.
        (eg : tc qdisc replace dev eth0 root fq_codel)
      
      - When/if try_bulk_dequeue_skb_slow() dequeues a packet targeting
        a different transmit queue than the one used to build a packet train,
        we stop building the current list and save the 'bad' skb (P1) in a
        special queue. (bad_txq)
      
      - When dequeue_skb() calls qdisc_dequeue_skb_bad_txq() and finds this
        skb (P1), it checks whether the associated transmit queue is still
        frozen. If the queue is still blocked (by BQL or the NIC tx ring being full),
        we leave the skb in bad_txq and return NULL.
      
      - dequeue_skb() calls q->dequeue() to get another packet (P2)
      
        The other packet can target the problematic queue (that we found
        in frozen state for the bad_txq packet), but another cpu just ran
        TX completion and made room in the txq that is now ready to accept
        new packets.
      
      - Packet P2 is sent while P1 is still held in bad_txq, P1 might be sent
        at next round. In practice P2 is the lead of a big packet train
        (P2,P3,P4 ...) filling the BQL budget and delaying P1 by many packets :/
      
      To solve this problem, we have to block the dequeue process as long
      as the first packet in bad_txq cannot be sent. Reordering issues
      disappear and no side effects have been seen.
      
      Fixes: a53851e2 ("net: sched: explicit locking in gso_cpu fallback")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
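The fixed dequeue rule can be modelled in a few lines: while the packet parked in bad_txq targets a frozen queue, nothing else may be dequeued. A toy userspace sketch with hypothetical names, not the real net/sched code:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the fixed dequeue path. A packet parked in bad_txq must
 * be sent (or still be blocked) before any later packet may be
 * dequeued, which preserves ordering. */
struct pkt { int id; int txq; struct pkt *next; };

struct qdisc_model {
    struct pkt *bad_txq;     /* parked packet whose txq was frozen */
    struct pkt *queue;       /* normal FIFO of queued packets */
    const int *txq_frozen;   /* txq_frozen[i] != 0 => queue i blocked */
};

/* Returns the next packet to transmit, or NULL if we must wait. */
static struct pkt *dequeue_skb_model(struct qdisc_model *q)
{
    if (q->bad_txq) {
        struct pkt *p = q->bad_txq;
        if (q->txq_frozen[p->txq])
            return NULL;     /* block the whole dequeue: no reordering */
        q->bad_txq = p->next;
        return p;
    }
    if (q->queue) {
        struct pkt *p = q->queue;
        q->queue = p->next;
        return p;
    }
    return NULL;
}
```

Before the fix, the model would have fallen through to the normal queue while P1 was parked, letting P2 (and the train behind it) overtake P1.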
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 2e9550ed
      Committed by David S. Miller
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2019-09-05
      
      1) Several xfrm interface fixes from Nicolas Dichtel:
         - Avoid an interface ID corruption on changelink.
         - Fix wrong interface names in the logs.
         - Fix a list corruption when changing network namespaces.
         - Fix unregistration of the underlying phydev.
      
      2) Fix a potential warning when merging xfrm_policy nodes.
         From Florian Westphal.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • forcedeth: use per cpu to collect xmit/recv statistics · f4b633b9
      Committed by Zhu Yanjun
      When testing with a background iperf pushing 1Gbit/sec traffic and running
      both ifconfig and netstat to collect statistics, some deadlocks occurred.
      
      Ifconfig and netstat call nv_get_stats64 to get software xmit/recv
      statistics. Since commit f5d827ae ("forcedeth: implement
      ndo_get_stats64() API"), plain tx/rx variables have been used to collect
      tx/rx statistics. The fix is to replace them with per-cpu 64-bit
      variables, which avoid the deadlocks and provide fast, efficient
      statistics updates.
      
      In nv_probe, the per cpu variable is initialized. In nv_remove, this
      per cpu variable is freed.
      
      In xmit/recv process, this per cpu variable will be updated.
      
      In nv_get_stats64, this per cpu variable on each cpu is added up. Then
      the driver can get xmit/recv packets statistics.
      
      A test ran for several days with this commit; the deadlocks disappeared
      and performance improved.
      
      Tested:
         - iperf SMP x86_64 ->
         Client connecting to 1.1.1.108, TCP port 5001
         TCP window size: 85.0 KByte (default)
         ------------------------------------------------------------
         [  3] local 1.1.1.105 port 38888 connected with 1.1.1.108 port 5001
         [ ID] Interval       Transfer     Bandwidth
         [  3]  0.0-10.0 sec  1.10 GBytes   943 Mbits/sec
      
         ifconfig results:
      
         enp0s9 Link encap:Ethernet  HWaddr 00:21:28:6f:de:0f
                inet addr:1.1.1.105  Bcast:0.0.0.0  Mask:255.255.255.0
                UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                RX packets:5774764531 errors:0 dropped:0 overruns:0 frame:0
                TX packets:633534193 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:1000
                RX bytes:7646159340904 (7.6 TB) TX bytes:11425340407722 (11.4 TB)
      
         netstat results:
      
         Kernel Interface table
         Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
         ...
         enp0s9 1500 0  5774764531 0    0 0      633534193      0      0  0 BMRU
         ...
      
      Fixes: f5d827ae ("forcedeth: implement ndo_get_stats64() API")
      CC: Joe Jin <joe.jin@oracle.com>
      CC: JUNXIAO_BI <junxiao.bi@oracle.com>
      Reported-and-tested-by: Nan san <nan.1986san@gmail.com>
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
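The per-cpu counter scheme can be sketched in miniature: writers touch only their own CPU's slot with no locking, and the stats reader folds all slots together. Illustrative names only, not the forcedeth driver's actual structures:

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of per-cpu statistics: one slot per CPU, lock-free
 * updates on the hot path, summed only when the stats are read. */
#define NR_CPUS_MODEL 4

struct cpu_stats { uint64_t tx_packets, rx_packets; };

static struct cpu_stats per_cpu_stats[NR_CPUS_MODEL];

/* Called from the xmit/recv path on a given CPU: touches only that
 * CPU's slot, so no lock (and no deadlock) is possible. */
static void stats_inc_tx(int cpu) { per_cpu_stats[cpu].tx_packets++; }
static void stats_inc_rx(int cpu) { per_cpu_stats[cpu].rx_packets++; }

/* Called from the nv_get_stats64-style read path: fold all slots. */
static struct cpu_stats stats_fold(void)
{
    struct cpu_stats sum = { 0, 0 };
    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
        sum.tx_packets += per_cpu_stats[cpu].tx_packets;
        sum.rx_packets += per_cpu_stats[cpu].rx_packets;
    }
    return sum;
}
```

In the real driver the slots would be allocated in probe and freed in remove, matching the nv_probe/nv_remove lifecycle described above.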
    • net: sonic: return NETDEV_TX_OK if failed to map buffer · 6e1cdedc
      Committed by Mao Wenan
      NETDEV_TX_BUSY really should only be used by drivers that call
      netif_tx_stop_queue() at the wrong moment. If dma_map_single() fails to
      map the tx DMA buffer, returning NETDEV_TX_BUSY might trigger an infinite
      loop. This patch uses NETDEV_TX_OK instead of NETDEV_TX_BUSY, and changes
      printk to pr_err_ratelimited.
      
      Fixes: d9fb9f38 ("*sonic/natsemi/ns83829: Move the National Semi-conductor drivers")
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
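The corrected error path can be illustrated with a small model: on a mapping failure the packet is dropped, counted, and NETDEV_TX_OK is returned, so the core never requeues the same unmappable buffer. Hypothetical names, not the real sonic driver code:

```c
#include <assert.h>

/* Sketch of the fixed ndo_start_xmit-style error path. Returning
 * NETDEV_TX_BUSY here would make the core retry the identical buffer,
 * failing the same way each time: an infinite loop. */
#define NETDEV_TX_OK   0
#define NETDEV_TX_BUSY 1

struct tx_stats { unsigned long tx_dropped; };

/* map_ok stands in for whether dma_map_single() succeeded. */
static int model_start_xmit(int map_ok, struct tx_stats *st)
{
    if (!map_ok) {
        st->tx_dropped++;        /* free the skb, count the drop ... */
        return NETDEV_TX_OK;     /* ... and report the packet consumed */
    }
    /* the normal path would queue the tx descriptor here */
    return NETDEV_TX_OK;
}
```

NETDEV_TX_BUSY remains reserved for the case the commit names: the queue should have been stopped but a packet slipped through.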
  6. 05 Sep 2019, 14 commits