1. 24 Apr, 2020 3 commits
  2. 23 Apr, 2020 23 commits
    • ipv4: Update fib_select_default to handle nexthop objects · 7c74b0be
      David Ahern committed
      A user reported [0] hitting the WARN_ON in fib_info_nh:
      
          [ 8633.839816] ------------[ cut here ]------------
          [ 8633.839819] WARNING: CPU: 0 PID: 1719 at include/net/nexthop.h:251 fib_select_path+0x303/0x381
          ...
          [ 8633.839846] RIP: 0010:fib_select_path+0x303/0x381
          ...
          [ 8633.839848] RSP: 0018:ffffb04d407f7d00 EFLAGS: 00010286
          [ 8633.839850] RAX: 0000000000000000 RBX: ffff9460b9897ee8 RCX: 00000000000000fe
          [ 8633.839851] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000
          [ 8633.839852] RBP: ffff946076049850 R08: 0000000059263a83 R09: ffff9460840e4000
          [ 8633.839853] R10: 0000000000000014 R11: 0000000000000000 R12: ffffb04d407f7dc0
          [ 8633.839854] R13: ffffffffa4ce3240 R14: 0000000000000000 R15: ffff9460b7681f60
          [ 8633.839857] FS:  00007fcac2e02700(0000) GS:ffff9460bdc00000(0000) knlGS:0000000000000000
          [ 8633.839858] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          [ 8633.839859] CR2: 00007f27beb77e28 CR3: 0000000077734000 CR4: 00000000000006f0
          [ 8633.839867] Call Trace:
          [ 8633.839871]  ip_route_output_key_hash_rcu+0x421/0x890
          [ 8633.839873]  ip_route_output_key_hash+0x5e/0x80
          [ 8633.839876]  ip_route_output_flow+0x1a/0x50
          [ 8633.839878]  __ip4_datagram_connect+0x154/0x310
          [ 8633.839880]  ip4_datagram_connect+0x28/0x40
          [ 8633.839882]  __sys_connect+0xd6/0x100
          ...
      
      The WARN_ON is triggered in fib_select_default which is invoked when
      there are multiple default routes. Update the function to use
      fib_info_nhc and convert the nexthop checks to use fib_nh_common.
      
      Add test case that covers the affected code path.
      
      [0] https://github.com/FRRouting/frr/issues/6089
      
      Fixes: 493ced1a ("ipv4: Allow routes to use nexthop objects")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlabel: Kconfig: Update reference for NetLabel Tools project · c0259664
      Salvatore Bonaccorso committed
      The NetLabel Tools project has moved from http://netlabel.sf.net to a
      GitHub project. Update to directly refer to the new home for the tools.
      Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: update dpaa2-eth maintainer list · 31fa51ad
      Ioana Ciornei committed
      Add myself as another maintainer of dpaa2-eth.
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: fix data_fin handing in RX path · 9a19371b
      Paolo Abeni committed
      The data fin flag is set only via a DSS option, but
      mptcp_incoming_options() copies it unconditionally from the
      provided RX options.
      
      Since we do not clear all the mptcp sock RX options in a
      socket free/alloc cycle, we can end up with a stray data_fin
      value while parsing e.g. MPC packets.
      
      That would lead to mapping data corruption and will trigger
      a few WARN_ON() in the RX path.
      
      Instead of adding a costly memset(), fetch the data_fin flag
      only for DSS packets - when we always explicitly initialize
      such bit at option parsing time.
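
      The idea can be illustrated with a small userspace sketch. The
      struct and function names below (rx_opts, rx_ext, fill_rx_ext) are
      hypothetical stand-ins and are not the kernel's MPTCP data
      structures; the sketch only shows how gating the copy on the DSS
      flag keeps a stale data_fin value from leaking through.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parsed RX options (not the kernel struct). */
struct rx_opts {
    bool dss;       /* a DSS option was parsed from this packet */
    bool data_fin;  /* only initialized by the parser when dss is true */
};

/* Hypothetical per-packet state consumed by the rest of the RX path. */
struct rx_ext {
    bool data_fin;
};

/*
 * Copy data_fin only for DSS packets. For any other packet the field in
 * opts may still hold a value left over from an earlier use, so it is
 * ignored and the flag is cleared instead.
 */
static void fill_rx_ext(const struct rx_opts *opts, struct rx_ext *ext)
{
    ext->data_fin = opts->dss ? opts->data_fin : false;
}
```

      This avoids zeroing the whole options struct on every packet, at
      the cost of having to remember which fields are valid for which
      option type.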
      
      Fixes: 648ef4b8 ("mptcp: Implement MPTCP receive path")
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vrf: Fix IPv6 with qdisc and xfrm · a53c1028
      David Ahern committed
      When a qdisc is attached to the VRF device, the packet goes down the ndo
      xmit function which is set up to send the packet back to the VRF driver
      which does a lookup to send the packet out. The lookup in the VRF driver
      is not considering xfrm policies. Change it to use ip6_dst_lookup_flow
      rather than ip6_route_output.
      
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: add documentation of ping_group_range · 5cc4adbc
      Stephen Hemminger committed
      Support for non-root users to send ICMP ECHO requests was added
      back in the Linux 3.0 kernel, but the documentation for the sysctl
      to enable it has been missing.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sctp-fixes' · 609120c5
      David S. Miller committed
      Jere Leppänen says:
      
      ====================
      sctp: Fix problems with peer restart when in SHUTDOWN-PENDING state and socket is closed
      
      These patches are related to the scenario described in commit
      bdf6fa52 ("sctp: handle association restarts when the socket is
      closed."). To recap, when our association is in SHUTDOWN-PENDING state
      and we've closed our one-to-one socket, while the peer crashes without
      being detected, restarts and reconnects using the same addresses and
      ports, we start association shutdown.
      
      In this case, Cumulative TSN Ack in the SHUTDOWN that we send has
      always been incorrect. Additionally, bundling of the SHUTDOWN with the
      COOKIE-ACK was broken by a later commit. This series fixes both of
      these issues.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: Fix SHUTDOWN CTSN Ack in the peer restart case · 12dfd78e
      Jere Leppänen committed
      When starting shutdown in sctp_sf_do_dupcook_a(), get the value for
      SHUTDOWN Cumulative TSN Ack from the new association, which is
      reconstructed from the cookie, instead of the old association, which
      the peer doesn't have anymore.
      
      Otherwise the SHUTDOWN is either ignored or replied to with an ABORT
      by the peer because CTSN Ack doesn't match the peer's Initial TSN.
      
      Fixes: bdf6fa52 ("sctp: handle association restarts when the socket is closed.")
      Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: Fix bundling of SHUTDOWN with COOKIE-ACK · 145cb2f7
      Jere Leppänen committed
      When we start shutdown in sctp_sf_do_dupcook_a(), we want to bundle
      the SHUTDOWN with the COOKIE-ACK to ensure that the peer receives them
      at the same time and in the correct order. This bundling was broken by
      commit 4ff40b86 ("sctp: set chunk transport correctly when it's a
      new asoc"), which assigns a transport for the COOKIE-ACK, but not for
      the SHUTDOWN.
      
      Fix this by passing a reference to the COOKIE-ACK chunk as an argument
      to sctp_sf_do_9_2_start_shutdown() and onward to
      sctp_make_shutdown(). This way the SHUTDOWN chunk is assigned the same
      transport as the COOKIE-ACK chunk, which allows them to be bundled.
      
      In sctp_sf_do_9_2_start_shutdown(), the void *arg parameter was
      previously unused. Now that we're taking it into use, it must be a
      valid pointer to a chunk, or NULL. There is only one call site where
      it's not, in sctp_sf_autoclose_timer_expire(). Fix that too.
      
      Fixes: 4ff40b86 ("sctp: set chunk transport correctly when it's a new asoc")
      Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: don't fail to probe if we couldn't set the MTU · 72579e14
      Vladimir Oltean committed
      There is no reason to fail the probing of the switch if the MTU couldn't
      be configured correctly (either the switch port itself, or the host
      port) for whatever reason. MTU-sized traffic probably won't work, sure,
      but we can still probably limp on and support some form of communication
      anyway, which the users would probably appreciate more.
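
      A minimal sketch of the "warn and carry on" pattern described
      above, written as plain userspace C. The names set_mtu_fn,
      set_mtu_ok, set_mtu_always_fails and port_setup are hypothetical
      and do not correspond to functions in the DSA core.

```c
#include <stdio.h>

/* Hypothetical MTU setter: returns 0 on success, a negative code on error. */
typedef int (*set_mtu_fn)(int mtu);

/* Hypothetical setter that succeeds. */
static int set_mtu_ok(int mtu)
{
    (void)mtu;
    return 0;
}

/* Hypothetical setter that always fails, to exercise the error path. */
static int set_mtu_always_fails(int mtu)
{
    (void)mtu;
    return -1;
}

/*
 * Set up a port. A failure to configure the MTU is reported as a warning
 * but is deliberately not propagated, so the caller's probe still succeeds.
 */
static int port_setup(set_mtu_fn set_mtu, int mtu)
{
    int err = set_mtu(mtu);

    if (err)
        fprintf(stderr, "warning: could not set MTU to %d (error %d), continuing\n",
                mtu, err);

    return 0;
}
```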
      
      Fixes: bfcb8132 ("net: dsa: configure the MTU for switch ports")
      Reported-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sched: etf: do not assume all sockets are full blown · a1211bf9
      Eric Dumazet committed
      skb->sk does not always point to a full blown socket,
      we need to use sk_fullsock() before accessing fields which
      only make sense on full socket.
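
      The pattern can be sketched in userspace C as follows. The types
      mini_sock and full_sock and the helpers is_fullsock() and
      should_report_error() are hypothetical simplifications; they only
      mirror the idea behind sk_fullsock(), not the kernel's socket
      layout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical socket kinds; only KIND_FULL carries the extended fields. */
enum sock_kind { KIND_FULL, KIND_REQUEST, KIND_TIMEWAIT };

/* Hypothetical common header shared by every socket flavor. */
struct mini_sock {
    enum sock_kind kind;
};

/* Hypothetical full socket: the common header plus extra fields. */
struct full_sock {
    struct mini_sock base;   /* must stay the first member */
    bool report_errors;      /* only exists on full sockets */
};

/* Analogue of sk_fullsock(): true only if the extended fields exist. */
static bool is_fullsock(const struct mini_sock *sk)
{
    return sk && sk->kind == KIND_FULL;
}

/* Read a full-socket-only field, but only after checking the kind. */
static bool should_report_error(const struct mini_sock *sk)
{
    if (!is_fullsock(sk))
        return false;

    /* Safe: base is the first member of struct full_sock. */
    return ((const struct full_sock *)sk)->report_errors;
}
```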
      
      BUG: KASAN: use-after-free in report_sock_error+0x286/0x300 net/sched/sch_etf.c:141
      Read of size 1 at addr ffff88805eb9b245 by task syz-executor.5/9630
      
      CPU: 1 PID: 9630 Comm: syz-executor.5 Not tainted 5.7.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x188/0x20d lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd3/0x315 mm/kasan/report.c:382
       __kasan_report.cold+0x35/0x4d mm/kasan/report.c:511
       kasan_report+0x33/0x50 mm/kasan/common.c:625
       report_sock_error+0x286/0x300 net/sched/sch_etf.c:141
       etf_enqueue_timesortedlist+0x389/0x740 net/sched/sch_etf.c:170
       __dev_xmit_skb net/core/dev.c:3710 [inline]
       __dev_queue_xmit+0x154a/0x30a0 net/core/dev.c:4021
       neigh_hh_output include/net/neighbour.h:499 [inline]
       neigh_output include/net/neighbour.h:508 [inline]
       ip6_finish_output2+0xfb5/0x25b0 net/ipv6/ip6_output.c:117
       __ip6_finish_output+0x442/0xab0 net/ipv6/ip6_output.c:143
       ip6_finish_output+0x34/0x1f0 net/ipv6/ip6_output.c:153
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip6_output+0x239/0x810 net/ipv6/ip6_output.c:176
       dst_output include/net/dst.h:435 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       NF_HOOK include/linux/netfilter.h:301 [inline]
       ip6_xmit+0xe1a/0x2090 net/ipv6/ip6_output.c:280
       tcp_v6_send_synack+0x4e7/0x960 net/ipv6/tcp_ipv6.c:521
       tcp_rtx_synack+0x10d/0x1a0 net/ipv4/tcp_output.c:3916
       inet_rtx_syn_ack net/ipv4/inet_connection_sock.c:669 [inline]
       reqsk_timer_handler+0x4c2/0xb40 net/ipv4/inet_connection_sock.c:763
       call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1405
       expire_timers kernel/time/timer.c:1450 [inline]
       __run_timers kernel/time/timer.c:1774 [inline]
       __run_timers kernel/time/timer.c:1741 [inline]
       run_timer_softirq+0x623/0x1600 kernel/time/timer.c:1787
       __do_softirq+0x26c/0x9f7 kernel/softirq.c:292
       invoke_softirq kernel/softirq.c:373 [inline]
       irq_exit+0x192/0x1d0 kernel/softirq.c:413
       exiting_irq arch/x86/include/asm/apic.h:546 [inline]
       smp_apic_timer_interrupt+0x19e/0x600 arch/x86/kernel/apic/apic.c:1140
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
       </IRQ>
      RIP: 0010:des_encrypt+0x157/0x9c0 lib/crypto/des.c:792
      Code: 85 22 06 00 00 41 31 dc 41 8b 4d 04 44 89 e2 41 83 e4 3f 4a 8d 3c a5 60 72 72 88 81 e2 3f 3f 3f 3f 48 89 f8 48 c1 e8 03 31 d9 <0f> b6 34 28 48 89 f8 c1 c9 04 83 e0 07 83 c0 03 40 38 f0 7c 09 40
      RSP: 0018:ffffc90003b5f6c0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
      RAX: 1ffffffff10e4e55 RBX: 00000000d2f846d0 RCX: 00000000d2f846d0
      RDX: 0000000012380612 RSI: ffffffff839863ca RDI: ffffffff887272a8
      RBP: dffffc0000000000 R08: ffff888091d0a380 R09: 0000000000800081
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000012
      R13: ffff8880a8ae8078 R14: 00000000c545c93e R15: 0000000000000006
       cipher_crypt_one crypto/cipher.c:75 [inline]
       crypto_cipher_encrypt_one+0x124/0x210 crypto/cipher.c:82
       crypto_cbcmac_digest_update+0x1b5/0x250 crypto/ccm.c:830
       crypto_shash_update+0xc4/0x120 crypto/shash.c:119
       shash_ahash_update+0xa3/0x110 crypto/shash.c:246
       crypto_ahash_update include/crypto/hash.h:547 [inline]
       hash_sendmsg+0x518/0xad0 crypto/algif_hash.c:102
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x308/0x7e0 net/socket.c:2362
       ___sys_sendmsg+0x100/0x170 net/socket.c:2416
       __sys_sendmmsg+0x195/0x480 net/socket.c:2506
       __do_sys_sendmmsg net/socket.c:2535 [inline]
       __se_sys_sendmmsg net/socket.c:2532 [inline]
       __x64_sys_sendmmsg+0x99/0x100 net/socket.c:2532
       do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      RIP: 0033:0x45c829
      Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f6d9528ec78 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 00000000004fc080 RCX: 000000000045c829
      RDX: 0000000000000001 RSI: 0000000020002640 RDI: 0000000000000004
      RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000008d7 R14: 00000000004cb7aa R15: 00007f6d9528f6d4
      
      Fixes: 4b15c707 ("net/sched: Make etf report drops on error_queue")
      Fixes: 25db26a9 ("net/sched: Introduce the ETF Qdisc")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: Fix suppress test in fib_tests.sh · 2c1dd4c1
      David Ahern committed
      fib_tests is spewing errors:
          ...
          Cannot open network namespace "ns1": No such file or directory
          Cannot open network namespace "ns1": No such file or directory
          Cannot open network namespace "ns1": No such file or directory
          Cannot open network namespace "ns1": No such file or directory
          ping: connect: Network is unreachable
          Cannot open network namespace "ns1": No such file or directory
          Cannot open network namespace "ns1": No such file or directory
          ...
      
      Each test entry in fib_tests is supposed to do its own setup and
      cleanup. Right now the $IP commands in fib_suppress_test are
      failing because there is no ns1. Add the setup/cleanup and logging
      expected for each test.
      
      Fixes: ca7a03c4 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-dsa-b53-Various-ARL-fixes' · d5812a86
      David S. Miller committed
      Florian Fainelli says:
      
      ====================
      net: dsa: b53: Various ARL fixes
      
      This patch series fixes a number of shortcomings in the existing b53
      driver ARL management logic, in particular:
      
      - we were not looking up the {MAC,VID} tuples against their VID, despite
        having VLANs enabled
      
      - the MDB entries (multicast) would lose their validity as soon as a
        single port in the vector would leave the entry
      
      - the ARL was underutilized because we would always place new
        entries in bin index #1, instead of using all possible bins available,
        thus reducing the ARL effective size by 50% or 75% depending on the
        switch generation
      
      - it was possible to overwrite the ARL entries because no proper space
        verification was done
      
      This patch series addresses all of these issues.
      
      Changes in v2:
      - added a new patch to correctly flip individual VLAN learning vs. shared
        VLAN learning depending on the global VLAN state
      
      - added Andrew's R-b tags for patches which did not change
      
      - corrected some verbosity and minor issues in patch #4 to match caller
        expectations, also avoid a variable length DECLARE_BITMAP() call
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: b53_arl_rw_op() needs to select IVL or SVL · 64fec949
      Florian Fainelli committed
      Flip the IVL_SVL_SELECT bit correctly based on the VLAN enable status,
      the default is to perform Shared VLAN learning instead of Individual
      learning.
      
      Fixes: 1da6df85 ("net: dsa: b53: Implement ARL add/del/dump operations")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Rework ARL bin logic · 6344dbde
      Florian Fainelli committed
      When asking the ARL to read a MAC address, we will get a number of bins
      returned in a single read. Out of those bins, there can essentially be 3
      states:
      
      - all bins are full, we have no space left, and we can either replace an
        existing address or return that full condition
      
      - the MAC address was found, then we need to return its bin index and
        modify that one, and only that one
      
      - the MAC address was not found and we have at least one bin free, we use
        that bin index location then
      
      The code would unfortunately fail on all counts.
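
      The three states listed above can be sketched as a single scan
      over the bins returned by one read. This is an illustrative
      userspace model only: struct arl_bin, ARL_FULL and arl_pick_bin()
      are hypothetical names, and the real driver works on hardware
      register values rather than an in-memory array.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARL_FULL (-1)   /* hypothetical "no space left" return value */

/* Hypothetical in-memory view of one ARL bin. */
struct arl_bin {
    bool valid;
    uint64_t mac;
    uint16_t vid;
};

/*
 * Scan the bins returned by one read and pick the one to use:
 *  - if {mac, vid} is already present, return its index and set *found;
 *  - otherwise return the first free bin, with *found cleared;
 *  - if there is no match and no free bin, return ARL_FULL.
 */
static int arl_pick_bin(const struct arl_bin *bins, int nbins,
                        uint64_t mac, uint16_t vid, bool *found)
{
    int free_idx = ARL_FULL;

    *found = false;
    for (int i = 0; i < nbins; i++) {
        if (!bins[i].valid) {
            /* Remember only the first free slot, keep looking for a match. */
            if (free_idx == ARL_FULL)
                free_idx = i;
            continue;
        }
        if (bins[i].mac == mac && bins[i].vid == vid) {
            *found = true;
            return i;
        }
    }
    return free_idx;
}
```

      Continuing the scan after the first free bin matters: stopping
      early could miss an existing entry in a later bin and create a
      duplicate.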
      
      Fixes: 1da6df85 ("net: dsa: b53: Implement ARL add/del/dump operations")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Fix ARL register definitions · c2e77a18
      Florian Fainelli committed
      The ARL {MAC,VID} tuple and the forward entry were off by 0x10 bytes,
      which means that when we read/wrote from/to ARL bin index 0, we were
      actually accessing the ARLA_RWCTRL register.
      
      Fixes: 1da6df85 ("net: dsa: b53: Implement ARL add/del/dump operations")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Fix valid setting for MDB entries · eab167f4
      Florian Fainelli committed
      When support for the MDB entries was added, the valid bit was correctly
      changed to be assigned depending on the remaining port bitmask, that is,
      if there were no more ports added to the entry's port bitmask, the entry
      now becomes invalid. There was another assignment a few lines below that
      would override this which would invalidate entries even when there were
      still multiple ports left in the MDB entry.
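
      A small sketch of the intended behavior, assuming a hypothetical
      struct mdb_entry with a 16-bit port bitmask and a hypothetical
      helper mdb_del_port(): the valid bit is derived from the remaining
      port mask and is not overwritten afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical multicast entry: one bit per member port (ports 0..15). */
struct mdb_entry {
    uint16_t port_mask;
    bool is_valid;
};

/*
 * Remove one port from the entry. The entry stays valid for as long as
 * at least one port remains in the mask; nothing after this assignment
 * may overwrite is_valid.
 */
static void mdb_del_port(struct mdb_entry *ent, unsigned int port)
{
    ent->port_mask &= (uint16_t)~(1u << port);
    ent->is_valid = ent->port_mask != 0;
}
```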
      
      Fixes: 5d65b64a ("net: dsa: b53: Add support for MDB")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Lookup VID in ARL searches when VLAN is enabled · 2e97b0cd
      Florian Fainelli committed
      When VLAN is enabled, and an ARL search is issued, we also need to
      compare the full {MAC,VID} tuple before returning a successful search
      result.
      
      Fixes: 1da6df85 ("net: dsa: b53: Implement ARL add/del/dump operations")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'vrf-looping' · 87f78f27
      David S. Miller committed
      David Ahern says:
      
      ====================
      net: Fix looping with vrf, xfrms and qdisc on VRF
      
      Trev reported that use of VRFs with xfrms is looping when a qdisc
      is added to the VRF device. The combination of xfrm + qdisc is not
      handled by the VRF driver which lost track that it has already
      seen the packet.
      
      The XFRM_TRANSFORMED flag is used by the netfilter code for a similar
      purpose, so re-use it for VRF. Patch 1 drops the #ifdef around setting
      the flag in the xfrm output functions. Patch 2 adds a check to
      the VRF driver for the flag; if set, the packet has already passed through
      the VRF driver once and does not need to be recirculated a second time.
      
      This is a day 1 bug with VRFs; stable wise, I would only take this
      back to 4.14. I have a set of test cases which I will submit to
      net-next.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vrf: Check skb for XFRM_TRANSFORMED flag · 16b9db1c
      David Ahern committed
      To avoid a loop with qdiscs and xfrms, check if the skb has already gone
      through the qdisc attached to the VRF device and then to the xfrm layer.
      If so, no need for a second redirect.
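
      As a rough illustration of the check, here is a userspace sketch
      with a hypothetical packet struct and flag value. PKT_XFRM_TRANSFORMED,
      struct pkt and vrf_needs_redirect() are made-up names; the kernel
      keeps the real flags in the skb control block.

```c
#include <stdbool.h>

/* Hypothetical flag bit marking a packet that already went through xfrm. */
#define PKT_XFRM_TRANSFORMED 0x1u

/* Hypothetical packet with a flags word. */
struct pkt {
    unsigned int flags;
};

/*
 * A packet that already carries the transformed flag has been through the
 * VRF device once, so redirecting it again would create a loop.
 */
static bool vrf_needs_redirect(const struct pkt *p)
{
    return !(p->flags & PKT_XFRM_TRANSFORMED);
}
```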
      
      Fixes: 193125db ("net: Introduce VRF device driver")
      Reported-by: Trev Larock <trev@larock.ca>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xfrm: Always set XFRM_TRANSFORMED in xfrm{4,6}_output_finish · 0c922a48
      David Ahern committed
      IPSKB_XFRM_TRANSFORMED and IP6SKB_XFRM_TRANSFORMED are skb flags set by
      xfrm code to tell other skb handlers that the packet has been passed
      through the xfrm output functions. Simplify the code and just always
      set them rather than conditionally based on netfilter enabled thus
      making the flag available for other users.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: ndisc: RFC-ietf-6man-ra-pref64-09 is now published as RFC8781 · 9175d3f3
      Maciej Żenczykowski committed
      See:
        https://www.rfc-editor.org/authors/rfc8781.txt
      
      Cc: Erik Kline <ek@google.com>
      Cc: Jen Linkova <furry@google.com>
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Cc: Michael Haro <mharo@google.com>
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Fixes: c24a77ed ("ipv6: ndisc: add support for 'PREF64' dns64 prefix identifier")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: microchip_t1: add lan87xx_phy_init to initialize the lan87xx phy. · 63edbcce
      Yuiko Oshino committed
      lan87xx_phy_init() initializes the lan87xx phy hardware
      including its TC10 Wake-up and Sleep features.
      
      Fixes: 3e50d2da ("Add driver for Microchip LAN87XX T1 PHYs")
      Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
      v0->v1:
          - Add more details in the commit message and source comments.
          - Update to the latest initialization sequences.
          - Add access_ereg_modify_changed().
          - Fix access_ereg() to access SMI bank correctly.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 22 Apr, 2020 8 commits
    • net: stmmac: Enable SERDES power up/down sequence · b9663b7c
      Voon Weifeng committed
      This patch is to enable Intel SERDES power up/down sequence. The SERDES
      converts 8/10 bits data to SGMII signal. Below is an example of
      HW configuration for SGMII mode. The SERDES is located in the PHY IF
      in the diagram below.
      
      <-----------------GBE Controller---------->|<--External PHY chip-->
      +----------+         +----+            +---+           +----------+
      |   EQoS   | <-GMII->| DW | < ------ > |PHY| <-SGMII-> | External |
      |   MAC    |         |xPCS|            |IF |           | PHY      |
      +----------+         +----+            +---+           +----------+
             ^               ^                 ^                ^
             |               |                 |                |
             +---------------------MDIO-------------------------+
      
      PHY IF configuration and status registers are accessible through
      mdio address 0x15 which is defined as mdio_adhoc_addr. During D0,
      the driver will need to power up PHY IF by changing the power state
      to P0. Likewise, for D3, the driver sets PHY IF power state to P3.
      Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: broadcom: convert to devm_platform_ioremap_resource_byname() · d7a5502b
      Dejin Zheng committed
      Use the function devm_platform_ioremap_resource_byname() to simplify
      source code which calls the functions platform_get_resource_byname()
      and devm_ioremap_resource(). Also remove a few error messages which
      became unnecessary with this refactoring.
      Suggested-by: Markus Elfring <Markus.Elfring@web.de>
      Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: fix null dereference in macvlan_device_event() · 4dee15b4
      Taehee Yoo committed
      In the macvlan_device_event(), the list_first_entry_or_null() is used.
      This function could return null pointer if there is no node.
      But, the macvlan module doesn't check the null pointer.
      So, null-ptr-deref would occur.
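
      The missing check can be illustrated with a self-contained
      userspace sketch. The list and entry types below (list_node,
      struct vlan) and the helpers first_vlan_or_null() and
      first_vlan_id() are hypothetical and only mimic the shape of the
      kernel's list_first_entry_or_null(); this is not the macvlan code.

```c
#include <stddef.h>

/* Hypothetical circular doubly linked list node, in the style of list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical entry type embedded in the list. */
struct vlan {
    int id;
    struct list_node node;
};

/* Return the first entry on the list, or NULL when the list is empty. */
static struct vlan *first_vlan_or_null(struct list_node *head)
{
    if (head->next == head)
        return NULL;

    /* Step back from the embedded node to the start of the entry. */
    return (struct vlan *)((char *)head->next - offsetof(struct vlan, node));
}

/*
 * Use the first entry only after checking for NULL. Returns -1 when there
 * is no entry, instead of dereferencing a null pointer.
 */
static int first_vlan_id(struct list_node *head)
{
    struct vlan *v = first_vlan_or_null(head);

    if (!v)
        return -1;

    return v->id;
}
```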
      
            bond0
              |
         +----+-----+
         |          |
      macvlan0   macvlan1
         |          |
       dummy0     dummy1
      
      The problem scenario.
      If dummy1 is removed,
      1. ->dellink() of dummy1 is called.
      2. NETDEV_UNREGISTER of dummy1 notification is sent to macvlan module.
      3. ->dellink() of macvlan1 is called.
      4. NETDEV_UNREGISTER of macvlan1 notification is sent to bond module.
      5. __bond_release_one() is called and it internally calls
         dev_set_mac_address().
      6. dev_set_mac_address() calls the ->ndo_set_mac_address() of macvlan1,
         which is macvlan_set_mac_address().
      7. macvlan_set_mac_address() calls the dev_set_mac_address() with dummy1.
      8. NETDEV_CHANGEADDR of dummy1 is sent to macvlan module.
      9. In the macvlan_device_event(), it calls list_first_entry_or_null().
      At this point, dummy1 and macvlan1 were removed.
      So, list_first_entry_or_null() will return NULL.
      
      Test commands:
          ip netns add nst
          ip netns exec nst ip link add bond0 type bond
          for i in {0..10}
          do
              ip netns exec nst ip link add dummy$i type dummy
              ip netns exec nst ip link add macvlan$i link dummy$i \
                  type macvlan mode passthru
              ip netns exec nst ip link set macvlan$i master bond0
          done
          ip netns del nst
      
      Splat looks like:
      [   40.585687][  T146] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEI
      [   40.587249][  T146] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      [   40.588342][  T146] CPU: 1 PID: 146 Comm: kworker/u8:2 Not tainted 5.7.0-rc1+ #532
      [   40.589299][  T146] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   40.590469][  T146] Workqueue: netns cleanup_net
      [   40.591045][  T146] RIP: 0010:macvlan_device_event+0x4e2/0x900 [macvlan]
      [   40.591905][  T146] Code: 00 00 00 00 00 fc ff df 80 3c 06 00 0f 85 45 02 00 00 48 89 da 48 b8 00 00 00 00 00 fc ff d2
      [   40.594126][  T146] RSP: 0018:ffff88806116f4a0 EFLAGS: 00010246
      [   40.594783][  T146] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [   40.595653][  T146] RDX: 0000000000000000 RSI: ffff88806547ddd8 RDI: ffff8880540f1360
      [   40.596495][  T146] RBP: ffff88804011a808 R08: fffffbfff4fb8421 R09: fffffbfff4fb8421
      [   40.597377][  T146] R10: ffffffffa7dc2107 R11: 0000000000000000 R12: 0000000000000008
      [   40.598186][  T146] R13: ffff88804011a000 R14: ffff8880540f1000 R15: 1ffff1100c22de9a
      [   40.599012][  T146] FS:  0000000000000000(0000) GS:ffff888067800000(0000) knlGS:0000000000000000
      [   40.600004][  T146] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   40.600665][  T146] CR2: 00005572d3a807b8 CR3: 000000005fcf4003 CR4: 00000000000606e0
      [   40.601485][  T146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   40.602461][  T146] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   40.603443][  T146] Call Trace:
      [   40.603871][  T146]  ? nf_tables_dump_setelem+0xa0/0xa0 [nf_tables]
      [   40.604587][  T146]  ? macvlan_uninit+0x100/0x100 [macvlan]
      [   40.605212][  T146]  ? __module_text_address+0x13/0x140
      [   40.605842][  T146]  notifier_call_chain+0x90/0x160
      [   40.606477][  T146]  dev_set_mac_address+0x28e/0x3f0
      [   40.607117][  T146]  ? netdev_notify_peers+0xc0/0xc0
      [   40.607762][  T146]  ? __module_text_address+0x13/0x140
      [   40.608440][  T146]  ? notifier_call_chain+0x90/0x160
      [   40.609097][  T146]  ? dev_set_mac_address+0x1f0/0x3f0
      [   40.609758][  T146]  dev_set_mac_address+0x1f0/0x3f0
      [   40.610402][  T146]  ? __local_bh_enable_ip+0xe9/0x1b0
      [   40.611071][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
      [   40.611823][  T146]  ? netdev_notify_peers+0xc0/0xc0
      [   40.612461][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
      [   40.613213][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
      [   40.613963][  T146]  ? __local_bh_enable_ip+0xe9/0x1b0
      [   40.614631][  T146]  ? bond_time_in_interval.isra.31+0x90/0x90 [bonding]
      [   40.615484][  T146]  ? __bond_release_one+0x9f0/0x12c0 [bonding]
      [   40.616230][  T146]  __bond_release_one+0x9f0/0x12c0 [bonding]
      [   40.616949][  T146]  ? bond_enslave+0x47c0/0x47c0 [bonding]
      [   40.617642][  T146]  ? lock_downgrade+0x730/0x730
      [   40.618218][  T146]  ? check_flags.part.42+0x450/0x450
      [   40.618850][  T146]  ? __mutex_unlock_slowpath+0xd0/0x670
      [   40.619519][  T146]  ? trace_hardirqs_on+0x30/0x180
      [   40.620117][  T146]  ? wait_for_completion+0x250/0x250
      [   40.620754][  T146]  bond_netdev_event+0x822/0x970 [bonding]
      [   40.621460][  T146]  ? __module_text_address+0x13/0x140
      [   40.622097][  T146]  notifier_call_chain+0x90/0x160
      [   40.622806][  T146]  rollback_registered_many+0x660/0xcf0
      [   40.623522][  T146]  ? netif_set_real_num_tx_queues+0x780/0x780
      [   40.624290][  T146]  ? notifier_call_chain+0x90/0x160
      [   40.624957][  T146]  ? netdev_upper_dev_unlink+0x114/0x180
      [   40.625686][  T146]  ? __netdev_adjacent_dev_unlink_neighbour+0x30/0x30
      [   40.626421][  T146]  ? mutex_is_locked+0x13/0x50
      [   40.627016][  T146]  ? unregister_netdevice_queue+0xf2/0x240
      [   40.627663][  T146]  unregister_netdevice_many.part.134+0x13/0x1b0
      [   40.628362][  T146]  default_device_exit_batch+0x2d9/0x390
      [   40.628987][  T146]  ? unregister_netdevice_many+0x40/0x40
      [   40.629615][  T146]  ? dev_change_net_namespace+0xcb0/0xcb0
      [   40.630279][  T146]  ? prepare_to_wait_exclusive+0x2e0/0x2e0
      [   40.630943][  T146]  ? ops_exit_list.isra.9+0x97/0x140
      [   40.631554][  T146]  cleanup_net+0x441/0x890
      [ ... ]
      
      Fixes: e289fd28 ("macvlan: fix the problem when mac address changes for passthru mode")
      Reported-by: syzbot+5035b1f9dc7ea4558d5a@syzkaller.appspotmail.com
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4dee15b4
    • J
      e1000: remove unneeded conversion to bool · c95576a3
      Jason Yan committed
      The '==' expression already evaluates to bool, so there is no need to convert it to bool again.
      This fixes the following coccicheck warning:
      
      drivers/net/ethernet/intel/e1000/e1000_main.c:1479:44-49: WARNING:
      conversion to bool not needed here
      Signed-off-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c95576a3
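      The coccicheck warning quoted in this commit concerns a comparison whose
      result is converted to bool a second time. A minimal sketch of the pattern
      and its cleanup follows; the function and parameter names are hypothetical
      and are not taken from the e1000 driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, before the cleanup: the ternary converts a value
 * that is already boolean. */
static bool is_full_duplex_before(int duplex, int full)
{
	return (duplex == full) ? true : false;
}

/* After the cleanup: the comparison result is used directly. */
static bool is_full_duplex_after(int duplex, int full)
{
	return duplex == full;
}

/* Both forms agree for every input tried here. */
static void check_bool_cleanup(void)
{
	for (int duplex = 0; duplex < 3; duplex++)
		assert(is_full_duplex_before(duplex, 1) ==
		       is_full_duplex_after(duplex, 1));
}
```

      Calling check_bool_cleanup() from a test program passes silently, since
      the two forms are equivalent.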
    • J
      i40e: Remove unneeded conversion to bool · 7ff4f063
      Jason Yan committed
      The '==' expression already evaluates to bool, so there is no need to convert it to bool again.
      This fixes the following coccicheck warning:
      
      drivers/net/ethernet/intel/i40e/i40e_main.c:1614:52-57: WARNING:
      conversion to bool not needed here
      drivers/net/ethernet/intel/i40e/i40e_main.c:11439:52-57: WARNING:
      conversion to bool not needed here
      Signed-off-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7ff4f063
    • J
      ptp: Remove unneeded conversion to bool · e9a9e519
      Jason Yan committed
      The '==' expression already evaluates to bool, so there is no need to convert it to bool again.
      This fixes the following coccicheck warning:
      
      drivers/ptp/ptp_ines.c:403:55-60: WARNING: conversion to bool not
      needed here
      drivers/ptp/ptp_ines.c:404:55-60: WARNING: conversion to bool not
      needed here
      Signed-off-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9a9e519
    • J
      cgroup, netclassid: remove double cond_resched · 526f3d96
      Jiri Slaby committed
      Commit 018d26fc ("cgroup, netclassid: periodically release file_lock
      on classid") added a second cond_resched to write_classid indirectly, via
      update_classid_task. Remove the one in write_classid.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Dmitry Yakunin <zeil@yandex-team.ru>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      526f3d96
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 76fc6a9a
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) flow_block_cb memleak in nf_flow_table_offload_del_cb(), from Roi Dayan.
      
      2) Fix error path handling in nf_nat_inet_register_fn(), from Hillf Danton.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76fc6a9a
  4. 21 Apr, 2020 6 commits
    • D
    • Z
      net/mlx5e: Get the latest values from counters in switchdev mode · dcdf4ce0
      Zhu Yanjun committed
      In switchdev mode, running "cat
      /sys/class/net/NIC/statistics/tx_packets" reads the ppcnt register
      to get the latest values. Currently, however, this command does
      not get the correct values from ppcnt.
      
      According to the firmware manual, the 802_3 data layout must be
      set in the ppcnt register before the 802_3 counters are read.
      
      When the command "cat /sys/class/net/NIC/statistics/tx_packets" is
      run, the monitor counters are tested before the 802_3 data layout
      is updated in the ppcnt register. The test result decides whether
      the 802_3 data layout is updated.
      
      The monitor counters do not support monitoring the rx/tx stats of
      802_3 in switchdev mode, so a change in the rx/tx counters does
      not trigger the monitor counters, and the 802_3 data layout is not
      updated in the ppcnt register. As a result, the command cannot get
      the latest values from the ppcnt register with the 802_3 data layout.
      
      Fixes: 5c7e8bbb ("net/mlx5e: Use monitor counters for update stats")
      Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      dcdf4ce0
    • S
      net/mlx5: Kconfig: convert imply usage to weak dependency · 96c34151
      Saeed Mahameed committed
      MLX5_CORE uses the 'imply' keyword to depend on VXLAN, PTP_1588_CLOCK,
      MLXFW and PCI_HYPERV_INTERFACE.
      
      This was useful to force vxlan, ptp, etc. to be reachable to mlx5
      regardless of their config states.
      
      Due to the changes in the cited commit below, the semantics of 'imply'
      were changed to not force any restriction on the implied config.
      
      As a result of this change, the compilation of MLX5_CORE=y and VXLAN=m
      would result in undefined references, as VXLAN now would stay as 'm'.
      
      To fix this we change MLX5_CORE to have a weak dependency on
      these modules/configs and make sure they are reachable, by adding:
      depends on symbol || !symbol.
      
      For example, with VXLAN=m and MLX5_CORE=y, this forces MLX5_CORE to m.
      
      Fixes: def2fbff ("kconfig: allow symbols implied by y to become m")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nicolas Pitre <nico@fluxnic.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      96c34151
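      The effect of the weak dependency described in this commit can be worked
      through with Kconfig's tristate arithmetic (n < m < y, where '!' maps x
      to 2 - x, '||' takes the maximum, and 'depends on' caps the dependent
      symbol). The sketch below models that arithmetic in plain C; the type
      and function names are hypothetical and are not part of the kconfig tool.

```c
#include <assert.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig expression semantics: !x is 2 - x, a || b is max(a, b). */
static enum tristate tri_not(enum tristate x)
{
	return (enum tristate)(2 - x);
}

static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* Upper limit that "depends on SYM || !SYM" places on the dependent symbol. */
static enum tristate weak_dep_limit(enum tristate sym)
{
	return tri_or(sym, tri_not(sym));
}

/* Final value of the dependent symbol: the requested value, capped by the limit. */
static enum tristate resolve(enum tristate requested, enum tristate sym)
{
	enum tristate limit = weak_dep_limit(sym);

	return requested < limit ? requested : limit;
}

/* The case from the commit message: VXLAN=m caps MLX5_CORE=y at m. */
static void check_weak_dependency(void)
{
	assert(resolve(TRI_Y, TRI_M) == TRI_M);
}
```

      With the symbol set to y or n the limit is y, so the dependent symbol is
      unrestricted; only the m case lowers it, which is what keeps a built-in
      MLX5_CORE from referencing a modular VXLAN.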
    • M
      net/mlx5e: Don't trigger IRQ multiple times on XSK wakeup to avoid WQ overruns · e7e0004a
      Maxim Mikityanskiy committed
      XSK wakeup function triggers NAPI by posting a NOP WQE to a special XSK
      ICOSQ. When the application floods the driver with wakeup requests by
      calling sendto() in a certain pattern that ends up in mlx5e_trigger_irq,
      the XSK ICOSQ may overflow.
      
      Multiple NOPs are not required and won't accelerate the process, so
      avoid posting a second NOP if there is one already on the way. This way
      we also avoid increasing the queue size (which might not help anyway).
      
      Fixes: db05815b ("net/mlx5e: Add XSK zero-copy support")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      e7e0004a
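      The idea of skipping a second NOP while one is already on the way can be
      sketched in user-space C with an atomic "pending" flag. This is an
      illustration only, assuming C11 atomics; the struct, field and function
      names are hypothetical and do not reflect the mlx5e driver's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the wakeup queue: only what the sketch needs. */
struct wakeup_queue {
	atomic_bool nop_pending;  /* true while a NOP is in flight */
	unsigned int nops_posted; /* number of NOPs actually posted */
};

static void wakeup_queue_init(struct wakeup_queue *q)
{
	atomic_init(&q->nop_pending, false);
	q->nops_posted = 0;
}

/* Wakeup request: post a NOP only if none is already on the way.
 * Returns true if a NOP was posted by this call. */
static bool request_wakeup(struct wakeup_queue *q)
{
	if (atomic_exchange(&q->nop_pending, true))
		return false; /* a NOP is already pending; skip this one */
	q->nops_posted++;
	return true;
}

/* Completion handling: the pending NOP was consumed, allow the next one. */
static void nop_completed(struct wakeup_queue *q)
{
	atomic_store(&q->nop_pending, false);
}

/* A burst of requests posts one NOP until the completion is processed. */
static void check_single_nop(void)
{
	struct wakeup_queue q;

	wakeup_queue_init(&q);
	for (int i = 0; i < 100; i++)
		request_wakeup(&q);
	assert(q.nops_posted == 1);
	nop_completed(&q);
	assert(request_wakeup(&q));
}
```

      Because at most one NOP is outstanding at a time, the queue occupancy
      from wakeups stays bounded no matter how often the application asks.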
    • P
      net/mlx5: CT: Change idr to xarray to protect parallel tuple id allocation · 70840b66
      Paul Blakey committed
      After allowing parallel tuple insertion, we get the following trace:
      
      [ 5505.142249] ------------[ cut here ]------------
      [ 5505.148155] WARNING: CPU: 21 PID: 13313 at lib/radix-tree.c:581 delete_node+0x16c/0x180
      [ 5505.295553] CPU: 21 PID: 13313 Comm: kworker/u50:22 Tainted: G           OE     5.6.0+ #78
      [ 5505.304824] Hardware name: Supermicro Super Server/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 5505.313740] Workqueue: nf_flow_table_offload flow_offload_work_handler [nf_flow_table]
      [ 5505.323257] RIP: 0010:delete_node+0x16c/0x180
      [ 5505.349862] RSP: 0018:ffffb19184eb7b30 EFLAGS: 00010282
      [ 5505.356785] RAX: 0000000000000000 RBX: ffff904ac95b86d8 RCX: ffff904b6f938838
      [ 5505.365190] RDX: 0000000000000000 RSI: ffff904ac954b908 RDI: ffff904ac954b920
      [ 5505.373628] RBP: ffff904b4ac13060 R08: 0000000000000001 R09: 0000000000000000
      [ 5505.382155] R10: 0000000000000000 R11: 0000000000000040 R12: 0000000000000000
      [ 5505.390527] R13: ffffb19184eb7bfc R14: ffff904b6bef5800 R15: ffff90482c1203c0
      [ 5505.399246] FS:  0000000000000000(0000) GS:ffff904c2fc80000(0000) knlGS:0000000000000000
      [ 5505.408621] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 5505.415739] CR2: 00007f5d27006010 CR3: 0000000058c10006 CR4: 00000000001626e0
      [ 5505.424547] Call Trace:
      [ 5505.428429]  idr_alloc_u32+0x7b/0xc0
      [ 5505.433803]  mlx5_tc_ct_entry_add_rule+0xbf/0x950 [mlx5_core]
      [ 5505.441354]  ? mlx5_fc_create+0x23c/0x370 [mlx5_core]
      [ 5505.448225]  mlx5_tc_ct_block_flow_offload+0x874/0x10b0 [mlx5_core]
      [ 5505.456278]  ? mlx5_tc_ct_block_flow_offload+0x63d/0x10b0 [mlx5_core]
      [ 5505.464532]  nf_flow_offload_tuple.isra.21+0xc5/0x140 [nf_flow_table]
      [ 5505.472286]  ? __kmalloc+0x217/0x2f0
      [ 5505.477093]  ? flow_rule_alloc+0x1c/0x30
      [ 5505.482117]  flow_offload_work_handler+0x1d0/0x290 [nf_flow_table]
      [ 5505.489674]  ? process_one_work+0x17c/0x580
      [ 5505.494922]  process_one_work+0x202/0x580
      [ 5505.500082]  ? process_one_work+0x17c/0x580
      [ 5505.505696]  worker_thread+0x4c/0x3f0
      [ 5505.510458]  kthread+0x103/0x140
      [ 5505.514989]  ? process_one_work+0x580/0x580
      [ 5505.520616]  ? kthread_bind+0x10/0x10
      [ 5505.525837]  ret_from_fork+0x3a/0x50
      [ 5505.570841] ---[ end trace 07995de9c56d6831 ]---
      
      This happens from parallel deletes/adds to idr, as idr isn't protected.
      Fix that by using xarray as the tuple_ids allocator instead of idr.
      
      Fixes: 7da182a9 ("netfilter: flowtable: Use work entry per offload command")
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      70840b66
    • N
      net/mlx5: Fix failing fw tracer allocation on s390 · a019b361
      Niklas Schnelle committed
      On s390 FORCE_MAX_ZONEORDER is 9 instead of 11, thus a larger kzalloc()
      allocation as done for the firmware tracer will always fail.
      
      Looking at mlx5_fw_tracer_save_trace(), it is actually the driver itself
      that copies the debug data into the trace array and there is no need for
      the allocation to be contiguous in physical memory. We can therefore use
      kvzalloc() instead of kzalloc() and get rid of the large contiguous
      allocation.
      
      Fixes: f53aaa31 ("net/mlx5: FW tracer, implement tracer logic")
      Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      a019b361
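      The arithmetic behind the s390 limit in this commit can be worked through
      as follows. The sketch assumes a 4 KiB page size and that the largest
      physically contiguous allocation is one block of order
      FORCE_MAX_ZONEORDER - 1; both are assumptions made for illustration, and
      the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define ASSUMED_PAGE_SIZE 4096UL

/* Largest contiguous block, assuming the top order is max_zone_order - 1. */
static size_t max_contiguous_bytes(unsigned int max_zone_order)
{
	return ASSUMED_PAGE_SIZE << (max_zone_order - 1);
}

/* Zone order 11 allows 4 MiB blocks; zone order 9 allows only 1 MiB. */
static void check_zone_order_limits(void)
{
	assert(max_contiguous_bytes(11) == 4UL * 1024 * 1024);
	assert(max_contiguous_bytes(9) == 1UL * 1024 * 1024);
}
```

      Under these assumptions, any request above 1 MiB cannot be served
      contiguously with a zone order of 9, while a kvzalloc()-style allocator
      can fall back to memory that is only virtually contiguous.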