1. 31 Mar, 2018 7 commits
    • D
      net/ipv6: Fix route leaking between VRFs · b6cdbc85
      David Ahern committed
      Donald reported that IPv6 route leaking between VRFs is not working.
      The root cause is the strict argument in the call to rt6_lookup when
      validating the nexthop spec.
      
      ip6_route_check_nh validates the gateway and device (if given) of a
      route spec. It in turn could call rt6_lookup (e.g., lookup in a given
      table did not succeed so it falls back to a full lookup) and if so
      sets the strict argument to 1. That means if the egress device is given,
      the route lookup needs to return a result with the same device. This
      strict requirement does not work with VRFs (IPv4 or IPv6) because the
      oif in the flow struct is overridden with the index of the VRF device
      to trigger a match on the l3mdev rule and force the lookup to its table.
      
      The right long term solution is to add an l3mdev index to the flow
      struct such that the oif is not overridden. That solution will not
      backport well, so this patch aims for a simpler solution to relax the
      strict argument if the route spec device is an l3mdev slave. As done
      in other places, use the FLOWI_FLAG_SKIP_NH_OIF to know that the
      RT6_LOOKUP_F_IFACE flag needs to be removed.
      
      Fixes: ca254490 ("net: Add VRF support to IPv6 stack")
      Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6cdbc85
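The relaxation described in this commit message can be sketched as a small flag manipulation. This is a minimal userspace illustration, not the kernel code: the `DEMO_` constants and the helper `demo_relax_strict` are assumptions named after the flags the message mentions, and the bit values are placeholders.

```c
/* Placeholder bit values for illustration only. The names mirror the flags
 * mentioned in the commit message, but these DEMO_ constants are assumptions. */
#define DEMO_RT6_LOOKUP_F_IFACE      0x1
#define DEMO_FLOWI_FLAG_SKIP_NH_OIF  0x4

/* Hypothetical helper: drop the strict "result must use the given egress
 * device" requirement when the flow is marked as skipping the nexthop oif
 * check, as is the case for an l3mdev slave. */
static int demo_relax_strict(int lookup_flags, int flowi_flags)
{
    if (flowi_flags & DEMO_FLOWI_FLAG_SKIP_NH_OIF)
        lookup_flags &= ~DEMO_RT6_LOOKUP_F_IFACE;
    return lookup_flags;
}
```

Other lookup flags pass through unchanged; only the interface-match bit is cleared, and only when the skip flag is set.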
    • D
      vrf: Fix use after free and double free in vrf_finish_output · 82dd0d2a
      David Ahern committed
      Miguel reported an skb use after free / double free in vrf_finish_output
      when neigh_output returns an error. The vrf driver should return after
      the call to neigh_output as it takes over the skb on error path as well.
      
      Patch is a simplified version of Miguel's patch which was written for 4.9,
      and updated to top of tree.
      
      Fixes: 8f58336d ("net: Add ethernet header for pass through VRF device")
      Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      82dd0d2a
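The ownership rule behind this fix (the callee takes over the buffer on the error path too, so the caller must return without touching it) can be sketched in plain C. Everything here is hypothetical: `demo_buf`, `demo_output_consuming` and `demo_finish_output` are stand-ins for illustration, not the vrf driver's functions.

```c
#include <stdlib.h>

struct demo_buf { int payload; };

static int demo_free_count; /* counts frees so double frees would be visible */

static void demo_buf_free(struct demo_buf *b)
{
    free(b);
    demo_free_count++;
}

/* Stand-in for an output routine that consumes the buffer in every case,
 * including when it reports an error. */
static int demo_output_consuming(struct demo_buf *b, int fail)
{
    demo_buf_free(b);
    return fail ? -1 : 0;
}

/* The caller hands the buffer over and returns immediately. It must not
 * free or read the buffer after the call, even when an error comes back. */
static int demo_finish_output(int fail)
{
    struct demo_buf *b = malloc(sizeof(*b));

    if (!b)
        return -1;
    b->payload = 0;
    return demo_output_consuming(b, fail);
}
```

With this shape each buffer is freed exactly once, whether or not the output call fails.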
    • D
      ipv6: sr: fix seg6 encap performances with TSO enabled · 5807b22c
      David Lebrun committed
      Enabling TSO can lead to abysmal performance when using seg6 in
      encap mode, such as with the ixgbe driver. This patch adds a call to
      iptunnel_handle_offloads() to remove the encapsulation bit if needed.
      
      Before:
      root@comp4-seg6bpf:~# iperf3 -c fc00::55
      Connecting to host fc00::55, port 5201
      [  4] local fc45::4 port 36592 connected to fc00::55 port 5201
      [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
      [  4]   0.00-1.00   sec   196 KBytes  1.60 Mbits/sec   47   6.66 KBytes
      [  4]   1.00-2.00   sec   304 KBytes  2.49 Mbits/sec  100   5.33 KBytes
      [  4]   2.00-3.00   sec   284 KBytes  2.32 Mbits/sec   92   5.33 KBytes
      
      After:
      root@comp4-seg6bpf:~# iperf3 -c fc00::55
      Connecting to host fc00::55, port 5201
      [  4] local fc45::4 port 43062 connected to fc00::55 port 5201
      [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
      [  4]   0.00-1.00   sec  1.03 GBytes  8.89 Gbits/sec    0    743 KBytes
      [  4]   1.00-2.00   sec  1.03 GBytes  8.87 Gbits/sec    0    743 KBytes
      [  4]   2.00-3.00   sec  1.03 GBytes  8.87 Gbits/sec    0    743 KBytes
      Reported-by: Tom Herbert <tom@quantonium.net>
      Fixes: 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
      Signed-off-by: David Lebrun <dlebrun@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5807b22c
    • T
      net/dim: Fix int overflow · f97c3dc3
      Tal Gilboa committed
      When calculating difference between samples, the values
      are multiplied by 100. Large values may cause int overflow
      when multiplied (usually on first iteration).
      Fixed by forcing 100 to be of type unsigned long.
      
      Fixes: 4c4dbb4a ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
      Signed-off-by: Tal Gilboa <talgi@mellanox.com>
      Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f97c3dc3
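The overflow described in this commit message can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the narrow version uses unsigned 32-bit arithmetic so that the wraparound is well defined in C (the original problem was a signed int overflow), and the wide version uses `uint64_t` where the commit forces `100` to `unsigned long`.

```c
#include <stdint.h>

/* Narrow version: the product diff * 100 is computed in 32 bits and wraps
 * modulo 2^32 for large differences. ref must be non-zero. */
static uint32_t demo_percent_diff_narrow(uint32_t val, uint32_t ref)
{
    uint32_t diff = val > ref ? val - ref : ref - val;

    return (diff * 100u) / ref;
}

/* Wide version: widening one operand before the multiplication keeps the
 * whole product, which is the idea behind the fix. ref must be non-zero. */
static uint64_t demo_percent_diff_wide(uint32_t val, uint32_t ref)
{
    uint32_t diff = val > ref ? val - ref : ref - val;

    return ((uint64_t)diff * 100u) / ref;
}
```

For small inputs both helpers agree; they diverge once the difference times 100 no longer fits in 32 bits, which is most likely on a first iteration where one sample is still near zero.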
    • D
      Merge branch 'vlan-fix' · 52a9692a
      David S. Miller committed
      Toshiaki Makita says:
      
      ====================
      Fix vlan tag handling for vlan packets without ethernet headers
      
      Eric Dumazet reported syzbot found a new bug which leads to underflow of
      size argument of memmove(), causing crash[1]. This can be triggered by tun
      devices.
      
      The underflow happened because skb_vlan_untag() did not expect vlan packets
      without ethernet headers, and tun can produce such packets.
      I also checked vlan_insert_inner_tag() and found a similar bug.
      
      This series fixes these problems.
      
      [1] https://marc.info/?l=linux-netdev&m=152221753920510&w=2
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52a9692a
    • T
      vlan: Fix vlan insertion for packets without ethernet header · c769accd
      Toshiaki Makita committed
      In some situations vlan packets do not have ethernet headers. One example
      is packets from tun devices. Users can specify vlan protocol in tun_pi
      field instead of IP protocol. When we have a vlan device with reorder_hdr
      disabled on top of the tun device, such packets from tun devices are
      untagged in skb_vlan_untag() and vlan headers will be inserted back in
      vlan_insert_inner_tag().
      
      vlan_insert_inner_tag() however did not expect packets without ethernet
      headers, so in such a case the size argument for memmove() underflowed.
      
      We don't need to copy headers for packets that have no headers preceding
      the vlan header, so skip memmove() in that case.
      Also don't write the vlan protocol in skb->data when it does not have
      enough room for it.
      
      Fixes: cbe7128c ("vlan: Fix out of order vlan headers with reorder header off")
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c769accd
    • T
      net: Fix untag for vlan packets without ethernet header · ae474573
      Toshiaki Makita committed
      In some situations vlan packets do not have ethernet headers. One example
      is packets from tun devices. Users can specify vlan protocol in tun_pi
      field instead of IP protocol, and skb_vlan_untag() attempts to untag such
      packets.
      
      skb_vlan_untag() (more precisely, skb_reorder_vlan_header() called by it)
      however did not expect packets without ethernet headers, so in such a case
      the size argument for memmove() underflowed and triggered a crash.
      
      ====
      BUG: unable to handle kernel paging request at ffff8801cccb8000
      IP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
      PGD 9cee067 P4D 9cee067 PUD 1d9401063 PMD 1cccb7063 PTE 2810100028101
      Oops: 000b [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 17663 Comm: syz-executor2 Not tainted 4.16.0-rc7+ #368
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
      RSP: 0018:ffff8801cc046e28 EFLAGS: 00010287
      RAX: ffff8801ccc244c4 RBX: fffffffffffffffe RCX: fffffffffff6c4c2
      RDX: fffffffffffffffe RSI: ffff8801cccb7ffc RDI: ffff8801cccb8000
      RBP: ffff8801cc046e48 R08: ffff8801ccc244be R09: ffffed0039984899
      R10: 0000000000000001 R11: ffffed0039984898 R12: ffff8801ccc244c4
      R13: ffff8801ccc244c0 R14: ffff8801d96b7c06 R15: ffff8801d96b7b40
      FS:  00007febd562d700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffff8801cccb8000 CR3: 00000001ccb2f006 CR4: 00000000001606e0
      DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
      Call Trace:
       memmove include/linux/string.h:360 [inline]
       skb_reorder_vlan_header net/core/skbuff.c:5031 [inline]
       skb_vlan_untag+0x470/0xc40 net/core/skbuff.c:5061
       __netif_receive_skb_core+0x119c/0x3460 net/core/dev.c:4460
       __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4627
       netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4701
       netif_receive_skb+0xae/0x390 net/core/dev.c:4725
       tun_rx_batched.isra.50+0x5ee/0x870 drivers/net/tun.c:1555
       tun_get_user+0x299e/0x3c20 drivers/net/tun.c:1962
       tun_chr_write_iter+0xb9/0x160 drivers/net/tun.c:1990
       call_write_iter include/linux/fs.h:1782 [inline]
       new_sync_write fs/read_write.c:469 [inline]
       __vfs_write+0x684/0x970 fs/read_write.c:482
       vfs_write+0x189/0x510 fs/read_write.c:544
       SYSC_write fs/read_write.c:589 [inline]
       SyS_write+0xef/0x220 fs/read_write.c:581
       do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7
      RIP: 0033:0x454879
      RSP: 002b:00007febd562cc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00007febd562d6d4 RCX: 0000000000454879
      RDX: 0000000000000157 RSI: 0000000020000180 RDI: 0000000000000014
      RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000006b0 R14: 00000000006fc120 R15: 0000000000000000
      Code: 90 90 90 90 90 90 90 48 89 f8 48 83 fa 20 0f 82 03 01 00 00 48 39 fe 7d 0f 49 89 f0 49 01 d0 49 39 f8 0f 8f 9f 00 00 00 48 89 d1 <f3> a4 c3 48 81 fa a8 02 00 00 72 05 40 38 fe 74 3b 48 83 ea 20
      RIP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 RSP: ffff8801cc046e28
      CR2: ffff8801cccb8000
      ====
      
      We don't need to copy headers for packets that have no headers preceding
      the vlan header, so skip memmove() in that case.
      
      Fixes: 4bbb3e0e ("net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off")
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae474573
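The size underflow described in this commit message can be shown with unsigned arithmetic in userspace. The sketch below is a simplified illustration, not the kernel function: the helper names and the `DEMO_` constants (4 bytes for a vlan tag, 2 bytes for the ethertype field) are assumptions chosen for the example. The RDX value in the trace above, fffffffffffffffe, is what -2 looks like as a 64-bit unsigned size.

```c
#include <stddef.h>

#define DEMO_VLAN_HLEN 4u /* bytes in a vlan tag (assumed for the sketch) */
#define DEMO_ETH_TLEN  2u /* bytes in the ethertype field (assumed) */

/* Unchecked length: wraps to a huge value when fewer than
 * DEMO_VLAN_HLEN + DEMO_ETH_TLEN bytes precede the payload. */
static size_t demo_move_len_unchecked(size_t mac_len)
{
    return mac_len - DEMO_VLAN_HLEN - DEMO_ETH_TLEN;
}

/* Checked length: returns 0 (nothing to move) when there are no headers
 * in front of the vlan header, so the caller can skip memmove(). */
static size_t demo_move_len_checked(size_t mac_len)
{
    if (mac_len <= DEMO_VLAN_HLEN + DEMO_ETH_TLEN)
        return 0;
    return mac_len - DEMO_VLAN_HLEN - DEMO_ETH_TLEN;
}
```

A caller would invoke memmove() only when the checked length is non-zero, which matches the "skip memmove() in that case" approach of the fix.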
  2. 30 Mar, 2018 4 commits
  3. 29 Mar, 2018 4 commits
  4. 28 Mar, 2018 5 commits
  5. 27 Mar, 2018 20 commits
    • U
      net/smc: use announced length in sock_recvmsg() · ab6f6dd1
      Ursula Braun committed
      Not every CLC proposal message needs the maximum buffer length.
      Due to the MSG_WAITALL flag, it is important to use the peeked
      real length when receiving the message.
      
      Fixes: d63d271c ("smc: switch to sock_recvmsg()")
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab6f6dd1
    • C
      llc: properly handle dev_queue_xmit() return value · b85ab56c
      Cong Wang committed
      llc_conn_send_pdu() pushes the skb into write queue and
      calls llc_conn_send_pdus() to flush them out. However, the
      status of dev_queue_xmit() is not returned to caller,
      in this case, llc_conn_state_process().
      
      llc_conn_state_process() needs to hold the skb regardless of
      success or failure, because it still uses it after that.
      Therefore we should hold the skb before dev_queue_xmit() when
      that skb is the one being processed by llc_conn_state_process().
      
      Other callers can just pass NULL and ignore the return value,
      as they do now.
      Reported-by: Noam Rathaus <noamr@beyondsecurity.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b85ab56c
    • D
      Merge tag 'mlx5-fixes-2018-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 2a7fdec9
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2018-03-23
      
      The following series includes fixes for mlx5 netdev and eswitch.
      
      v1->v2:
          - Fixed commit message quotation marks in patch #7
      
      For -stable v4.12
          ('net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path')
          ('net/mlx5e: Fix traffic being dropped on VF representor')
      
      For -stable v4.13
          ('net/mlx5e: Fix memory usage issues in offloading TC flows')
          ('net/mlx5e: Verify coalescing parameters in range')
      
      For -stable v4.14
          ('net/mlx5e: Don't override vport admin link state in switchdev mode')
      
      For -stable v4.15
          ('108b2b6d5c02 net/mlx5e: Sync netdev vxlan ports at open')
      
      Please pull and let me know if there's any problem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2a7fdec9
    • D
      strparser: Fix sign of err codes · cd00edc1
      Dave Watson committed
      strp_parser_err is called with a negative code everywhere, which then
      calls abort_parser with a negative code.  strp_msg_timeout calls
      abort_parser directly with a positive code.  Negate ETIMEDOUT
      to match signed-ness of other calls.
      
      The default abort_parser callback, strp_abort_strp, sets
      sk->sk_err to err.  Also negate the error here so sk_err always
      holds a positive value, as the rest of the net code expects.  Currently
      a negative sk_err can result in endless loops, or user code that
      thinks it actually sent/received err bytes.
      
      Found while testing net/tls_sw recv path.
      
      Fixes: 43a0c675 ("strparser: Stream parser for messages")
      Signed-off-by: Dave Watson <davejwatson@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cd00edc1
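The sign convention described in this commit message (callbacks receive a negative error code, while the socket error field always holds a positive value) can be sketched as follows. The struct and function names are hypothetical stand-ins, not the strparser API.

```c
#include <errno.h>

/* Stand-in for the socket field discussed above. */
struct demo_sock { int sk_err; };

/* Abort callback: by convention err arrives as a negative errno, and the
 * stored error is negated so that sk_err stays positive. */
static void demo_abort_parser(struct demo_sock *sk, int err)
{
    sk->sk_err = -err;
}

/* Timeout path: pass the negated code, matching every other caller. */
static void demo_msg_timeout(struct demo_sock *sk)
{
    demo_abort_parser(sk, -ETIMEDOUT);
}
```

Keeping every caller on the same sign means the negation happens in exactly one place, so code that reads the stored error never sees a negative value.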
    • C
      net sched actions: fix dumping which requires several messages to user space · 734549eb
      Craig Dillabaugh committed
      Fixes a bug in the tcf_dump_walker function that can cause some actions
      to not be reported when dumping a large number of actions. This issue
      became more aggravated when the cookies feature was added. In particular,
      this issue manifests when large cookie values are assigned to the
      actions and when enough actions are created that the resulting table
      must be dumped in multiple batches.
      
      The number of actions returned in each batch is limited by the total
      number of actions and the memory buffer size.  With small cookies
      the numeric limit is reached before the buffer size limit, which avoids
      the code path triggering this bug. When large cookies are used the buffer
      fills before the numeric limit, and the erroneous code path is hit.
      
      For example after creating 32 csum actions with the cookie
      aaaabbbbccccdddd
      
      $ tc actions ls action csum
      total acts 26
      
          action order 0: csum (tcp) action continue
          index 1 ref 1 bind 0
          cookie aaaabbbbccccdddd
      
          .....
      
          action order 25: csum (tcp) action continue
          index 26 ref 1 bind 0
          cookie aaaabbbbccccdddd
      total acts 6
      
          action order 0: csum (tcp) action continue
          index 28 ref 1 bind 0
          cookie aaaabbbbccccdddd
      
          ......
      
          action order 5: csum (tcp) action continue
          index 32 ref 1 bind 0
          cookie aaaabbbbccccdddd
      
      Note that the action with index 27 is omitted from the report.
      
      Fixes: 4b3550ef ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")
      Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      734549eb
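The general shape of a multi-batch dump, and how an entry can be lost at a batch boundary, can be sketched as below. This is a generic illustration and not the tcf_dump_walker code: `demo_dump_batch` and its parameters are hypothetical, and the sizes in the usage note are made up. The point it shows is that when an entry does not fit in the buffer, the next batch has to resume at that same entry rather than the one after it.

```c
#include <stddef.h>

/* Dump entries [start, n) for as long as they fit in buf_cap bytes.
 * Stores the number of entries dumped in *dumped and returns the index of
 * the first entry NOT dumped, which is where the next batch must resume.
 * An entry larger than buf_cap would never fit; callers must handle that. */
static size_t demo_dump_batch(const size_t *entry_size, size_t n,
                              size_t start, size_t buf_cap, size_t *dumped)
{
    size_t used = 0;
    size_t i;

    *dumped = 0;
    for (i = start; i < n; i++) {
        if (used + entry_size[i] > buf_cap)
            break; /* entry i did not fit: resume AT i, not at i + 1 */
        used += entry_size[i];
        (*dumped)++;
    }
    return i;
}
```

With 32 entries of 100 bytes each and a 2650-byte buffer, the first call dumps 26 entries and returns 26, and a second call starting from that return value dumps the remaining 6, so no entry is skipped.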
    • H
      r8169: fix setting driver_data after register_netdev · 19c9ea36
      Heiner Kallweit committed
      pci_set_drvdata() is called only after registering the net_device,
      therefore we could run into an NPE if one of the functions using
      driver_data is called before it's set.
      
      Fix this by calling pci_set_drvdata() before registering the
      net_device.
      
      This fix is a candidate for stable. As far as I can see, the
      bug was already present in kernel version 3.2, therefore
      I can't provide a reference to the commit that is fixed by it.
      
      The fix may need small adjustments per kernel version because,
      due to other changes, the label that is jumped to if
      register_netdev() fails has changed over time.
      Reported-by: David Miller <davem@davemloft.net>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      19c9ea36
    • E
      net: fix possible out-of-bound read in skb_network_protocol() · 1dfe82eb
      Eric Dumazet committed
      skb mac header is not necessarily set at the time skb_network_protocol()
      is called. Use skb->data instead.
      
      BUG: KASAN: slab-out-of-bounds in skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739
      Read of size 2 at addr ffff8801b3097a0b by task syz-executor5/14242
      
      CPU: 1 PID: 14242 Comm: syz-executor5 Not tainted 4.16.0-rc6+ #280
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x24d lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report+0x23c/0x360 mm/kasan/report.c:412
       __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:443
       skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739
       harmonize_features net/core/dev.c:2924 [inline]
       netif_skb_features+0x509/0x9b0 net/core/dev.c:3011
       validate_xmit_skb+0x81/0xb00 net/core/dev.c:3084
       validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3142
       packet_direct_xmit+0x117/0x790 net/packet/af_packet.c:256
       packet_snd net/packet/af_packet.c:2944 [inline]
       packet_sendmsg+0x3aed/0x60b0 net/packet/af_packet.c:2969
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:639
       ___sys_sendmsg+0x767/0x8b0 net/socket.c:2047
       __sys_sendmsg+0xe5/0x210 net/socket.c:2081
      
      Fixes: 19acc327 ("gso: Handle Trans-Ether-Bridging protocol in skb_network_protocol()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1dfe82eb
    • G
      net-usb: add qmi_wwan if on lte modem wistron neweb d18q1 · d4c4bc11
      Giuseppe Lippolis committed
      This modem is embedded in the dlink dwr-921 router.
      The OEM configuration states:
      
          T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
          D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
          P:  Vendor=1435 ProdID=0918 Rev= 2.32
          S:  Manufacturer=Android
          S:  Product=Android
          S:  SerialNumber=0123456789ABCDEF
          C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA
          I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
          E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
          E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
          E:  Ad=84(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
          E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
          E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
          E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
          E:  Ad=88(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
          E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
          E:  Ad=8a(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
          E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          I:* If#= 6 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=(none)
          E:  Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
          E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=125us
      
      Tested on the OpenWrt distribution.
      Signed-off-by: Giuseppe Lippolis <giu.lippolis@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d4c4bc11
    • D
      Merge tag 'batadv-net-for-davem-20180326' of git://git.open-mesh.org/linux-merge · d7785b59
      David S. Miller committed
      Simon Wunderlich says:
      
      ====================
      Here are some batman-adv bugfixes:
      
       - fix multicast-via-unicast transmissions for AP isolation and gateway
         extension, by Linus Luessing (2 patches)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d7785b59
    • S
      net/mlx5e: Sync netdev vxlan ports at open · a117f73d
      Shahar Klein committed
      When mlx5_core is loaded it is expected to sync ports
      with all vxlan devices so it can support vxlan encap/decap.
      This is done via udp_tunnel_get_rx_info(). Currently this
      call is set in mlx5e_nic_enable() and if the netdev is not in
      NETREG_REGISTERED state it will not be called.
      
      Normally on load the netdev state is not NETREG_REGISTERED
      so udp_tunnel_get_rx_info() will not be called.
      
      Move udp_tunnel_get_rx_info() to mlx5e_open() so that
      it is called on the netdev UP event and allows encap/decap.
      
      Fixes: 610e89e0 ("net/mlx5e: Don't sync netdev state when not registered")
      Signed-off-by: Shahar Klein <shahark@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      a117f73d
    • O
      net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path · 423c9db2
      Or Gerlitz committed
      Currently we use the global ipv6_stub var to access the ipv6 global
      nd table. This practice gets us into trouble when the stub is only partially
      set, e.g. when ipv6 is loaded under the disabled policy. In this case, as of commit
      343d60aa ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument")
      the stub is not null, but stub->nd_tbl is and we crash.
      
      As we can access the ipv6 nd_tbl directly, the fix is just to avoid the
      reference through the stub. There is one place in the code where we
      issue ipv6 route lookup and keep doing it through the stub, but that
      mentioned commit makes sure we get -EAFNOSUPPORT from the stack.
      
      Fixes: 232c0013 ("net/mlx5e: Add support to neighbour update flow")
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Aviv Heller <avivh@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      423c9db2
    • J
      net/mlx5e: Fix memory usage issues in offloading TC flows · af1607c3
      Jianbo Liu committed
      For NIC flows, the parsed attributes are not freed when we exit
      successfully from mlx5e_configure_flower().
      
      There is a possible double free for eswitch flows. If an error is returned
      from rhashtable_insert_fast(), the parse attrs will be freed in
      mlx5e_tc_del_flow(), but they will be freed again before exiting
      mlx5e_configure_flower().
      
      To fix both issues we do the following:
      (1) change the condition that determines whether to issue the free call to
          check if this flow is a NIC flow, or it does not have an encap action.
      (2) reorder the code such that the check and free calls are done
          before we attempt to add into the hash table.
      
      Fixes: 232c0013 ('net/mlx5e: Add support to neighbour update flow')
      Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      af1607c3
    • R
      net/mlx5e: Fix traffic being dropped on VF representor · 4246f698
      Roi Dayan committed
      Increase representor netdev RQ size to avoid dropped packets.
      The current size (two) is just too small to keep up with
      conventional slow path traffic patterns.
      Also match the SQ size to the RQ size.
      
      Fixes: cb67b832 ("net/mlx5e: Introduce SRIOV VF representors")
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      4246f698
    • M
      net/mlx5e: Verify coalescing parameters in range · b392a207
      Moshe Shemesh committed
      Add a check that coalescing parameters received through ethtool are within
      the range of values supported by the HW.
      The driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the
      user through ethtool. ethtool supports values of up to 32 bits for each.
      However, mlx5 modify cq limits the coalescing time parameter to 12 bits
      and the coalescing frames parameter to 16 bits.
      Return an out of range error if the user tries to set these parameters to
      higher values.
      
      Fixes: f62b8bb8 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      b392a207
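The range check described in this commit message can be sketched as a small validation helper. The 12-bit and 16-bit widths come from the message; the macro names, the helper `demo_check_coalesce`, and the choice of `-ERANGE` as the error code are assumptions made for this illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Largest values that fit in the field widths named in the commit message. */
#define DEMO_MAX_COAL_USECS  ((1u << 12) - 1) /* 12-bit time field: 4095 */
#define DEMO_MAX_COAL_FRAMES ((1u << 16) - 1) /* 16-bit frames field: 65535 */

/* Hypothetical validator: the inputs are 32-bit values as supplied by the
 * user; anything wider than the hardware field is rejected up front. */
static int demo_check_coalesce(uint32_t usecs, uint32_t frames)
{
    if (usecs > DEMO_MAX_COAL_USECS || frames > DEMO_MAX_COAL_FRAMES)
        return -ERANGE;
    return 0;
}
```

Rejecting the request avoids silently truncating a value such as 4096 microseconds to a much smaller one when it is written into the narrower field.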
    • O
      net/mlx5: Make eswitch support to depend on switchdev · f125376b
      Or Gerlitz committed
      Add a dependency on switchdev being configured, as any user-space control
      plane SW is expected to use the HW switchdev ID to locate the representors
      related to VFs of a certain PF and apply SW/offloaded switching on them.
      
      Fixes: e80541ec ('net/mlx5: Add CONFIG_MLX5_ESWITCH Kconfig')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      f125376b
    • O
      net/mlx5e: Use 32 bits to store VF representor SQ number · 5ecadff0
      Or Gerlitz committed
      SQ numbers are 32 bits and not 16 bits, hence it's wrong to use only 16 bits to
      store the number of the SQ for which we are going to set a steering rule. Fix that.
      
      Fixes: cb67b832 ('net/mlx5e: Introduce SRIOV VF representors')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      5ecadff0
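The effect of keeping a 32-bit number in a 16-bit field can be shown directly. This is a minimal sketch with hypothetical helper names, not the driver code.

```c
#include <stdint.h>

/* Storing through a 16-bit field keeps only the low 16 bits. */
static uint16_t demo_store_sqn_16(uint32_t sqn)
{
    return (uint16_t)sqn;
}

/* A 32-bit field preserves the full value. */
static uint32_t demo_store_sqn_32(uint32_t sqn)
{
    return sqn;
}
```

Any SQ number above 0xffff comes back changed from the 16-bit version, so a steering rule built from it would refer to a different queue number.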
    • J
      net/mlx5e: Don't override vport admin link state in switchdev mode · 84c9c8f2
      Jianbo Liu committed
      The original vport admin link state will be re-applied after returning
      to legacy mode, so it is not right to change the admin link state value
      when in switchdev mode.
      
      Use direct vport commands to alter logical vport state in netdev
      representor open/close flows rather than the administrative eswitch API.
      
      Fixes: 20a1ea67 ('net/mlx5e: Support VF vport link state control for SRIOV switchdev mode')
      Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      84c9c8f2
    • S
      net: dsa: mt7530: fix module autoloading for OF platform drivers · 3c82b372
      Sean Wang committed
      It's required to create a modules.alias via MODULE_DEVICE_TABLE helper
      for the OF platform driver. Otherwise, module autoloading cannot work.
      Signed-off-by: Sean Wang <sean.wang@mediatek.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c82b372
    • S
      net: dsa: mt7530: remove redundant MODULE_ALIAS entries · 1c82c9e1
      Sean Wang committed
      MODULE_ALIAS exports information to allow the module to be auto-loaded at
      boot for the drivers registered using legacy platform registration.
      
      However, the driver is currently only used on DT-only platforms, so
      MODULE_ALIAS is redundant and should be removed.
      Signed-off-by: Sean Wang <sean.wang@mediatek.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1c82c9e1
    • J
      vhost_net: add missing lock nesting notation · aaa3149b
      Jason Wang committed
      We try to hold TX virtqueue mutex in vhost_net_rx_peek_head_len()
      after RX virtqueue mutex is held in handle_rx(). This requires an
      appropriate lock nesting notation to calm down deadlock detector.
      
      Fixes: 03088137 ("vhost_net: basic polling support")
      Reported-by: syzbot+7f073540b1384a614e09@syzkaller.appspotmail.com
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aaa3149b