1. 01 Nov, 2019 3 commits
    • netdevsim: Fix use-after-free during device dismantle · 6d6f0383
      Committed by Ido Schimmel
      Commit da58f90f ("netdevsim: Add devlink-trap support") added
      delayed work to netdevsim that periodically iterates over the registered
      netdevsim ports and reports various packet traps via devlink.
      
      While the delayed work takes the 'port_list_lock' mutex to protect
      against concurrent addition / deletion of ports, during device creation
      / dismantle ports are added / deleted without this lock, which can
      result in a use-after-free [1].
      
      Fix this by making sure that the ports list is always modified under the
      lock.
      
      [1]
      [   59.205543] ==================================================================
      [   59.207748] BUG: KASAN: use-after-free in nsim_dev_trap_report_work+0xa67/0xad0
      [   59.210247] Read of size 8 at addr ffff8883cbdd3398 by task kworker/3:1/38
      [   59.212584]
      [   59.213148] CPU: 3 PID: 38 Comm: kworker/3:1 Not tainted 5.4.0-rc3-custom-16119-ge6abb5f0261e #2013
      [   59.215896] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
      [   59.218384] Workqueue: events nsim_dev_trap_report_work
      [   59.219428] Call Trace:
      [   59.219924]  dump_stack+0xa9/0x10e
      [   59.220623]  print_address_description.constprop.4+0x21/0x340
      [   59.221976]  ? vprintk_func+0x66/0x240
      [   59.222752]  __kasan_report.cold.8+0x78/0x91
      [   59.223602]  ? nsim_dev_trap_report_work+0xa67/0xad0
      [   59.224603]  kasan_report+0xe/0x20
      [   59.225296]  nsim_dev_trap_report_work+0xa67/0xad0
      [   59.226435]  ? rcu_read_lock_sched_held+0xaf/0xe0
      [   59.227512]  ? trace_event_raw_event_rcu_quiescent_state_report+0x360/0x360
      [   59.228851]  process_one_work+0x98f/0x1760
      [   59.229684]  ? pwq_dec_nr_in_flight+0x330/0x330
      [   59.230656]  worker_thread+0x91/0xc40
      [   59.231587]  ? process_one_work+0x1760/0x1760
      [   59.232451]  kthread+0x34a/0x410
      [   59.233104]  ? __kthread_queue_delayed_work+0x240/0x240
      [   59.234141]  ret_from_fork+0x3a/0x50
      [   59.234982]
      [   59.235371] Allocated by task 187:
      [   59.236189]  save_stack+0x19/0x80
      [   59.236853]  __kasan_kmalloc.constprop.5+0xc1/0xd0
      [   59.237822]  kmem_cache_alloc_trace+0x14c/0x380
      [   59.238769]  __nsim_dev_port_add+0xaf/0x5c0
      [   59.239627]  nsim_dev_probe+0x4fc/0x1140
      [   59.240550]  really_probe+0x264/0xc00
      [   59.241418]  driver_probe_device+0x208/0x2e0
      [   59.242255]  __device_attach_driver+0x215/0x2d0
      [   59.243150]  bus_for_each_drv+0x154/0x1d0
      [   59.243944]  __device_attach+0x1ba/0x2b0
      [   59.244923]  bus_probe_device+0x1dd/0x290
      [   59.245805]  device_add+0xbac/0x1550
      [   59.246528]  new_device_store+0x1f4/0x400
      [   59.247306]  bus_attr_store+0x7b/0xa0
      [   59.248047]  sysfs_kf_write+0x10f/0x170
      [   59.248941]  kernfs_fop_write+0x283/0x430
      [   59.249843]  __vfs_write+0x81/0x100
      [   59.250546]  vfs_write+0x1ce/0x510
      [   59.251190]  ksys_write+0x104/0x200
      [   59.251873]  do_syscall_64+0xa4/0x4e0
      [   59.252642]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [   59.253837]
      [   59.254203] Freed by task 187:
      [   59.254811]  save_stack+0x19/0x80
      [   59.255463]  __kasan_slab_free+0x125/0x170
      [   59.256265]  kfree+0x100/0x440
      [   59.256870]  nsim_dev_remove+0x98/0x100
      [   59.257651]  nsim_bus_remove+0x16/0x20
      [   59.258382]  device_release_driver_internal+0x20b/0x4d0
      [   59.259588]  bus_remove_device+0x2e9/0x5a0
      [   59.260551]  device_del+0x410/0xad0
      [   59.263777]  device_unregister+0x26/0xc0
      [   59.264616]  nsim_bus_dev_del+0x16/0x60
      [   59.265381]  del_device_store+0x2d6/0x3c0
      [   59.266295]  bus_attr_store+0x7b/0xa0
      [   59.267192]  sysfs_kf_write+0x10f/0x170
      [   59.267960]  kernfs_fop_write+0x283/0x430
      [   59.268800]  __vfs_write+0x81/0x100
      [   59.269551]  vfs_write+0x1ce/0x510
      [   59.270252]  ksys_write+0x104/0x200
      [   59.270910]  do_syscall_64+0xa4/0x4e0
      [   59.271680]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [   59.272812]
      [   59.273211] The buggy address belongs to the object at ffff8883cbdd3200
      [   59.273211]  which belongs to the cache kmalloc-512 of size 512
      [   59.275838] The buggy address is located 408 bytes inside of
      [   59.275838]  512-byte region [ffff8883cbdd3200, ffff8883cbdd3400)
      [   59.278151] The buggy address belongs to the page:
      [   59.279215] page:ffffea000f2f7400 refcount:1 mapcount:0 mapping:ffff8883ecc0ce00 index:0x0 compound_mapcount: 0
      [   59.281449] flags: 0x200000000010200(slab|head)
      [   59.282356] raw: 0200000000010200 ffffea000f2f3a08 ffffea000f2fd608 ffff8883ecc0ce00
      [   59.283949] raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
      [   59.285608] page dumped because: kasan: bad access detected
      [   59.286981]
      [   59.287337] Memory state around the buggy address:
      [   59.288310]  ffff8883cbdd3280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   59.289763]  ffff8883cbdd3300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   59.291452] >ffff8883cbdd3380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   59.292945]                             ^
      [   59.293815]  ffff8883cbdd3400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   59.295220]  ffff8883cbdd3480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   59.296872] ==================================================================
      
      Fixes: da58f90f ("netdevsim: Add devlink-trap support")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: syzbot+9ed8f68ab30761f3678e@syzkaller.appspotmail.com
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
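The locking rule described in this commit can be sketched in user space as follows. This is only an illustration: `toy_mutex`, `struct dev`, `struct port` and the helper names are hypothetical stand-ins, not the netdevsim code, and the "mutex" merely records whether it is held so that the rule can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel mutex: it only tracks whether it is held. */
struct toy_mutex { bool held; };

static void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = true; }
static void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = false; }

struct port { struct port *next; };

struct dev {
	struct toy_mutex port_list_lock;
	struct port *ports;             /* singly linked list of ports */
};

/* Caller must hold port_list_lock; the assert enforces the rule. */
static void port_add_locked(struct dev *d, struct port *p)
{
	assert(d->port_list_lock.held);
	p->next = d->ports;
	d->ports = p;
}

/* Creation path: take the lock even though no user can race yet. */
static void port_add(struct dev *d, struct port *p)
{
	toy_lock(&d->port_list_lock);
	port_add_locked(d, p);
	toy_unlock(&d->port_list_lock);
}

/* Dismantle path: unlink every port under the same lock. */
static int port_del_all(struct dev *d)
{
	int n = 0;

	toy_lock(&d->port_list_lock);
	while (d->ports) {
		d->ports = d->ports->next;
		n++;
	}
	toy_unlock(&d->port_list_lock);
	return n;
}
```

With every list modification funnelled through the lock, a periodic worker that takes the same lock can no longer observe a port that is being freed.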
    • rxrpc: Fix handling of last subpacket of jumbo packet · f9c32435
      Committed by David Howells
      When rxrpc_recvmsg_data() sets the return value to 1 because it's drained
      all the data for the last packet, it checks the last-packet flag on the
      whole packet - but this is wrong, since the last-packet flag is only set on
      the final subpacket of the last jumbo packet.  This means that a call that
      receives its last packet in a jumbo packet won't complete properly.
      
      Fix this by having rxrpc_locate_data() determine the last-packet state of
      the subpacket it's looking at and passing that back to the caller rather
      than having the caller look in the packet header.  The caller then needs to
      cache this in the rxrpc_call struct as rxrpc_locate_data() isn't then
      called again for this packet.
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Fixes: e2de6c40 ("rxrpc: Use info in skbuff instead of reparsing a jumbo packet")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
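A rough user-space model of the idea in this commit: the locate step decides whether the subpacket being looked at is the last data of the call, and the caller caches that answer. The structures and names below (`toy_packet`, `toy_call`, `toy_locate_data`) are assumptions made for illustration and do not reproduce the rxrpc code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a received packet; field names are illustrative only. */
struct toy_packet {
	unsigned int nr_subpackets; /* 1 for an ordinary packet, >1 for a jumbo */
	bool includes_last;         /* the final subpacket carried the last-packet flag */
};

struct toy_call {
	bool rx_pkt_last;           /* cached answer for the current subpacket */
};

/*
 * Report whether subpacket `ix` is the last data of the call.
 * Only the final subpacket of a packet can qualify.
 */
static bool toy_locate_data(const struct toy_packet *pkt, unsigned int ix)
{
	assert(pkt->nr_subpackets >= 1 && ix < pkt->nr_subpackets);
	if (ix < pkt->nr_subpackets - 1)
		return false;
	return pkt->includes_last;
}

/* The caller caches the result because the locate step is not repeated. */
static void toy_recv_subpacket(struct toy_call *call,
			       const struct toy_packet *pkt, unsigned int ix)
{
	call->rx_pkt_last = toy_locate_data(pkt, ix);
}
```

Checking a flag on the whole packet would give the same answer for every subpacket, which is the mistake the commit message describes.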
    • Merge tag 'mac80211-for-net-2019-10-31' of... · 5a7ec667
      Committed by David S. Miller
      Merge tag 'mac80211-for-net-2019-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      Just two fixes:
       * HT operation is not allowed on channel 14 (Japan only)
       * netlink policy for nexthop attribute was wrong
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 31 Oct, 2019 10 commits
    • Merge branch 'hv_netvsc-fix-error-handling-in-netvsc_attach-set_features' · 3da09663
      Committed by David S. Miller
      Haiyang Zhang says:
      
      ====================
      hv_netvsc: fix error handling in netvsc_attach/set_features
      
      The error handling code paths in these functions are not correct.
      This patch set fixes them.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • hv_netvsc: Fix error handling in netvsc_attach() · 719b85c3
      Committed by Haiyang Zhang
      If rndis_filter_open() fails, we need to remove the rndis device created
      in earlier steps, before returning an error code. Otherwise, the retry of
      netvsc_attach() from its callers will fail and hang.
      
      Fixes: 7b2ee50c ("hv_netvsc: common detach logic")
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
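The unwinding pattern described here (undo earlier steps before returning an error so a retry starts clean) might look like the sketch below. `toy_dev`, `toy_attach` and the `open_fails` switch are hypothetical; the real function performs more steps than this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_dev {
	bool rndis_added;   /* set by an earlier attach step */
	bool opened;
};

/* `open_fails` simulates a failure of the open step. */
static int toy_attach(struct toy_dev *d, bool open_fails)
{
	int ret;

	assert(!d->rndis_added);        /* attach must start from a detached state */
	d->rndis_added = true;          /* earlier step: create the rndis device */

	ret = open_fails ? -EIO : 0;    /* stand-in for the open call */
	if (ret)
		goto err_remove;

	d->opened = true;
	return 0;

err_remove:
	d->rndis_added = false;         /* undo the earlier step before returning */
	return ret;
}
```

Because the failed attempt leaves no half-created device behind, a caller can simply call the attach function again.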
    • hv_netvsc: Fix error handling in netvsc_set_features() · c4509a5a
      Committed by Haiyang Zhang
      When an error is returned by rndis_filter_set_offload_params(), we should
      still assign the unaffected features to ndev->features. Otherwise, these
      features will be missing.
      
      Fixes: d6792a5a ("hv_netvsc: Add handler for LRO setting change")
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
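One way to picture "still assign the unaffected features" is the bitmask sketch below. The flag values and `toy_set_features` are invented for illustration, and `hw_ret` stands in for the result of programming the offload; it is not the driver's code.

```c
#include <assert.h>

#define TOY_F_SG   0x1u
#define TOY_F_CSUM 0x2u
#define TOY_F_LRO  0x4u

struct toy_ndev { unsigned int features; };

/*
 * `hw_ret` stands in for the result of programming the LRO offload
 * (0 on success, negative on failure).
 */
static int toy_set_features(struct toy_ndev *ndev, unsigned int wanted, int hw_ret)
{
	unsigned int change;

	assert(ndev);
	change = wanted ^ ndev->features;
	if ((change & TOY_F_LRO) && hw_ret) {
		wanted ^= TOY_F_LRO;       /* undo only the bit that failed */
		ndev->features = wanted;   /* keep the unaffected changes */
		return hw_ret;
	}
	ndev->features = wanted;
	return 0;
}
```

In this model a failed LRO change is reported to the caller, but the other requested bits are still recorded on the device.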
    • cxgb4: fix panic when attaching to ULD fail · fc89cc35
      Committed by Vishal Kulkarni
      Release resources when attaching to ULD fails. Otherwise, a data
      mismatch is seen between LLD and ULD later on, which leads to a
      kernel panic when accessing resources that should not even
      exist in the first place.
      
      Fixes: 94cdb8bb ("cxgb4: Add support for dynamic allocation of resources for ULD")
      Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
      Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: annotate lockless accesses to sk->sk_napi_id · ee8d153d
      Committed by Eric Dumazet
      We already annotated most accesses to sk->sk_napi_id.
      
      We missed sk_mark_napi_id() and sk_mark_napi_id_once(),
      which might be called without the socket lock held in the UDP stack.
      
      KCSAN reported:
      BUG: KCSAN: data-race in udpv6_queue_rcv_one_skb / udpv6_queue_rcv_one_skb
      
      write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 0:
       sk_mark_napi_id include/net/busy_poll.h:125 [inline]
       __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline]
       udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672
       udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689
       udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832
       __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913
       udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015
       ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409
       ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
      
      write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 1:
       sk_mark_napi_id include/net/busy_poll.h:125 [inline]
       __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline]
       udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672
       udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689
       udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832
       __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913
       udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015
       ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409
       ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 10890 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: e68b6e50 ("udp: enable busy polling for all sockets")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: annotate accesses to sk->sk_incoming_cpu · 7170a977
      Committed by Eric Dumazet
      This socket field can be read and written by concurrent CPUs.
      
      Use READ_ONCE() and WRITE_ONCE() annotations to document this,
      and avoid some compiler 'optimizations'.
      
      KCSAN reported:
      
      BUG: KCSAN: data-race in tcp_v4_rcv / tcp_v4_rcv
      
      write to 0xffff88812220763c of 4 bytes by interrupt on cpu 0:
       sk_incoming_cpu_update include/net/sock.h:953 [inline]
       tcp_v4_rcv+0x1b3c/0x1bb0 net/ipv4/tcp_ipv4.c:1934
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
       do_softirq.part.0+0x6b/0x80 kernel/softirq.c:337
       do_softirq kernel/softirq.c:329 [inline]
       __local_bh_enable_ip+0x76/0x80 kernel/softirq.c:189
      
      read to 0xffff88812220763c of 4 bytes by interrupt on cpu 1:
       sk_incoming_cpu_update include/net/sock.h:952 [inline]
       tcp_v4_rcv+0x181a/0x1bb0 net/ipv4/tcp_ipv4.c:1934
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       run_ksoftirqd+0x46/0x60 kernel/softirq.c:603
       smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
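For readers unfamiliar with the annotations, here is a simplified user-space approximation. The two macros below are reduced stand-ins built on volatile accesses (they rely on the GCC/Clang `__typeof__` extension) and are not the kernel definitions; `toy_sock` and `toy_incoming_cpu_update` are hypothetical names.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros: force a single access. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct toy_sock {
	int sk_incoming_cpu;
};

/* Record the CPU that last handled the socket; may run concurrently. */
static void toy_incoming_cpu_update(struct toy_sock *sk, int cpu)
{
	assert(sk);
	/* Read exactly once, and only dirty the field when it changes. */
	if (READ_ONCE(sk->sk_incoming_cpu) != cpu)
		WRITE_ONCE(sk->sk_incoming_cpu, cpu);
}
```

The annotations do not add locking; they document that the race is intentional and keep the compiler from tearing, repeating or fusing the access.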
    • mlxsw: core: Unpublish devlink parameters during reload · b7265a0d
      Committed by Jiri Pirko
      The devlink parameter "acl_region_rehash_interval" is a runtime
      parameter whose value is stored in dynamically allocated memory. While
      reloading the driver, this memory is freed and then allocated again. A
      use-after-free might happen if someone tries to retrieve its value
      during this time frame.
      
      Since commit 070c63f2 ("net: devlink: allow to change namespaces
      during reload") the use-after-free can be reliably triggered when
      reloading the driver into a namespace, as after freeing the memory (via
      reload_down() callback) all the parameters are notified.
      
      Fix this by unpublishing and then re-publishing the parameters during
      reload.
      
      Fixes: 98bbf70c ("mlxsw: spectrum: add "acl_region_rehash_interval" devlink param")
      Fixes: 7c62cfb8 ("devlink: publish params only after driver init is done")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Optimize execution time for nvm attributes configuration. · c63b0968
      Committed by Sudarsana Reddy Kalluru
      The current implementation of nvm_attr configuration instructs the management
      FW to load/unload the nvm-cfg image for each user-provided attribute in
      the input file. This consumes a lot of cycles even for a few tens of
      attributes.
      This patch updates the implementation to perform a load/commit of the config
      for every 50 attributes. After loading the nvm-image, MFW expects the
      config to be committed within a predefined timer value (5 sec), so it is
      not possible to write a large number of attributes in a single load/commit
      window. The commits are therefore performed in chunks.
      
      Fixes: 0dabbe1b ("qed: Add driver API for flashing the config attributes.")
      Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
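The chunking described above can be modelled by counting load/commit windows. This is a sketch under assumptions: `toy_flash_attrs` is a hypothetical helper that only counts windows and performs no firmware communication.

```c
#include <assert.h>

/*
 * Count how many load/commit windows are needed to write `nr_attrs`
 * attributes when at most `per_window` attributes fit in one window.
 */
static unsigned int toy_flash_attrs(unsigned int nr_attrs, unsigned int per_window)
{
	unsigned int i, pending = 0, windows = 0;

	assert(per_window > 0);
	for (i = 0; i < nr_attrs; i++) {
		if (pending == 0)
			windows++;      /* "load" opens a new window */
		pending++;              /* write one attribute */
		if (pending == per_window)
			pending = 0;    /* "commit" closes the window */
	}
	/* A partially filled final window is committed after the loop. */
	return windows;
}
```

With a window size of 50, 120 attributes need three windows instead of the 120 load/unload cycles of a per-attribute scheme.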
    • vxlan: fix unexpected failure of vxlan_changelink() · c6761cf5
      Committed by Taehee Yoo
      After commit 0ce1822c ("vxlan: add adjacent link to limit depth
      level"), vxlan_changelink() could fail because of
      netdev_adjacent_change_prepare().
      netdev_adjacent_change_prepare() returns -EEXIST when the old lower device
      and the new lower device are the same.
      (The old lower device is "dst->remote_dev" and the new lower device is "lowerdev".)
      So, before calling it, lowerdev should be set to NULL if these devices are the same.
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add vxlan0 type vxlan dev dummy0 dstport 4789 vni 1
          ip link set vxlan0 type vxlan ttl 5
          RTNETLINK answers: File exists
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Fixes: 0ce1822c ("vxlan: add adjacent link to limit depth level")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
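The guard described in this commit can be sketched as below. `toy_adjacent_change_prepare` is a hypothetical stand-in that only models the -EEXIST behaviour mentioned in the message, and `toy_changelink` shows the NULL substitution; neither is the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_netdev { int ifindex; };

/* Hypothetical stand-in: refuses to link a lower device that is already linked. */
static int toy_adjacent_change_prepare(const struct toy_netdev *old_dev,
				       const struct toy_netdev *new_dev)
{
	if (new_dev && new_dev == old_dev)
		return -EEXIST;
	return 0;
}

/* `remote_dev` is the current lower device, `lowerdev` the requested one. */
static int toy_changelink(const struct toy_netdev *remote_dev,
			  const struct toy_netdev *lowerdev)
{
	if (lowerdev == remote_dev)
		lowerdev = NULL;        /* nothing changes: do not re-link */
	assert(lowerdev == NULL || lowerdev != remote_dev);
	return toy_adjacent_change_prepare(remote_dev, lowerdev);
}
```

In this model, changing an unrelated attribute such as the TTL no longer trips over the unchanged lower device.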
    • qed: fix spelling mistake "queuess" -> "queues" · dc99da4f
      Committed by Colin Ian King
      There is a spelling mistake in a DP_NOTICE message. Fix it.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 30 Oct, 2019 27 commits
    • nl80211: fix validation of mesh path nexthop · 1fab1b89
      Committed by Markus Theil
      Mesh path nexthop should be an Ethernet address, but the current validation
      checks against 4-byte integers.
      
      Cc: stable@vger.kernel.org
      Fixes: 2ec600d6 ("nl80211/cfg80211: support for mesh, sta dumping")
      Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
      Link: https://lore.kernel.org/r/20191029093003.10355-1-markus.theil@tu-ilmenau.de
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211: Disallow setting of HT for channel 14 · ec649fed
      Committed by Masashi Honma
      This patch disables setting of HT20 and more for channel 14 because
      the channel is only for IEEE 802.11b.
      
      The patch for net/wireless/util.c was unit-tested.
      
      The patch for net/wireless/chan.c was tested with iw command.
      
      Before this patch.
      $ sudo iw dev <ifname> set channel 14 HT20
      $
      
      After this patch.
      $ sudo iw dev <ifname> set channel 14 HT20
      kernel reports: invalid channel definition
      command failed: Invalid argument (-22)
      $
      Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
      Link: https://lore.kernel.org/r/20191021075045.2719-1-masashi.honma@gmail.com
      [clean up the code, use != instead of equivalent >]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
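The rule this commit enforces can be written as a small validity check. The enum and function names below are illustrative assumptions; the only fact taken as given is that channel 14 is centred on 2484 MHz and, per the message above, is limited to 802.11b operation.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative channel widths; not the nl80211 enum. */
enum toy_width { TOY_WIDTH_20_NOHT, TOY_WIDTH_20, TOY_WIDTH_40 };

#define TOY_CHANNEL_14_MHZ 2484u

/* Reject any HT width on channel 14, accept everything else. */
static bool toy_chandef_valid(unsigned int center_freq_mhz, enum toy_width width)
{
	assert(center_freq_mhz > 0);
	if (center_freq_mhz == TOY_CHANNEL_14_MHZ && width != TOY_WIDTH_20_NOHT)
		return false;
	return true;
}
```

A request such as "channel 14 HT20" fails this check, which corresponds to the "invalid channel definition" error shown in the message.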
    • Merge tag 'mlx5-fixes-2019-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 6f74a55d
      Committed by David S. Miller
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2019-10-24
      
      This series introduces misc fixes to mlx5 driver.
      
      v1->v2:
       - Dropped the kTLS counter documentation patch, Tariq will fix it and
         send it later.
       - Added a new fix for link speed mode reporting.
        ('net/mlx5e: Initialize link modes bitmap on stack')
      
      For -stable v4.14
        ('net/mlx5e: Fix handling of compressed CQEs in case of low NAPI budget')
      
      For -stable v4.19
        ('net/mlx5e: Fix ethtool self test: link speed')
      
      For -stable v5.2
        ('net/mlx5: Fix flow counter list auto bits struct')
        ('net/mlx5: Fix rtable reference leak')
      
      For -stable v5.3
        ('net/mlx5e: Remove incorrect match criteria assignment line')
        ('net/mlx5e: Determine source port properly for vlan push action')
        ('net/mlx5e: Initialize link modes bitmap on stack')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: fix a typo fbd -> fdb · 8b73018f
      Committed by Nikolay Aleksandrov
      A simple typo fix in the nl error message (fbd -> fdb).
      
      CC: David Ahern <dsahern@gmail.com>
      Fixes: 8c6e137f ("rtnetlink: Update rtnl_fdb_dump for strict data checking")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: fix refcounting for non-blocking connect() · 301428ea
      Committed by Ursula Braun
      If a nonblocking socket is immediately closed after connect(),
      the connect worker may not have started. This results in a refcount
      problem, since sock_hold() is called from the connect worker.
      This patch moves the sock_hold in front of the connect worker
      scheduling.
      
      Reported-by: syzbot+4c063e6dea39e4b79f29@syzkaller.appspotmail.com
      Fixes: 50717a37 ("net/smc: nonblocking connect rework")
      Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
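The ordering fix (take the reference before scheduling the worker, not inside it) can be modelled with a plain counter. Everything here is a hypothetical single-threaded model: `toy_sock` and its helpers only track the count and a "work queued" flag.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_sock {
	int refcnt;
	bool work_queued;
};

static void toy_hold(struct toy_sock *sk) { assert(sk->refcnt > 0); sk->refcnt++; }
static void toy_put(struct toy_sock *sk)  { assert(sk->refcnt > 0); sk->refcnt--; }

/* Non-blocking connect: pin the socket first, then queue the worker. */
static void toy_connect_nonblock(struct toy_sock *sk)
{
	toy_hold(sk);
	sk->work_queued = true;
}

/* The worker drops the reference taken on its behalf when it finishes. */
static void toy_connect_work(struct toy_sock *sk)
{
	sk->work_queued = false;
	toy_put(sk);
}

/* close() drops the caller's own reference. */
static void toy_close(struct toy_sock *sk)
{
	toy_put(sk);
}
```

If the hold were taken inside the worker instead, a close() that runs before the worker starts would drop the count to zero while work is still queued.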
    • bonding: fix using uninitialized mode_lock · ad9bd8da
      Committed by Taehee Yoo
      When a bonding interface is being created, it sets up its mode and options.
      At that moment, it uses mode_lock, so mode_lock should be initialized
      before that moment.
      
      rtnl_newlink()
      	rtnl_create_link()
      		alloc_netdev_mqs()
      			->setup() //bond_setup()
      	->newlink //bond_newlink
      		bond_changelink()
      		register_netdevice()
      			->ndo_init() //bond_init()
      
      After commit 089bca2c ("bonding: use dynamic lockdep key instead of
      subclass"), mode_lock is initialized in bond_init().
      So in bond_changelink(), an uninitialized mode_lock can be used.
      mode_lock should be initialized in bond_setup().
      This patch partially reverts commit 089bca2c ("bonding: use dynamic
      lockdep key instead of subclass").
      
      Test command:
          ip link add bond0 type bond mode 802.3ad lacp_rate 0
      
      Splat looks like:
      [   60.615127] INFO: trying to register non-static key.
      [   60.615900] the code is fine but needs lockdep annotation.
      [   60.616697] turning off the locking correctness validator.
      [   60.617490] CPU: 1 PID: 957 Comm: ip Not tainted 5.4.0-rc3+ #109
      [   60.618350] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   60.619481] Call Trace:
      [   60.619918]  dump_stack+0x7c/0xbb
      [   60.620453]  register_lock_class+0x1215/0x14d0
      [   60.621131]  ? alloc_netdev_mqs+0x7b3/0xcc0
      [   60.621771]  ? is_bpf_text_address+0x86/0xf0
      [   60.622416]  ? is_dynamic_key+0x230/0x230
      [   60.623032]  ? unwind_get_return_address+0x5f/0xa0
      [   60.623757]  ? create_prof_cpu_mask+0x20/0x20
      [   60.624408]  ? arch_stack_walk+0x83/0xb0
      [   60.625023]  __lock_acquire+0xd8/0x3de0
      [   60.625616]  ? stack_trace_save+0x82/0xb0
      [   60.626225]  ? stack_trace_consume_entry+0x160/0x160
      [   60.626957]  ? deactivate_slab.isra.80+0x2c5/0x800
      [   60.627668]  ? register_lock_class+0x14d0/0x14d0
      [   60.628380]  ? alloc_netdev_mqs+0x7b3/0xcc0
      [   60.629020]  ? save_stack+0x69/0x80
      [   60.629574]  ? save_stack+0x19/0x80
      [   60.630121]  ? __kasan_kmalloc.constprop.4+0xa0/0xd0
      [   60.630859]  ? __kmalloc_node+0x16f/0x480
      [   60.631472]  ? alloc_netdev_mqs+0x7b3/0xcc0
      [   60.632121]  ? rtnl_create_link+0x2ed/0xad0
      [   60.634388]  ? __rtnl_newlink+0xad4/0x11b0
      [   60.635024]  lock_acquire+0x164/0x3b0
      [   60.635608]  ? bond_3ad_update_lacp_rate+0x91/0x200 [bonding]
      [   60.636463]  _raw_spin_lock_bh+0x38/0x70
      [   60.637084]  ? bond_3ad_update_lacp_rate+0x91/0x200 [bonding]
      [   60.637930]  bond_3ad_update_lacp_rate+0x91/0x200 [bonding]
      [   60.638753]  ? bond_3ad_lacpdu_recv+0xb30/0xb30 [bonding]
      [   60.639552]  ? bond_opt_get_val+0x180/0x180 [bonding]
      [   60.640307]  ? ___slab_alloc+0x5aa/0x610
      [   60.640925]  bond_option_lacp_rate_set+0x71/0x140 [bonding]
      [   60.641751]  __bond_opt_set+0x1ff/0xbb0 [bonding]
      [   60.643217]  ? kasan_unpoison_shadow+0x30/0x40
      [   60.643924]  bond_changelink+0x9a4/0x1700 [bonding]
      [   60.644653]  ? memset+0x1f/0x40
      [   60.742941]  ? bond_slave_changelink+0x1a0/0x1a0 [bonding]
      [   60.752694]  ? alloc_netdev_mqs+0x8ea/0xcc0
      [   60.753330]  ? rtnl_create_link+0x2ed/0xad0
      [   60.753964]  bond_newlink+0x1e/0x60 [bonding]
      [   60.754612]  __rtnl_newlink+0xb9f/0x11b0
      [ ... ]
      
      Reported-by: syzbot+8da67f407bcba2c72e6e@syzkaller.appspotmail.com
      Reported-by: syzbot+0d083911ab18b710da71@syzkaller.appspotmail.com
      Fixes: 089bca2c ("bonding: use dynamic lockdep key instead of subclass")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fec_ptp: Use platform_get_irq_xxx_optional() to avoid error message · b86bcb29
      Committed by Anson Huang
      Use platform_get_irq_byname_optional() and platform_get_irq_optional()
      instead of platform_get_irq_byname() and platform_get_irq() for optional
      IRQs to avoid the error messages below during probe:
      
      [    0.795803] fec 30be0000.ethernet: IRQ pps not found
      [    0.800787] fec 30be0000.ethernet: IRQ index 3 not found
      Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
      Acked-by: Fugang Duan <fugang.duan@nxp.com>
      Reviewed-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fec_main: Use platform_get_irq_byname_optional() to avoid error message · 3b56be21
      Committed by Anson Huang
      Failing to get an IRQ by name is NOT fatal, as the driver will get the
      IRQ by index instead. Use platform_get_irq_byname_optional() instead
      of platform_get_irq_byname() to avoid the error messages below during
      probe:
      
      [    0.819312] fec 30be0000.ethernet: IRQ int0 not found
      [    0.824433] fec 30be0000.ethernet: IRQ int1 not found
      [    0.829539] fec 30be0000.ethernet: IRQ int2 not found
      Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
      Acked-by: Fugang Duan <fugang.duan@nxp.com>
      Reviewed-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: remove Dave Watson as TLS maintainer · f9f29338
      Committed by Jakub Kicinski
      Dave's Facebook email address is not working, and my attempts
      to contact him are failing. Let's remove it to trim down the
      list of TLS maintainers.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: check tun_info options_len properly · eadf52cf
      Committed by Xin Long
      This patch improves the tun_info options_len check by dropping
      the skb when TUNNEL_VXLAN_OPT is set but options_len is less
      than the size of vxlan_metadata. This avoids a potential
      out-of-bounds access on ip_tun_info.
      
      Fixes: ee122c79 ("vxlan: Flow based tunneling")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • erspan: fix the tun_info options_len check for erspan · 2eb8d6d2
      Committed by Xin Long
      The check for !md doesn't really work for ip_tunnel_info_opts(info), which
      only does info + 1. Also, to avoid out-of-bounds access on info, it should
      ensure options_len is not less than the size of erspan_metadata in both
      erspan_xmit() and ip6erspan_tunnel_xmit().
      
      Fixes: 1a66a836 ("gre: add collect_md mode to ERSPAN tunnel")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
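Why a NULL check cannot work on an "info + 1" pointer, and what a length check looks like instead, is sketched below. The structures are simplified, hypothetical stand-ins (`toy_tun_info`, `toy_md`), not the kernel's tunnel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_md { uint32_t version; uint32_t index; };

struct toy_tun_info {
	uint32_t flags;
	uint32_t options_len;   /* number of valid option bytes after the struct */
};

/* Options live directly behind the info struct, so this is never NULL. */
static void *toy_info_opts(struct toy_tun_info *info)
{
	return info + 1;
}

/* Returns the metadata, or NULL when the packet should be dropped. */
static struct toy_md *toy_get_md(struct toy_tun_info *info)
{
	assert(info);
	if (info->options_len < sizeof(struct toy_md))
		return NULL;    /* too short: reading it would go out of bounds */
	return toy_info_opts(info);
}
```

The pointer arithmetic always yields a non-NULL address, so only the length field can say whether metadata is really there.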
    • net: hisilicon: Fix ping latency when deal with high throughput · e56bd641
      Committed by Jiangfeng Xiao
      This is due to an error in over-budget processing.
      When dealing with high throughput, the used buffers
      that exceed the budget are not cleaned up. In addition,
      it takes a lot of cycles to clean up the used buffers,
      and only then can the buffer holding the valid data take effect.
      Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Dynamically set guaranteed amount of counters per VF · e19868ef
      Committed by Eran Ben Elisha
      Prior to this patch, the number of counters guaranteed per VF in the
      resource tracker was MLX4_VF_COUNTERS_PER_PORT * MLX4_MAX_PORTS. It was
      set regardless of whether the VF was single or dual port.
      This caused several VFs to have no guaranteed counters although the
      system could satisfy their request.
      
      The fix is to dynamically guarantee counters, based on each VF
      specification.
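
The difference between the two policies can be shown with a few lines of arithmetic. This is a sketch under assumptions: the constant values and the idea that the per-VF guarantee scales with the VF's port count are illustrative, and the `sketch_` names are hypothetical.

```c
/* Illustrative values only; assumed for the example. */
#define SKETCH_VF_COUNTERS_PER_PORT 2
#define SKETCH_MAX_PORTS 2

/* Old policy: every VF is guaranteed counters for all ports. */
int sketch_guaranteed_static(void)
{
	return SKETCH_VF_COUNTERS_PER_PORT * SKETCH_MAX_PORTS;
}

/* New policy (assumed shape): guarantee only for the ports the VF uses. */
int sketch_guaranteed_dynamic(int vf_num_ports)
{
	return SKETCH_VF_COUNTERS_PER_PORT * vf_num_ports;
}
```

With these numbers a single-port VF reserves half as many counters as before, which leaves more of the pool available for other VFs.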
      
      Fixes: 9de92c60 ("net/mlx4_core: Adjust counter grant policy in the resource tracker")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      net/mlx5e: Initialize on stack link modes bitmap · 926b37f7
      Aya Levin committed
      Initialize link modes bitmap on stack before using it, otherwise the
      outcome of ethtool set link ksettings might have unexpected values.
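
A minimal sketch of the pattern, assuming a hypothetical four-word bitmap and `sketch_` names that do not exist in the driver: the on-stack array is zeroed before any bit is set, so the result depends only on the bits set explicitly.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_WORDS 4
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Sets one bit in an on-stack bitmap and returns how many bits are set.
 * The caller must pass bit < SKETCH_WORDS * SKETCH_BITS_PER_WORD. */
int sketch_count_modes_after_set(unsigned int bit)
{
	/* The zero initializer is the point of the example: without it the
	 * array would hold whatever was on the stack. */
	unsigned long modes[SKETCH_WORDS] = { 0 };
	int count = 0;

	modes[bit / SKETCH_BITS_PER_WORD] |= 1UL << (bit % SKETCH_BITS_PER_WORD);

	for (size_t w = 0; w < SKETCH_WORDS; w++)
		for (size_t b = 0; b < SKETCH_BITS_PER_WORD; b++)
			count += (int)((modes[w] >> b) & 1UL);

	return count;
}
```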
      
      Fixes: 4b95840a ("net/mlx5e: Fix matching of speed to PRM link modes")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • A
      net/mlx5e: Fix ethtool self test: link speed · 534e7366
      Aya Levin committed
      Ethtool self test contains a test for link speed. This test reads the
      PTYS register and determines whether the current speed is valid or not.
      Change the current implementation to use the function mlx5e_port_linkspeed(),
      which does the same check and fails when the speed is invalid. This code
      redundancy led to a bug when mlx5e_port_linkspeed() was updated with
      extended speeds and the self test was not.
      
      Fixes: 2c81bfd5 ("net/mlx5e: Move port speed code from en_ethtool.c to en/port.c")
      Signed-off-by: Aya Levin <ayal@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • M
      net/mlx5e: Fix handling of compressed CQEs in case of low NAPI budget · 9df86bdb
      Maxim Mikityanskiy committed
      When CQE compression is enabled, compressed CQEs use the following
      structure: a title is followed by one or many blocks, each containing 8
      mini CQEs (except the last, which may contain fewer mini CQEs).
      
      Due to NAPI budget restriction, a complete structure is not always
      parsed in one NAPI run, and some blocks with mini CQEs may be deferred
      to the next NAPI poll call - we have the mlx5e_decompress_cqes_cont call
      in the beginning of mlx5e_poll_rx_cq. However, if the budget is
      extremely low, some blocks may be left even after that, but the code
      that follows the mlx5e_decompress_cqes_cont call doesn't check it and
      assumes that a new CQE begins, which may not be the case. In such cases,
      random memory corruptions occur.
      
      An extremely low NAPI budget of 8 is used when busy_poll or busy_read is
      active.
      
      This commit adds a check to make sure that the previous compressed CQE
      has been completely parsed after mlx5e_decompress_cqes_cont, otherwise
      it prevents a new CQE from being fetched in the middle of a compressed
      CQE.
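
The control flow of the added check can be modeled with a toy poll loop. This is a simplified sketch, not the mlx5e code: the struct, its fields, and the `sketch_` functions are hypothetical, and the queue is reduced to two counters.

```c
/* Toy model of a completion queue; both fields are assumptions. */
struct sketch_cq {
	int mini_left;	/* mini CQEs still pending from a compressed CQE */
	int fresh;	/* new CQEs waiting behind the compressed one */
};

/* Consume up to budget pending mini CQEs; returns how many were handled. */
int sketch_decompress_cont(struct sketch_cq *cq, int budget)
{
	int n = cq->mini_left < budget ? cq->mini_left : budget;

	cq->mini_left -= n;
	return n;
}

int sketch_poll_rx_cq(struct sketch_cq *cq, int budget)
{
	int work = 0;

	if (cq->mini_left) {
		work += sketch_decompress_cont(cq, budget);
		/* The guard: the compressed CQE is still not fully parsed,
		 * so do not fetch a new CQE from the middle of it. */
		if (cq->mini_left)
			return work;
	}

	while (work < budget && cq->fresh > 0) {
		cq->fresh--;
		work++;
	}
	return work;
}
```

With a budget of 8 and 20 pending mini CQEs, the first two polls return early without touching the fresh CQEs, and only the third poll moves on to them.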
      
      This commit fixes random crashes in __build_skb, __page_pool_put_page
      and other not directly related places, which used to happen when both CQE
      compression and busy_poll/busy_read were enabled.
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • V
      net/mlx5e: Don't store direct pointer to action's tunnel info · 2a4b6526
      Vlad Buslov committed
      The Geneve implementation changed mlx5 tc to use a direct pointer to the
      tunnel_key action's internal struct ip_tunnel_info instance. However, this
      leads to a use-after-free error when the initial filter that caused creation
      of a new encap entry is deleted, or when the tunnel_key action is manually
      overwritten through the action API. Moreover, with the recent TC offloads API
      unlocking change, struct flow_action_entry->tunnel points to a temporary copy
      of the tunnel info that is deallocated after the filter is offloaded to
      hardware. This makes the bug reproduce every time a new filter is attached to
      an existing encap entry, with the following KASAN report:
      
      [  314.885555] ==================================================================
      [  314.886641] BUG: KASAN: use-after-free in memcmp+0x2c/0x60
      [  314.886864] Read of size 1 at addr ffff88886c746280 by task tc/2682
      
      [  314.887179] CPU: 22 PID: 2682 Comm: tc Not tainted 5.3.0-rc7+ #703
      [  314.887188] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [  314.887195] Call Trace:
      [  314.887215]  dump_stack+0x9a/0xf0
      [  314.887236]  print_address_description+0x67/0x323
      [  314.887248]  ? memcmp+0x2c/0x60
      [  314.887257]  ? memcmp+0x2c/0x60
      [  314.887272]  __kasan_report.cold+0x1a/0x3d
      [  314.887474]  ? __mlx5e_tc_del_fdb_peer_flow+0x100/0x1b0 [mlx5_core]
      [  314.887484]  ? memcmp+0x2c/0x60
      [  314.887509]  kasan_report+0xe/0x12
      [  314.887521]  memcmp+0x2c/0x60
      [  314.887662]  mlx5e_tc_add_fdb_flow+0x51b/0xbe0 [mlx5_core]
      [  314.887838]  ? mlx5e_encap_take+0x110/0x110 [mlx5_core]
      [  314.887902]  ? lockdep_init_map+0x87/0x2c0
      [  314.887924]  ? __init_waitqueue_head+0x4f/0x60
      [  314.888062]  ? mlx5e_alloc_flow.isra.0+0x18c/0x1c0 [mlx5_core]
      [  314.888207]  __mlx5e_add_fdb_flow+0x2d7/0x440 [mlx5_core]
      [  314.888359]  ? mlx5e_tc_update_neigh_used_value+0x6f0/0x6f0 [mlx5_core]
      [  314.888374]  ? match_held_lock+0x2e/0x240
      [  314.888537]  mlx5e_configure_flower+0x830/0x16a0 [mlx5_core]
      [  314.888702]  ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core]
      [  314.888713]  ? down_read+0x118/0x2c0
      [  314.888728]  ? down_read_killable+0x300/0x300
      [  314.888882]  ? mlx5e_rep_get_ethtool_stats+0x180/0x180 [mlx5_core]
      [  314.888899]  tc_setup_cb_add+0x127/0x270
      [  314.888937]  fl_hw_replace_filter+0x2ac/0x380 [cls_flower]
      [  314.888976]  ? fl_hw_destroy_filter+0x1b0/0x1b0 [cls_flower]
      [  314.888990]  ? fl_change+0xbcf/0x27ef [cls_flower]
      [  314.889030]  ? fl_change+0xa57/0x27ef [cls_flower]
      [  314.889069]  fl_change+0x16bd/0x27ef [cls_flower]
      [  314.889135]  ? __rhashtable_insert_fast.constprop.0+0xa00/0xa00 [cls_flower]
      [  314.889167]  ? __radix_tree_lookup+0xa4/0x130
      [  314.889200]  ? fl_get+0x169/0x240 [cls_flower]
      [  314.889218]  ? fl_walk+0x230/0x230 [cls_flower]
      [  314.889249]  tc_new_tfilter+0x5e1/0xd40
      [  314.889281]  ? __rhashtable_insert_fast.constprop.0+0xa00/0xa00 [cls_flower]
      [  314.889309]  ? tc_del_tfilter+0xa30/0xa30
      [  314.889335]  ? __lock_acquire+0x5b5/0x2460
      [  314.889378]  ? find_held_lock+0x85/0xa0
      [  314.889442]  ? tc_del_tfilter+0xa30/0xa30
      [  314.889465]  rtnetlink_rcv_msg+0x4ab/0x5f0
      [  314.889488]  ? rtnl_dellink+0x490/0x490
      [  314.889518]  ? lockdep_hardirqs_on+0x260/0x260
      [  314.889538]  ? netlink_deliver_tap+0xab/0x5a0
      [  314.889550]  ? match_held_lock+0x1b/0x240
      [  314.889575]  netlink_rcv_skb+0xd0/0x200
      [  314.889588]  ? rtnl_dellink+0x490/0x490
      [  314.889605]  ? netlink_ack+0x440/0x440
      [  314.889635]  ? netlink_deliver_tap+0x161/0x5a0
      [  314.889648]  ? lock_downgrade+0x360/0x360
      [  314.889657]  ? lock_acquire+0xe5/0x210
      [  314.889686]  netlink_unicast+0x296/0x350
      [  314.889707]  ? netlink_attachskb+0x390/0x390
      [  314.889726]  ? _copy_from_iter_full+0xe0/0x3a0
      [  314.889738]  ? __virt_addr_valid+0xbb/0x130
      [  314.889771]  netlink_sendmsg+0x394/0x600
      [  314.889800]  ? netlink_unicast+0x350/0x350
      [  314.889817]  ? move_addr_to_kernel.part.0+0x90/0x90
      [  314.889852]  ? netlink_unicast+0x350/0x350
      [  314.889872]  sock_sendmsg+0x96/0xa0
      [  314.889891]  ___sys_sendmsg+0x482/0x520
      [  314.889919]  ? copy_msghdr_from_user+0x250/0x250
      [  314.889930]  ? __fput+0x1fa/0x390
      [  314.889941]  ? task_work_run+0xb7/0xf0
      [  314.889957]  ? exit_to_usermode_loop+0x117/0x120
      [  314.889972]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  314.889982]  ? do_syscall_64+0x74/0xe0
      [  314.889992]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  314.890012]  ? mark_lock+0xac/0x9a0
      [  314.890028]  ? __lock_acquire+0x5b5/0x2460
      [  314.890053]  ? mark_lock+0xac/0x9a0
      [  314.890083]  ? __lock_acquire+0x5b5/0x2460
      [  314.890112]  ? match_held_lock+0x1b/0x240
      [  314.890144]  ? __fget_light+0xa1/0xf0
      [  314.890166]  ? sockfd_lookup_light+0x91/0xb0
      [  314.890187]  __sys_sendmsg+0xba/0x130
      [  314.890201]  ? __sys_sendmsg_sock+0xb0/0xb0
      [  314.890225]  ? __blkcg_punt_bio_submit+0xd0/0xd0
      [  314.890264]  ? lockdep_hardirqs_off+0xbe/0x100
      [  314.890274]  ? mark_held_locks+0x24/0x90
      [  314.890286]  ? do_syscall_64+0x1e/0xe0
      [  314.890308]  do_syscall_64+0x74/0xe0
      [  314.890325]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  314.890336] RIP: 0033:0x7f00ca33d7b8
      [  314.890348] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [  314.890356] RSP: 002b:00007ffea2983928 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  314.890369] RAX: ffffffffffffffda RBX: 000000005d777d5b RCX: 00007f00ca33d7b8
      [  314.890377] RDX: 0000000000000000 RSI: 00007ffea2983990 RDI: 0000000000000003
      [  314.890384] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
      [  314.890392] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [  314.890400] R13: 000000000047f640 R14: 00007ffea2987b58 R15: 0000000000000021
      
      [  314.890529] Allocated by task 2687:
      [  314.890684]  save_stack+0x1b/0x80
      [  314.890694]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [  314.890705]  __kmalloc_track_caller+0x102/0x340
      [  314.890721]  kmemdup+0x1d/0x40
      [  314.890730]  tc_setup_flow_action+0x731/0x2c27
      [  314.890743]  fl_hw_replace_filter+0x23b/0x380 [cls_flower]
      [  314.890756]  fl_change+0x16bd/0x27ef [cls_flower]
      [  314.890765]  tc_new_tfilter+0x5e1/0xd40
      [  314.890776]  rtnetlink_rcv_msg+0x4ab/0x5f0
      [  314.890786]  netlink_rcv_skb+0xd0/0x200
      [  314.890796]  netlink_unicast+0x296/0x350
      [  314.890805]  netlink_sendmsg+0x394/0x600
      [  314.890815]  sock_sendmsg+0x96/0xa0
      [  314.890825]  ___sys_sendmsg+0x482/0x520
      [  314.890834]  __sys_sendmsg+0xba/0x130
      [  314.890844]  do_syscall_64+0x74/0xe0
      [  314.890854]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [  314.890937] Freed by task 2687:
      [  314.891076]  save_stack+0x1b/0x80
      [  314.891086]  __kasan_slab_free+0x12c/0x170
      [  314.891095]  kfree+0xeb/0x2f0
      [  314.891106]  tc_cleanup_flow_action+0x69/0xa0
      [  314.891119]  fl_hw_replace_filter+0x2c5/0x380 [cls_flower]
      [  314.891132]  fl_change+0x16bd/0x27ef [cls_flower]
      [  314.891140]  tc_new_tfilter+0x5e1/0xd40
      [  314.891151]  rtnetlink_rcv_msg+0x4ab/0x5f0
      [  314.891161]  netlink_rcv_skb+0xd0/0x200
      [  314.891170]  netlink_unicast+0x296/0x350
      [  314.891180]  netlink_sendmsg+0x394/0x600
      [  314.891190]  sock_sendmsg+0x96/0xa0
      [  314.891200]  ___sys_sendmsg+0x482/0x520
      [  314.891208]  __sys_sendmsg+0xba/0x130
      [  314.891218]  do_syscall_64+0x74/0xe0
      [  314.891228]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [  314.891315] The buggy address belongs to the object at ffff88886c746280
                      which belongs to the cache kmalloc-96 of size 96
      [  314.891762] The buggy address is located 0 bytes inside of
                      96-byte region [ffff88886c746280, ffff88886c7462e0)
      [  314.892196] The buggy address belongs to the page:
      [  314.892387] page:ffffea0021b1d180 refcount:1 mapcount:0 mapping:ffff88835d00ef80 index:0x0
      [  314.892398] flags: 0x57ffffc0000200(slab)
      [  314.892413] raw: 0057ffffc0000200 ffffea00219e0340 0000000800000008 ffff88835d00ef80
      [  314.892423] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
      [  314.892430] page dumped because: kasan: bad access detected
      
      [  314.892515] Memory state around the buggy address:
      [  314.892707]  ffff88886c746180: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.892976]  ffff88886c746200: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.893251] >ffff88886c746280: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.893522]                    ^
      [  314.893657]  ffff88886c746300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  314.893924]  ffff88886c746380: 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc
      [  314.894189] ==================================================================
      
      Fix the issue by duplicating the tunnel info into a per-encap copy that is
      deallocated together with the encap structure. Also, duplicate the tunnel
      info in the flow parse attribute to support cases where the flow might be
      attached asynchronously.
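
The ownership change can be sketched in userspace C: instead of keeping a pointer to memory owned by someone else, the encap entry keeps its own copy and frees it itself. The `sketch_` structs and functions are hypothetical, and `malloc`/`memcpy` stand in for the kernel's allocation helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the tunnel info and the encap entry. */
struct sketch_tun_info {
	unsigned char key[16];
};

struct sketch_encap_entry {
	struct sketch_tun_info *tun_info;	/* owned by the entry */
};

/* Store a private copy of src, so the entry stays valid after the
 * caller's (possibly temporary) tunnel info is freed. */
int sketch_encap_init(struct sketch_encap_entry *e,
		      const struct sketch_tun_info *src)
{
	e->tun_info = malloc(sizeof(*src));
	if (!e->tun_info)
		return -1;
	memcpy(e->tun_info, src, sizeof(*src));
	return 0;
}

/* The copy is released together with the entry. */
void sketch_encap_free(struct sketch_encap_entry *e)
{
	free(e->tun_info);
	e->tun_info = NULL;
}
```

The copy costs one allocation per encap entry, in exchange for the entry's lifetime no longer depending on the filter or action that created it.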
      
      Fixes: 1f6da306 ("net/mlx5e: Geneve, Keep tunnel info as pointer to the original struct")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • E
      net/mlx5: Fix NULL pointer dereference in extended destination · 0fd79b1e
      Eli Britstein committed
      The cited commit refactored the encap id into a struct pointed to from the
      destination.
      Fix a NULL pointer dereference in the case where one of the destinations
      has no encap.
      
      Fixes: 2b688ea5 ("net/mlx5: Add flow steering actions to fs_cmd shim layer")
      Signed-off-by: Eli Britstein <elibr@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • P
      net/mlx5: Fix rtable reference leak · 2347cee8
      Parav Pandit committed
      If the rt entry gateway family is not AF_INET for a multipath device,
      the rtable reference is leaked.
      Hence, fix it by releasing the reference.
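
The leak pattern and its fix can be modeled with a plain reference counter. This is a sketch under assumptions: the struct, the `sketch_` names, and the choice of `-ENETUNREACH` as the error code are illustrative and not taken from the commit text.

```c
#include <errno.h>

#define SKETCH_AF_INET 2

/* Hypothetical route entry with a manual reference count. */
struct sketch_rtable {
	int refcnt;
	int gw_family;
};

/* On success the caller owns one reference; on failure none is held. */
int sketch_route_lookup(struct sketch_rtable *rt, int multipath)
{
	rt->refcnt++;	/* the lookup takes a reference */

	if (multipath && rt->gw_family != SKETCH_AF_INET) {
		rt->refcnt--;	/* the fix: drop it before bailing out */
		return -ENETUNREACH;
	}
	return 0;
}
```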
      
      Fixes: 5fb091e8 ("net/mlx5e: Use hint to resolve route when in HW multipath mode")
      Fixes: e32ee6c7 ("net/mlx5e: Support tunnel encap over tagged Ethernet")
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • V
      net/mlx5e: Only skip encap flows update when encap init failed · 64d7b685
      Vlad Buslov committed
      When encap entry initialization completes successfully, e->compl_result is
      set to a positive value and not zero, as mlx5e_rep_update_flows() currently
      assumes. Fix the conditional to only skip the encap flows update when
      e->compl_result < 0.
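
The two conditionals can be compared side by side. The old check is modeled here as "skip on any nonzero value", which is an assumption drawn from the description; the `sketch_` function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed convention, taken from the description: a negative value means
 * the encap init failed, a positive value means it completed successfully. */

/* Old behaviour (as modeled): any nonzero result skips the update, so a
 * successful init is skipped as well. */
bool sketch_skip_update_old(int compl_result)
{
	return compl_result != 0;
}

/* Fixed behaviour: skip only when the init failed. */
bool sketch_skip_update_new(int compl_result)
{
	return compl_result < 0;
}
```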
      
      Fixes: 2a1f1768 ("net/mlx5e: Refactor neigh update for concurrent execution")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • M
      net/mlx5e: Replace kfree with kvfree when free vhca stats · 5dfb6335
      Maor Gottlieb committed
      Memory allocated by kvzalloc should be freed by kvfree.
      
      Fixes: cef35af3 ("net/mlx5e: Add mlx5e HV VHCA stats agent")
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • D
      net/mlx5e: Remove incorrect match criteria assignment line · 752d3dc0
      Dmytro Linkin committed
      The driver has a function that enables match criteria for misc parameters
      depending on eswitch capabilities.
      
      Fixes: 4f5d1bea ("Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux")
      Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
      Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • D
      net/mlx5e: Determine source port properly for vlan push action · d5dbcc4e
      Dmytro Linkin committed
      Termination tables are used for vlan push actions on uplink ports.
      To support RoCE dual port the source port value was placed in a register.
      Fix the code to use an API method returning the source port according to
      the FW capabilities.
      
      Fixes: 10caabda ("net/mlx5e: Use termination table for VLAN push actions")
      Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
      Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • R
      net/mlx5: Fix flow counter list auto bits struct · 6dfef396
      Roi Dayan committed
      The union should contain the extended dest and counter list.
      Remove the reserved 0x40 bits, which are redundant.
      This change doesn't break any functionality.
      Everything works today because the code in fs_cmd.c uses
      the correct structs for either the extended dest or the basic dest.
      
      Fixes: 1b115498 ("net/mlx5: Introduce extended destination fields")
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • D
      Merge branch 'VLAN-fixes-for-Ocelot-switch' · c1b5ddc1
      David S. Miller committed
      Vladimir Oltean says:
      
      ====================
      VLAN fixes for Ocelot switch
      
      This series addresses 2 issues with vlan_filtering=1:
      - Untagged traffic gets dropped unless commands are run in a very
        specific order.
      - Untagged traffic starts being transmitted as tagged after adding
        another untagged VID on the port.
      
      Tested on NXP LS1028A-RDB board.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • V
      net: mscc: ocelot: refuse to overwrite the port's native vlan · b9cd75e6
      Vladimir Oltean committed
      The switch driver keeps a "vid" variable per port, which signifies _the_
      VLAN ID that is stripped on that port's egress (aka the native VLAN on a
      trunk port).
      
      That is the way the hardware is designed (mostly). The port->vid is
      programmed into REW:PORT:PORT_VLAN_CFG:PORT_VID and the rewriter is told
      to send all traffic as tagged except the one having port->vid.
      
      There exists a possibility of finer-grained egress untagging decisions:
      using the VCAP IS1 engine, one rule can be added to match every
      VLAN-tagged frame whose VLAN should be untagged, and set POP_CNT=1 as
      action. However, the IS1 can hold at most 512 entries, and the VLANs are
      in the order of 6 * 4096.
      
      So the code is fine for now. But this sequence of commands:
      
      $ bridge vlan add dev swp0 vid 1 pvid untagged
      $ bridge vlan add dev swp0 vid 2 untagged
      
      makes untagged and pvid-tagged traffic be sent out of swp0 as tagged
      with VID 1, despite user's request.
      
      Prevent that from happening. The user should temporarily remove the
      existing untagged VLAN (1 in this case), add it back as tagged, and then
      add the new untagged VLAN (2 in this case).
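
A rough sketch of the refusal logic, assuming a hypothetical per-port struct where 0 encodes "no native VLAN" and assuming `-EBUSY` as the error code; the `sketch_` names are not the driver's.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port state. */
struct sketch_port {
	uint16_t vid;	/* native (egress-untagged) VLAN; 0 means none */
};

/* Add a VLAN to the port. A second, different untagged VLAN is refused
 * instead of silently replacing the native VLAN. */
int sketch_vlan_add(struct sketch_port *port, uint16_t vid, bool untagged)
{
	if (untagged) {
		if (port->vid && port->vid != vid)
			return -EBUSY;
		port->vid = vid;
	}
	return 0;
}
```

With this check, the second command in the sequence above would fail and VID 1 would stay the native VLAN until the user removes it explicitly.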
      
      Cc: Antoine Tenart <antoine.tenart@bootlin.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Fixes: 7142529f ("net: mscc: ocelot: add VLAN filtering")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • V
      net: mscc: ocelot: fix vlan_filtering when enslaving to bridge before link is up · 1c44ce56
      Vladimir Oltean committed
      Background information: the driver operates the hardware in a mode where
      a single VLAN can be transmitted as untagged on a particular egress
      port. That is the "native VLAN on trunk port" use case. Its value is
      held in port->vid.
      
      Consider the following command sequence (no network manager, all
      interfaces are down, debugging prints added by me):
      
      $ ip link add dev br0 type bridge vlan_filtering 1
      $ ip link set dev swp0 master br0
      
      Kernel code path during last command:
      
      br_add_slave -> ocelot_netdevice_port_event (NETDEV_CHANGEUPPER):
      [   21.401901] ocelot_vlan_port_apply: port 0 vlan aware 0 pvid 0 vid 0
      
      br_add_slave -> nbp_vlan_init -> switchdev_port_attr_set -> ocelot_port_attr_set (SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING):
      [   21.413335] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 0 vid 0
      
      br_add_slave -> nbp_vlan_init -> nbp_vlan_add -> br_switchdev_port_vlan_add -> switchdev_port_obj_add -> ocelot_port_obj_add -> ocelot_vlan_vid_add
      [   21.667421] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 1
      
      So far so good. The bridge has replaced the driver's default pvid used
      in standalone mode (0) with its own default_pvid (1). The port's vid
      (native VLAN) has also changed from 0 to 1.
      
      $ ip link set dev swp0 up
      
      [   31.722956] 8021q: adding VLAN 0 to HW filter on device swp0
      do_setlink -> dev_change_flags -> vlan_vid_add -> ocelot_vlan_rx_add_vid -> ocelot_vlan_vid_add:
      [   31.728700] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 0
      
      The 8021q module uses the .ndo_vlan_rx_add_vid API on .ndo_open to make
      ports be able to transmit and receive 802.1p-tagged traffic by default.
      This API is supposed to offload a VLAN sub-interface, which for a switch
      port means to add a VLAN that is not a pvid, and tagged on egress.
      
      But the driver implementation of .ndo_vlan_rx_add_vid is wrong: it adds
      back vid 0 as "egress untagged". Now back to the initial paragraph:
      there is a single untagged VID that the driver keeps track of, and that
      has just changed from 1 (the pvid) to 0. So this breaks the bridge
      core's expectation, because it has changed vid 1 from untagged to
      tagged, when what the user sees is:
      
      $ bridge vlan
      port    vlan ids
      swp0     1 PVID Egress Untagged
      
      br0      1 PVID Egress Untagged
      
      But curiously, instead of manifesting itself as "untagged and
      pvid-tagged traffic gets sent as tagged on egress", the bug:
      
      - is hidden when vlan_filtering=0
      - manifests as dropped traffic when vlan_filtering=1, due to this setting:
      
      	if (port->vlan_aware && !port->vid)
      		/* If port is vlan-aware and tagged, drop untagged and priority
      		 * tagged frames.
      		 */
      		val |= ANA_PORT_DROP_CFG_DROP_UNTAGGED_ENA |
      		       ANA_PORT_DROP_CFG_DROP_PRIO_S_TAGGED_ENA |
      		       ANA_PORT_DROP_CFG_DROP_PRIO_C_TAGGED_ENA;
      
      which would have made sense if it weren't for this bug. The setting's
      intention was "this is a trunk port with no native VLAN, so don't accept
      untagged traffic". So the driver was never expecting to set VLAN 0 as
      the value of the native VLAN, 0 was just encoding for "invalid".
      
      So the fix is to not send 802.1p traffic as untagged, because that would
      change the port's native vlan to 0, unbeknownst to the bridge, and
      trigger unexpected code paths in the driver.
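
The intent of the fix can be sketched as follows. The struct and the `sketch_` functions are hypothetical; the field names follow the debug prints above, and the model only tracks the two values those prints show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port state. */
struct sketch_ocelot_port {
	uint16_t pvid;	/* VLAN assigned to untagged ingress traffic */
	uint16_t vid;	/* native VLAN, sent untagged on egress */
};

void sketch_vlan_vid_add(struct sketch_ocelot_port *port, uint16_t vid,
			 bool pvid, bool untagged)
{
	if (pvid)
		port->pvid = vid;
	if (untagged)
		port->vid = vid;
}

/* Sketch of the .ndo_vlan_rx_add_vid path after the fix: the VLAN is
 * added as a plain tagged VLAN, so the native VLAN is left alone. */
void sketch_vlan_rx_add_vid(struct sketch_ocelot_port *port, uint16_t vid)
{
	sketch_vlan_vid_add(port, vid, false, false);
}
```

Replaying the sequence from the log, the bridge sets pvid 1 and vid 1, and bringing the link up then adds VLAN 0 without changing either value.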
      
      Cc: Antoine Tenart <antoine.tenart@bootlin.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Fixes: 7142529f ("net: mscc: ocelot: add VLAN filtering")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>