1. 09 Apr, 2021 1 commit
    • net: sched: fix action overwrite reference counting · 87c750e8
      Committed by Vlad Buslov
      Action init code increments reference counter when it changes an action.
      This is the desired behavior for cls API which needs to obtain action
      reference for every classifier that points to action. However, act API just
      needs to change the action and releases the reference before returning.
      This sequence breaks when the requested action doesn't exist, which causes
      act API init code to create new action with specified index, but action is
      still released before returning and is deleted (unless it was referenced
      concurrently by cls API).
      
      Reproduction:
      
      $ sudo tc actions ls action gact
      $ sudo tc actions change action gact drop index 1
      $ sudo tc actions ls action gact
      
      Extend tcf_action_init() to accept 'init_res' array and initialize it with
      action->ops->init() result. In tcf_action_add() remove pointers to created
      actions from actions array before passing it to tcf_action_put_many().
      
      Fixes: cae422f3 ("net: sched: use reference counting action init")
      Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
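The reference-counting problem described in this commit can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: `toy_action`, `toy_init()`, `toy_put()` and `act_api_change()` are hypothetical names, not kernel functions, and the model only mimics the idea of recording whether init created the action and skipping the release for newly created ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one action slot; not a kernel structure. */
struct toy_action {
	bool exists;
	int refcnt;
};

/* Hypothetical init result, mirroring the idea of the 'init_res' array. */
enum toy_init_res { TOY_CHANGED = 0, TOY_CREATED = 1 };

/* Take a reference on an existing action, or create it with one reference. */
enum toy_init_res toy_init(struct toy_action *a)
{
	if (a->exists) {
		a->refcnt++;		/* extra reference taken by init */
		return TOY_CHANGED;
	}
	a->exists = true;
	a->refcnt = 1;			/* the initial reference owns the action */
	return TOY_CREATED;
}

/* Drop one reference; the action is deleted when the count reaches zero. */
void toy_put(struct toy_action *a)
{
	assert(a->exists && a->refcnt > 0);
	if (--a->refcnt == 0)
		a->exists = false;
}

/*
 * A "change" request as the act API would issue it in this model.
 * honor_init_res == false: always release the reference (the reported bug).
 * honor_init_res == true:  keep the reference of a newly created action.
 */
void act_api_change(struct toy_action *a, bool honor_init_res)
{
	enum toy_init_res res = toy_init(a);

	if (honor_init_res && res == TOY_CREATED)
		return;			/* do not drop the only reference */
	toy_put(a);
}
```

In this model, calling `act_api_change()` on a missing action with `honor_init_res` set to false leaves no action behind, which matches the empty listing in the reproduction steps; with it set to true the action survives with a single reference.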
  2. 03 Apr, 2021 1 commit
  3. 17 Mar, 2021 1 commit
  4. 17 Feb, 2021 1 commit
    • net: sched: fix police ext initialization · 396d7f23
      Committed by Vlad Buslov
      When a police action is created through the first conditional of cls API
      tcf_exts_validate(), which calls tcf_action_init_1() directly, the action
      idr is not updated according to the latest changes in the action API, which
      require the caller to commit a newly created action to the idr with
      tcf_idr_insert_many(). As a result, such an action is not accessible
      through act API, which causes the crash reported by syzbot:
      
      ==================================================================
      BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
      BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
      BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline]
      BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
      Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204
      
      CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       __kasan_report mm/kasan/report.c:400 [inline]
       kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413
       check_memory_region_inline mm/kasan/generic.c:179 [inline]
       check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
       instrument_atomic_read include/linux/instrumented.h:71 [inline]
       atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
       __tcf_idr_release net/sched/act_api.c:178 [inline]
       tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
       tc_action_net_exit include/net/act_api.h:151 [inline]
       police_exit_net+0x168/0x360 net/sched/act_police.c:390
       ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
       cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
       process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
       worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
       kthread+0x3b1/0x4a0 kernel/kthread.c:292
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
      ==================================================================
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G    B             5.11.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       panic+0x306/0x73d kernel/panic.c:231
       end_report+0x58/0x5e mm/kasan/report.c:100
       __kasan_report mm/kasan/report.c:403 [inline]
       kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413
       check_memory_region_inline mm/kasan/generic.c:179 [inline]
       check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
       instrument_atomic_read include/linux/instrumented.h:71 [inline]
       atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
       __tcf_idr_release net/sched/act_api.c:178 [inline]
       tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
       tc_action_net_exit include/net/act_api.h:151 [inline]
       police_exit_net+0x168/0x360 net/sched/act_police.c:390
       ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
       cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
       process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
       worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
       kthread+0x3b1/0x4a0 kernel/kthread.c:292
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
      Kernel Offset: disabled
      
      Fix the issue by calling tcf_idr_insert_many() after successful action
      initialization.
      
      Fixes: 0fedc63f ("net_sched: commit action insertions together")
      Reported-by: syzbot+151e3e714d34ae4ce7e8@syzkaller.appspotmail.com
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 19 Jan, 2021 1 commit
    • net_sched: fix RTNL deadlock again caused by request_module() · d349f997
      Committed by Cong Wang
      tcf_action_init_1() loads tc action modules automatically with
      request_module() after parsing the tc action names, and it drops the RTNL
      lock before request_module() and re-takes it afterwards. This causes a
      lot of trouble, as discovered by syzbot, because we can be in the
      middle of batch initializations when we create an array of tc actions.
      
      One of the problems is a deadlock:
      
      CPU 0					CPU 1
      rtnl_lock();
      for (...) {
        tcf_action_init_1();
          -> rtnl_unlock();
          -> request_module();
      				rtnl_lock();
      				for (...) {
      				  tcf_action_init_1();
      				    -> tcf_idr_check_alloc();
      				   // Insert one action into idr,
      				   // but it is not committed until
      				   // tcf_idr_insert_many(), then drop
      				   // the RTNL lock in the _next_
      				   // iteration
      				   -> rtnl_unlock();
          -> rtnl_lock();
          -> a_o->init();
            -> tcf_idr_check_alloc();
            // Now waiting for the same index
            // to be committed
      				    -> request_module();
      				    -> rtnl_lock()
      				    // Now waiting for RTNL lock
      				}
      				rtnl_unlock();
      }
      rtnl_unlock();
      
      This is not easy to solve. We can move the request_module() before
      this loop, pre-load all the modules we need for this netlink
      message, and then do the rest of the initialization. So the loop is
      now split into two:
      
              for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                      struct tc_action_ops *a_o;
      
                      a_o = tc_action_load_ops(name, tb[i]...);
                      ops[i - 1] = a_o;
              }
      
              for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                      act = tcf_action_init_1(ops[i - 1]...);
              }
      
      Although this looks serious, it has only been reported by syzbot, so it
      seems hard for humans to trigger. Given the size of this patch,
      I'd suggest sending it to net-next and not backporting it to stable.
      
      This patch has been tested by syzbot and tested with tdc.py by me.
      
      Fixes: 0fedc63f ("net_sched: commit action insertions together")
      Reported-and-tested-by: syzbot+82752bc5331601cf4899@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+b3b63b6bff456bd95294@syzkaller.appspotmail.com
      Reported-by: syzbot+ba67b12b1ca729912834@syzkaller.appspotmail.com
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20210117005657.14810-1-xiyou.wangcong@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  6. 02 Dec, 2020 1 commit
  7. 17 Nov, 2020 2 commits
  8. 31 Oct, 2020 1 commit
  9. 28 Oct, 2020 1 commit
    • net: protect tcf_block_unbind with block lock · d6535dca
      Committed by Leon Romanovsky
      tcf_block_unbind() expects the caller to take block->cb_lock before
      calling it; however, the code took the RTNL lock and dropped cb_lock
      instead. This causes the following kernel panic.
      
       WARNING: CPU: 1 PID: 13524 at net/sched/cls_api.c:1488 tcf_block_unbind+0x2db/0x420
       Modules linked in: mlx5_ib mlx5_core mlxfw ptp pps_core act_mirred act_tunnel_key cls_flower vxlan ip6_udp_tunnel udp_tunnel dummy sch_ingress openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad ib_ipoib rdma_cm iw_cm ib_cm ib_uverbs ib_core overlay [last unloaded: mlxfw]
       CPU: 1 PID: 13524 Comm: test-ecmp-add-v Tainted: G        W         5.9.0+ #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       RIP: 0010:tcf_block_unbind+0x2db/0x420
       Code: ff 48 83 c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8d bc 24 30 01 00 00 be ff ff ff ff e8 7d 7f 70 00 85 c0 0f 85 7b fd ff ff <0f> 0b e9 74 fd ff ff 48 c7 c7 dc 6a 24 84 e8 02 ec fe fe e9 55 fd
       RSP: 0018:ffff888117d17968 EFLAGS: 00010246
       RAX: 0000000000000000 RBX: ffff88812f713c00 RCX: 1ffffffff0848d5b
       RDX: 0000000000000001 RSI: ffff88814fbc8130 RDI: ffff888107f2b878
       RBP: 1ffff11022fa2f3f R08: 0000000000000000 R09: ffffffff84115a87
       R10: fffffbfff0822b50 R11: ffff888107f2b898 R12: ffff88814fbc8000
       R13: ffff88812f713c10 R14: ffff888117d17a38 R15: ffff88814fbc80c0
       FS:  00007f6593d36740(0000) GS:ffff8882a4f00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00005607a00758f8 CR3: 0000000131aea006 CR4: 0000000000170ea0
       Call Trace:
        tc_block_indr_cleanup+0x3e0/0x5a0
        ? tcf_block_unbind+0x420/0x420
        ? __mutex_unlock_slowpath+0xe7/0x610
        flow_indr_dev_unregister+0x5e2/0x930
        ? mlx5e_restore_tunnel+0xdf0/0xdf0 [mlx5_core]
        ? mlx5e_restore_tunnel+0xdf0/0xdf0 [mlx5_core]
        ? flow_indr_block_cb_alloc+0x3c0/0x3c0
        ? mlx5_db_free+0x37c/0x4b0 [mlx5_core]
        mlx5e_cleanup_rep_tx+0x8b/0xc0 [mlx5_core]
        mlx5e_detach_netdev+0xe5/0x120 [mlx5_core]
        mlx5e_vport_rep_unload+0x155/0x260 [mlx5_core]
        esw_offloads_disable+0x227/0x2b0 [mlx5_core]
        mlx5_eswitch_disable_locked.cold+0x38e/0x699 [mlx5_core]
        mlx5_eswitch_disable+0x94/0xf0 [mlx5_core]
        mlx5_device_disable_sriov+0x183/0x1f0 [mlx5_core]
        mlx5_core_sriov_configure+0xfd/0x230 [mlx5_core]
        sriov_numvfs_store+0x261/0x2f0
        ? sriov_drivers_autoprobe_store+0x110/0x110
        ? sysfs_file_ops+0x170/0x170
        ? sysfs_file_ops+0x117/0x170
        ? sysfs_file_ops+0x170/0x170
        kernfs_fop_write+0x1ff/0x3f0
        ? rcu_read_lock_any_held+0x6e/0x90
        vfs_write+0x1f3/0x620
        ksys_write+0xf9/0x1d0
        ? __x64_sys_read+0xb0/0xb0
        ? lockdep_hardirqs_on_prepare+0x273/0x3f0
        ? syscall_enter_from_user_mode+0x1d/0x50
        do_syscall_64+0x2d/0x40
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      <...>
      
       ---[ end trace bfdd028ada702879 ]---
      
      Fixes: 0fdcf78d ("net: use flow_indr_dev_setup_offload()")
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Link: https://lore.kernel.org/r/20201026123327.1141066-1-leon@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  10. 21 Oct, 2020 1 commit
  11. 04 Aug, 2020 1 commit
    • net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct · 038ebb1a
      Committed by wenxu
      When openvswitch conntrack is offloaded with the act_ct action, fragmented
      packets are defragmented in the ingress tc act_ct action and then miss the
      next chain. The packet is then passed to the openvswitch datapath without
      the mru, and the over-mtu packet is dropped by the output action in
      openvswitch:
      
      "kernel: net2: dropped over-mtu packet: 1528 > 1500"
      
      This patch adds the mru to tc_skb_ext for the case where a packet is
      defragmented and then misses the next chain, and also adds the mru to
      qdisc_skb_cb. act_ct sets the mru in qdisc_skb_cb when the packet is
      defragmented. When the chain misses, the mru is set in tc_skb_ext, where
      the ovs datapath can read it.
      
      Fixes: b57dc7c1 ("net/sched: Introduce action ct")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 25 Jul, 2020 1 commit
  13. 17 Jul, 2020 1 commit
    • net: sched: Do not drop root lock in tcf_qevent_handle() · 55f656cd
      Committed by Petr Machata
      Mirred currently does not mix well with blocks executed after the qdisc
      root lock is taken. This includes classification blocks (such as in PRIO,
      ETS, DRR qdiscs) and qevents. The locking caused by the packet mirrored by
      mirred can cause deadlocks: either when the thread of execution attempts to
      take the lock a second time, or when two threads end up waiting on each
      other's locks.
      
      The qevent patchset attempted to not introduce further badness of this
      sort, and dropped the lock before executing the qevent block. However, this
      led to too little locking and races between qdisc configuration and packet
      enqueue in the RED qdisc.
      
      Before the deadlock issues are solved in a way that can be applied across
      many qdiscs reasonably easily, do for qevents what is done for the
      classification blocks and just keep holding the root lock.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  14. 14 Jul, 2020 1 commit
    • net: sched: Pass qdisc reference in struct flow_block_offload · c40f4e50
      Committed by Petr Machata
      Previously, shared blocks were only relevant for the pseudo-qdiscs ingress
      and clsact. Recently, a qevent facility was introduced, which allows
      binding blocks to well-defined slots of a qdisc instance. RED in particular
      got two qevents: early_drop and mark. Drivers that wish to offload these
      blocks will be sent the usual notification, and need to know which qdisc it
      is related to.
      
      To that end, extend flow_block_offload with a "sch" pointer, and initialize
      as appropriate. This prompts changes in the indirect block facility, which
      now tracks the scheduler in addition to the netdevice. Update signatures of
      several functions similarly.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 04 Jul, 2020 1 commit
    • sched: consistently handle layer3 header accesses in the presence of VLANs · d7bf2ebe
      Committed by Toke Høiland-Jørgensen
      There are a couple of places in net/sched/ that check skb->protocol and act
      on the value there. However, in the presence of VLAN tags, the value stored
      in skb->protocol can be inconsistent based on whether VLAN acceleration is
      enabled. The commit quoted in the Fixes tag below fixed the users of
      skb->protocol to use a helper that will always see the VLAN ethertype.
      
      However, most of the callers don't actually handle the VLAN ethertype, but
      expect to find the IP header type in the protocol field. This means that
      things like changing the ECN field, or parsing diffserv values, stops
      working if there's a VLAN tag, or if there are multiple nested VLAN
      tags (QinQ).
      
      To fix this, change the helper to take an argument that indicates whether
      the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
      make sure to skip all of them, so behaviour is consistent even in QinQ
      mode.
      
      To make the helper usable from the ECN code, move it to if_vlan.h instead
      of pkt_sched.h.
      
      v3:
      - Remove empty lines
      - Move vlan variable definitions inside loop in skb_protocol()
      - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
        bpf_skb_ecn_set_ce()
      
      v2:
      - Use eth_type_vlan() helper in skb_protocol()
      - Also fix code that reads skb->protocol directly
      - Change a couple of 'if/else if' statements to switch constructs to avoid
        calling the helper twice
      Reported-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
      Fixes: d8b9605d ("net: sched: fix skb->protocol use in case of accelerated vlan path")
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
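The idea of a protocol helper that optionally skips all VLAN tags can be sketched in user space over a raw Ethernet frame. This is not the kernel's skb_protocol(); `frame_protocol()` and the `TOY_*` macros are hypothetical names for this sketch, which assumes a plain byte buffer and returns 0 for truncated frames. The EtherType values 0x8100 (802.1Q) and 0x88A8 (802.1ad) are the standard ones.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_HLEN		14	/* dst MAC + src MAC + EtherType */
#define TOY_ETH_TYPE_OFF	12	/* offset of the outer EtherType */
#define TOY_VLAN_HLEN		4	/* TCI + encapsulated EtherType */
#define TOY_ETHERTYPE_8021Q	0x8100
#define TOY_ETHERTYPE_8021AD	0x88A8

/* True for the two EtherTypes that introduce a VLAN tag. */
static bool toy_is_vlan(uint16_t type)
{
	return type == TOY_ETHERTYPE_8021Q || type == TOY_ETHERTYPE_8021AD;
}

/* Read a big-endian 16-bit value from the frame. */
static uint16_t toy_read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the EtherType of a frame in host byte order.
 * skip_vlan == false: report whatever follows the MAC addresses.
 * skip_vlan == true:  walk past every nested VLAN tag (QinQ included).
 * Returns 0 if the frame is too short to hold the requested field.
 */
uint16_t frame_protocol(const uint8_t *frame, size_t len, bool skip_vlan)
{
	size_t off = TOY_ETH_TYPE_OFF;
	uint16_t proto;

	if (len < TOY_ETH_HLEN)
		return 0;
	proto = toy_read_be16(frame + off);
	if (!skip_vlan)
		return proto;

	/* Each tag shifts the next EtherType by four bytes. */
	while (toy_is_vlan(proto)) {
		off += TOY_VLAN_HLEN;
		if (off + 2 > len)
			return 0;
		proto = toy_read_be16(frame + off);
	}
	return proto;
}
```

For a QinQ frame carrying IPv4, the sketch reports 0x88A8 without skipping and 0x0800 with skipping, which is the distinction callers such as the ECN or diffserv code depend on.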
  16. 30 Jun, 2020 2 commits
    • net:qos: police action offloading parameter 'burst' change to the original value · 5f035af7
      Committed by Po Liu
      Since 'tcfp_burst' is stored with the TICK factor applied, the driver side
      always needs to recover the original value. This patch moves the generic
      calculation out of the drivers and recovers the original 'burst' value
      before offloading to the device driver.
      Signed-off-by: Po Liu <po.liu@nxp.com>
      Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: Introduce helpers for qevent blocks · 3625750f
      Committed by Petr Machata
      Qevents are attach points for TC blocks, where filters can be put that are
      executed when "interesting events" take place in a qdisc. The data to keep
      and the functions to invoke to maintain a qevent will be largely the same
      between qevents. Therefore introduce sched-wide helpers for qevent
      management.
      
      Currently, similarly to ingress and egress blocks of clsact pseudo-qdisc,
      block attachment cannot be changed after the qdisc is created. To that
      end, add a helper tcf_qevent_validate_change(), which verifies whether the
      block index attribute is absent, or if it is present, whether its value
      matches the current one (i.e. there is no material change).
      
      The function tcf_qevent_handle() should be invoked when qdisc hits the
      "interesting event" corresponding to a block. This function releases root
      lock for the duration of executing the attached filters, to allow packets
      generated through user actions (notably mirred) to be reinserted to the
      same qdisc tree.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
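The validation rule described for tcf_qevent_validate_change() can be written down as a small predicate. This is a sketch of the rule as stated in the commit message only: `toy_qevent_validate_change()` is a hypothetical function, the real helper works on netlink attributes, and the -1 error value is an assumption for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept a qdisc change request only if it does not alter the block
 * bound to a qevent.
 *
 * attr_present:   whether the request carries a block index attribute
 * attr_index:     the block index in the request (ignored if absent)
 * current_index:  the block index the qevent was created with
 *
 * Returns 0 if the change is acceptable, -1 otherwise (assumed value).
 */
int toy_qevent_validate_change(bool attr_present, uint32_t attr_index,
			       uint32_t current_index)
{
	if (!attr_present)
		return 0;	/* nothing requested, nothing changes */
	if (attr_index == current_index)
		return 0;	/* same block: no material change */
	return -1;		/* rebinding after creation is rejected */
}
```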
  17. 25 Jun, 2020 2 commits
  18. 20 Jun, 2020 2 commits
    • net/sched: cls_api: fix nooffloaddevcnt warning dmesg log · 3c005110
      Committed by wenxu
      block->nooffloaddevcnt should always be counted for an indr block,
      even when the indr block offload succeeds, because the representor may
      go away and the ingress qdisc can then work in software mode.
      
      The block->nooffloaddevcnt warning appears with the following dmesg log:
      
      [  760.667058] #####################################################
      [  760.668186] ## TEST test-ecmp-add-vxlan-encap-disable-sriov.sh ##
      [  760.669179] #####################################################
      [  761.780655] :test: Fedora 30 (Thirty)
      [  761.783794] :test: Linux reg-r-vrt-018-180 5.7.0+
      [  761.822890] :test: NIC ens1f0 FW 16.26.6000 PCI 0000:81:00.0 DEVICE 0x1019 ConnectX-5 Ex
      [  761.860244] mlx5_core 0000:81:00.0 ens1f0: Link up
      [  761.880693] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
      [  762.059732] mlx5_core 0000:81:00.1 ens1f1: Link up
      [  762.234341] :test: unbind vfs of ens1f0
      [  762.257825] :test: Change ens1f0 eswitch (0000:81:00.0) mode to switchdev
      [  762.291363] :test: unbind vfs of ens1f1
      [  762.306914] :test: Change ens1f1 eswitch (0000:81:00.1) mode to switchdev
      [  762.309237] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(LEGACY), nvfs(2), active vports(3)
      [  763.282598] mlx5_core 0000:81:00.1: E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295
      [  763.362825] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
      [  763.444465] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
      [  763.460088] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
      [  763.502586] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
      [  763.552429] ens1f1_0: renamed from eth0
      [  763.569569] mlx5_core 0000:81:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(2), active vports(3)
      [  763.629694] ens1f1_1: renamed from eth1
      [  764.631552] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_0: link becomes ready
      [  764.670841] :test: unbind vfs of ens1f0
      [  764.681966] :test: unbind vfs of ens1f1
      [  764.726762] mlx5_core 0000:81:00.0 ens1f0: Link up
      [  764.766511] mlx5_core 0000:81:00.1 ens1f1: Link up
      [  764.797325] :test: Add multipath vxlan encap rule and disable sriov
      [  764.798544] :test: config multipath route
      [  764.812732] mlx5_core 0000:81:00.0: lag map port 1:2 port 2:2
      [  764.874556] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2
      [  765.603681] :test: OK
      [  765.659048] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_1: link becomes ready
      [  765.675085] :test: verify rule in hw
      [  765.694237] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
      [  765.711892] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1: link becomes ready
      [  766.979230] :test: OK
      [  768.125419] :test: OK
      [  768.127519] :test: - disable sriov ens1f1
      [  768.131160] pci 0000:81:02.2: Removing from iommu group 75
      [  768.132646] pci 0000:81:02.3: Removing from iommu group 76
      [  769.179749] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
      [  769.455627] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:1
      [  769.703990] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
      [  769.988637] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
      [  769.990022] :test: - disable sriov ens1f0
      [  769.994922] pci 0000:81:00.2: Removing from iommu group 73
      [  769.997048] pci 0000:81:00.3: Removing from iommu group 74
      [  771.035813] mlx5_core 0000:81:00.0: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
      [  771.339091] ------------[ cut here ]------------
      [  771.340812] WARNING: CPU: 6 PID: 3448 at net/sched/cls_api.c:749 tcf_block_offload_unbind.isra.0+0x5c/0x60
      [  771.341728] Modules linked in: act_mirred act_tunnel_key cls_flower dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlxfw act_ct nf_flow_table kvm_intel nf_nat kvm nf_conntrack irqbypass crct10dif_pclmul igb crc32_pclmul nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 crc32c_intel ghash_clmulni_intel ptp ipmi_ssif intel_cstate pps_core ses intel_uncore mei_me iTCO_wdt joydev ipmi_si iTCO_vendor_support i2c_i801 enclosure mei ioatdma dca lpc_ich wmi ipmi_devintf pcspkr acpi_power_meter ipmi_msghandler acpi_pad ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
      [  771.347818] CPU: 6 PID: 3448 Comm: test-ecmp-add-v Not tainted 5.7.0+ #1146
      [  771.348727] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [  771.349646] RIP: 0010:tcf_block_offload_unbind.isra.0+0x5c/0x60
      [  771.350553] Code: 4a fd ff ff 83 f8 a1 74 0e 5b 4c 89 e7 5d 41 5c 41 5d e9 07 93 89 ff 8b 83 a0 00 00 00 8d 50 ff 89 93 a0 00 00 00 85 c0 75 df <0f> 0b eb db 0f 1f 44 00 00 41 57 41 56 41 55 41 89 cd 41 54 49 89
      [  771.352420] RSP: 0018:ffffb33144cd3b00 EFLAGS: 00010246
      [  771.353353] RAX: 0000000000000000 RBX: ffff8b37cf4b2800 RCX: 0000000000000000
      [  771.354294] RDX: 00000000ffffffff RSI: ffff8b3b9aad0000 RDI: ffffffff8d5c6e20
      [  771.355245] RBP: ffff8b37eb546948 R08: ffffffffc0b7a348 R09: ffff8b3b9aad0000
      [  771.356189] R10: 0000000000000001 R11: ffff8b3ba7a0a1c0 R12: ffff8b37cf4b2850
      [  771.357123] R13: ffff8b3b9aad0000 R14: ffff8b37cf4b2820 R15: ffff8b37cf4b2820
      [  771.358039] FS:  00007f8a19b6e740(0000) GS:ffff8b3befa00000(0000) knlGS:0000000000000000
      [  771.358965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  771.359885] CR2: 00007f3afb91c1a0 CR3: 000000045133c004 CR4: 00000000001606e0
      [  771.360825] Call Trace:
      [  771.361764]  __tcf_block_put+0x84/0x150
      [  771.362712]  ingress_destroy+0x1b/0x20 [sch_ingress]
      [  771.363658]  qdisc_destroy+0x3e/0xc0
      [  771.364594]  dev_shutdown+0x7a/0xa5
      [  771.365522]  rollback_registered_many+0x20d/0x530
      [  771.366458]  ? netdev_upper_dev_unlink+0x15d/0x1c0
      [  771.367387]  unregister_netdevice_many.part.0+0xf/0x70
      [  771.368310]  vxlan_netdevice_event+0xa4/0x110 [vxlan]
      [  771.369454]  notifier_call_chain+0x4c/0x70
      [  771.370579]  rollback_registered_many+0x2f5/0x530
      [  771.371719]  rollback_registered+0x56/0x90
      [  771.372843]  unregister_netdevice_queue+0x73/0xb0
      [  771.373982]  unregister_netdev+0x18/0x20
      [  771.375168]  mlx5e_vport_rep_unload+0x56/0xc0 [mlx5_core]
      [  771.376327]  esw_offloads_disable+0x81/0x90 [mlx5_core]
      [  771.377512]  mlx5_eswitch_disable_locked.cold+0xcb/0x1af [mlx5_core]
      [  771.378679]  mlx5_eswitch_disable+0x44/0x60 [mlx5_core]
      [  771.379822]  mlx5_device_disable_sriov+0xad/0xb0 [mlx5_core]
      [  771.380968]  mlx5_core_sriov_configure+0xc1/0xe0 [mlx5_core]
      [  771.382087]  sriov_numvfs_store+0xfc/0x130
      [  771.383195]  kernfs_fop_write+0xce/0x1b0
      [  771.384302]  vfs_write+0xb6/0x1a0
      [  771.385410]  ksys_write+0x5f/0xe0
      [  771.386500]  do_syscall_64+0x5b/0x1d0
      [  771.387569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 0fdcf78d ("net: use flow_indr_dev_setup_offload()")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: flow_offload: fix flow_indr_dev_unregister path · a1db2178
      Committed by wenxu
      If the representor is removed, then identify the indirect flow_blocks
      that need to be removed by the release callback and the port representor
      structure. To identify the port representor structure, a new
      indr.cb_priv field needs to be introduced. The flow_block also needs to
      be removed from the driver list from the cleanup path.
      
      Fixes: 1fac52da ("net: flow_offload: consolidate indirect flow_block infrastructure")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 02 Jun, 2020 3 commits
  20. 16 May, 2020 2 commits
  21. 07 May, 2020 1 commit
    • net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE · 16f80360
      Committed by Pablo Neira Ayuso
      This patch adds FLOW_ACTION_HW_STATS_DONT_CARE, which tells the driver
      that the frontend does not need counters; this hw stats type request
      never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
      the driver to disable the stats; however, if the driver cannot disable
      counters, it bails out.
      
      TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*,
      except for disabled, which is mapped to FLOW_ACTION_HW_STATS_DISABLED
      (this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
      TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.
      
      Fixes: 319a1d19 ("flow_offload: check for basic action hw stats type")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
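The mapping described in this commit message (1:1, except that the tc value 0 maps to an explicit "disabled" flag) might look roughly like the sketch below. It is not the kernel's tc_act_hw_stats(): the `TOY_*` constants, their numeric values, and the fallback for out-of-range input are all assumptions made for illustration.

```c
/* Hypothetical tc-side values; 0 means "disabled" as stated in the commit. */
enum {
	TOY_TCA_HW_STATS_IMMEDIATE	= 1 << 0,
	TOY_TCA_HW_STATS_DELAYED	= 1 << 1,
	TOY_TCA_HW_STATS_ANY		= (1 << 0) | (1 << 1),
};

/* Hypothetical flow-offload-side values with explicit extra flags. */
enum {
	TOY_FLOW_HW_STATS_IMMEDIATE	= 1 << 0,
	TOY_FLOW_HW_STATS_DELAYED	= 1 << 1,
	TOY_FLOW_HW_STATS_ANY		= (1 << 0) | (1 << 1),
	TOY_FLOW_HW_STATS_DISABLED	= 1 << 2,
	TOY_FLOW_HW_STATS_DONT_CARE	= 1 << 3,
};

/* Translate a tc-side hw stats value to the flow-offload-side value. */
unsigned int toy_act_hw_stats(unsigned int tca_hw_stats)
{
	/* Assumed fallback: unknown bits degrade to "don't care". */
	if (tca_hw_stats > TOY_TCA_HW_STATS_ANY)
		return TOY_FLOW_HW_STATS_DONT_CARE;
	/* The only non-identity case: 0 in tc means "disabled". */
	if (tca_hw_stats == 0)
		return TOY_FLOW_HW_STATS_DISABLED;
	/* Everything else maps 1:1. */
	return tca_hw_stats;
}
```

Keeping "disabled" as a distinct non-zero flag on the flow-offload side lets a driver tell an explicit request to disable counters apart from a frontend that simply does not care.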
  22. 05 May, 2020 1 commit
    • net_sched: fix tcm_parent in tc filter dump · a7df4870
      Committed by Cong Wang
      When we tell kernel to dump filters from root (ffff:ffff),
      those filters on ingress (ffff:0000) are matched, but their
      true parents must be dumped as they are. However, kernel
      dumps just whatever we tell it, that is either ffff:ffff
      or ffff:0000:
      
       $ nl-cls-list --dev=dummy0 --parent=root
       cls basic dev dummy0 id none parent root prio 49152 protocol ip match-all
       cls basic dev dummy0 id :1 parent root prio 49152 protocol ip match-all
       $ nl-cls-list --dev=dummy0 --parent=ffff:
       cls basic dev dummy0 id none parent ffff: prio 49152 protocol ip match-all
       cls basic dev dummy0 id :1 parent ffff: prio 49152 protocol ip match-all
      
      This is confusing and misleading, more importantly this is
      a regression since 4.15, so the old behavior must be restored.
      
      And, when tc filters are installed on a tc class, the parent
      should be the classid, rather than the qdisc handle. Commit
      edf6711c ("net: sched: remove classid and q fields from tcf_proto")
      removed the classid we save for filters, we can just restore
      this classid in tcf_block.
      
      Steps to reproduce this:
       ip li set dev dummy0 up
       tc qd add dev dummy0 ingress
       tc filter add dev dummy0 parent ffff: protocol arp basic action pass
       tc filter show dev dummy0 root
      
      Before this patch:
       filter protocol arp pref 49152 basic
       filter protocol arp pref 49152 basic handle 0x1
      	action order 1: gact action pass
      	 random type none pass val 0
      	 index 1 ref 1 bind 1
      
      After this patch:
       filter parent ffff: protocol arp pref 49152 basic
       filter parent ffff: protocol arp pref 49152 basic handle 0x1
       	action order 1: gact action pass
       	 random type none pass val 0
      	 index 1 ref 1 bind 1
      
      Fixes: a10fa201 ("net: sched: propagate q and parent from caller down to tcf_fill_node")
      Fixes: edf6711c ("net: sched: remove classid and q fields from tcf_proto")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 02 May, 2020 1 commit
  24. 25 Apr, 2020 1 commit
  25. 08 Apr, 2020 1 commit
  26. 28 Mar, 2020 1 commit
  27. 24 Mar, 2020 1 commit
  28. 20 Mar, 2020 1 commit
  29. 19 Mar, 2020 1 commit
    • net: sched: Fix hw_stats_type setting in pedit loop · 2c4b58dc
      Committed by Petr Machata
      In the commit referenced below, hw_stats_type of an entry is set for every
      entry that corresponds to a pedit action. However, the assignment is only
      done after the entry pointer is bumped, and therefore could overwrite
      memory outside of the entries array.
      
      The reason for this positioning may have been that the current entry's
      hw_stats_type is already set above, before the action-type dispatch.
      However, if there are no more actions, the assignment is wrong. And if
      there are, the next round of the for_each_action loop will make the
      assignment before the action-type dispatch anyway.
      
      Therefore fix this issue by simply reordering the two lines.
      
      Fixes: 74522e7b ("net: sched: set the hw_stats_type in pedit loop")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
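The ordering issue described in this commit can be shown with a toy loop that fills one entry per key. This sketch is not the kernel code: `toy_entry` and `toy_fill_pedit()` are hypothetical, and only the corrected order (assign, then advance) is implemented, since the reversed order would write one element past the entries it owns.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one offload entry. */
struct toy_entry {
	int id;
	int hw_stats;
};

/*
 * Fill one entry per key and return how many entries were used.
 * 'cap' is the capacity of 'entries'; 'nkeys' must not exceed it.
 */
size_t toy_fill_pedit(struct toy_entry *entries, size_t cap,
		      size_t nkeys, int hw_stats)
{
	struct toy_entry *entry = entries;
	size_t j = 0;

	assert(nkeys <= cap);
	for (size_t k = 0; k < nkeys; k++) {
		entry->id = (int)k;
		/* Assign while 'entry' still points at the current slot ... */
		entry->hw_stats = hw_stats;
		/*
		 * ... and only then advance. With the two statements swapped,
		 * the last iteration would store hw_stats into the slot after
		 * the final key, possibly outside the array.
		 */
		entry = &entries[++j];
	}
	return j;
}
```

With three slots and two keys, only the first two slots are written; the third keeps whatever value it had before the call.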
  30. 18 Mar, 2020 1 commit
  31. 16 Mar, 2020 1 commit
  32. 13 Mar, 2020 1 commit
  33. 09 Mar, 2020 1 commit