1. 02 Jun, 2020 3 commits
  2. 16 May, 2020 2 commits
  3. 07 May, 2020 1 commit
    • net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE · 16f80360
      Pablo Neira Ayuso committed
      This patch adds FLOW_ACTION_HW_STATS_DONT_CARE, which tells the driver
      that the frontend does not need counters; a request with this hw stats
      type never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly asks
      the driver to disable the stats; if the driver cannot disable counters,
      it bails out.
      
      TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*,
      except for disabled, which is mapped to FLOW_ACTION_HW_STATS_DISABLED
      (this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
      TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.
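
The mapping described above can be sketched in user-space C. This is only an illustration under stated assumptions: the `SKETCH_*` constants and both functions are hypothetical stand-ins, the bit layout is assumed, and the driver check models only the "DONT_CARE never fails, DISABLED may fail" behaviour from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the FLOW_ACTION_HW_STATS_* values (assumed layout). */
enum sketch_flow_hw_stats {
	SKETCH_FLOW_IMMEDIATE = 1 << 0,
	SKETCH_FLOW_DELAYED   = 1 << 1,
	SKETCH_FLOW_ANY       = SKETCH_FLOW_IMMEDIATE | SKETCH_FLOW_DELAYED,
	SKETCH_FLOW_DISABLED  = 1 << 2,
	SKETCH_FLOW_DONT_CARE = (1 << 3) - 1
};

/* On the tc side, "disabled" is encoded as 0, as the commit message notes. */
#define SKETCH_TCA_DISABLED 0

/* Translate a tc-side hw stats value into the flow_offload-side value. */
uint8_t sketch_tc_act_hw_stats(uint8_t tca_hw_stats)
{
	if (tca_hw_stats == SKETCH_TCA_DISABLED)
		return SKETCH_FLOW_DISABLED;
	return tca_hw_stats; /* every other value maps 1:1 */
}

/* Returns 1 if a (hypothetical) driver would accept the request, 0 if it bails out. */
int sketch_driver_accepts(uint8_t hw_stats, int can_disable_counters)
{
	assert(hw_stats <= SKETCH_FLOW_DONT_CARE);
	if (hw_stats == SKETCH_FLOW_DONT_CARE)
		return 1; /* frontend does not need counters: never fails */
	if (hw_stats == SKETCH_FLOW_DISABLED)
		return can_disable_counters; /* explicit request: may fail */
	return 1; /* other types: assumed supported in this sketch */
}
```

The point of the split is that "I do not care about counters" and "counters must be off" are different requests, and only the second one can be refused.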
      
      Fixes: 319a1d19 ("flow_offload: check for basic action hw stats type")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 05 May, 2020 1 commit
    • net_sched: fix tcm_parent in tc filter dump · a7df4870
      Cong Wang committed
      When we tell kernel to dump filters from root (ffff:ffff),
      those filters on ingress (ffff:0000) are matched, but their
      true parents must be dumped as they are. However, kernel
      dumps just whatever we tell it, that is either ffff:ffff
      or ffff:0000:
      
       $ nl-cls-list --dev=dummy0 --parent=root
       cls basic dev dummy0 id none parent root prio 49152 protocol ip match-all
       cls basic dev dummy0 id :1 parent root prio 49152 protocol ip match-all
       $ nl-cls-list --dev=dummy0 --parent=ffff:
       cls basic dev dummy0 id none parent ffff: prio 49152 protocol ip match-all
       cls basic dev dummy0 id :1 parent ffff: prio 49152 protocol ip match-all
      
      This is confusing and misleading, more importantly this is
      a regression since 4.15, so the old behavior must be restored.
      
      And, when tc filters are installed on a tc class, the parent
      should be the classid, rather than the qdisc handle. Commit
      edf6711c ("net: sched: remove classid and q fields from tcf_proto")
      removed the classid we save for filters, we can just restore
      this classid in tcf_block.
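
As a rough illustration of which parent a dump should report, the following sketch uses the usual 16-bit major:minor handle layout. The macro and function names are hypothetical, and treating a classid of 0 as "not attached to a class" is an assumption of this sketch, not something the commit states.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the major:minor handle layout (16 bits each). */
#define SKETCH_H_MAJ(h)         ((h) & 0xFFFF0000u)
#define SKETCH_H_MIN(h)         ((h) & 0x0000FFFFu)
#define SKETCH_H_MAKE(maj, min) (SKETCH_H_MAJ(maj) | SKETCH_H_MIN(min))

/*
 * Parent to report for a filter in a dump: the classid when the filter
 * hangs off a class, otherwise the handle of the qdisc it is attached to.
 * The value the user passed in the dump request is deliberately not used.
 */
uint32_t sketch_filter_parent(uint32_t qdisc_handle, uint32_t classid)
{
	assert(SKETCH_H_MIN(qdisc_handle) == 0); /* qdisc handles look like "ffff:" */
	return classid ? classid : qdisc_handle;
}
```

For the ingress example in this message, a filter on qdisc `ffff:` would report `0xFFFF0000` regardless of whether the dump was requested from root.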
      
      Steps to reproduce this:
       ip li set dev dummy0 up
       tc qd add dev dummy0 ingress
       tc filter add dev dummy0 parent ffff: protocol arp basic action pass
       tc filter show dev dummy0 root
      
      Before this patch:
       filter protocol arp pref 49152 basic
       filter protocol arp pref 49152 basic handle 0x1
      	action order 1: gact action pass
      	 random type none pass val 0
      	 index 1 ref 1 bind 1
      
      After this patch:
       filter parent ffff: protocol arp pref 49152 basic
       filter parent ffff: protocol arp pref 49152 basic handle 0x1
       	action order 1: gact action pass
       	 random type none pass val 0
      	 index 1 ref 1 bind 1
      
      Fixes: a10fa201 ("net: sched: propagate q and parent from caller down to tcf_fill_node")
      Fixes: edf6711c ("net: sched: remove classid and q fields from tcf_proto")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 02 May, 2020 1 commit
  6. 25 Apr, 2020 1 commit
  7. 08 Apr, 2020 1 commit
  8. 28 Mar, 2020 1 commit
  9. 24 Mar, 2020 1 commit
  10. 20 Mar, 2020 1 commit
  11. 19 Mar, 2020 1 commit
    • net: sched: Fix hw_stats_type setting in pedit loop · 2c4b58dc
      Petr Machata committed
      In the commit referenced below, hw_stats_type of an entry is set for every
      entry that corresponds to a pedit action. However, the assignment is only
      done after the entry pointer is bumped, and therefore could overwrite
      memory outside of the entries array.
      
      The reason for this positioning may have been that the current entry's
      hw_stats_type is already set above, before the action-type dispatch.
      However, if there are no more actions, the assignment is wrong. And if
      there are, the next round of the for_each_action loop will make the
      assignment before the action-type dispatch anyway.
      
      Therefore fix this issue by simply reordering the two lines.
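
The ordering issue can be shown with a small stand-alone sketch. The struct and function below are hypothetical simplifications, not the kernel code: the only point is that the field is written before the entry pointer is advanced, so the last iteration never touches memory past the array.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for one entry of the offload action array. */
struct sketch_entry {
	int id;
	int hw_stats;
};

/*
 * Fill n_keys consecutive entries for one pedit-like action and return
 * a pointer one past the last entry written.
 */
struct sketch_entry *sketch_fill_pedit(struct sketch_entry *entry,
				       size_t n_keys, int hw_stats)
{
	assert(entry != NULL);
	for (size_t k = 0; k < n_keys; k++) {
		entry->id = (int)k;
		entry->hw_stats = hw_stats; /* write to the current entry first ... */
		entry++;                    /* ... and only then move to the next one */
	}
	return entry;
}
```

With the two statements swapped, the final iteration would store `hw_stats` into the slot after the last valid entry, which is the out-of-bounds write the message describes.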
      
      Fixes: 74522e7b ("net: sched: set the hw_stats_type in pedit loop")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 18 Mar, 2020 1 commit
  13. 16 Mar, 2020 1 commit
  14. 13 Mar, 2020 1 commit
  15. 09 Mar, 2020 1 commit
  16. 26 Feb, 2020 1 commit
  17. 20 Feb, 2020 4 commits
  18. 18 Feb, 2020 2 commits
  19. 23 Jan, 2020 1 commit
    • net_sched: use validated TCA_KIND attribute in tc_new_tfilter() · 36d79af7
      Eric Dumazet committed
      syzbot found another issue in tc_new_tfilter().
      We should probably use @name, which contains the sanitized
      version of TCA_KIND.
      
      BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:608 [inline]
      BUG: KMSAN: uninit-value in string+0x522/0x690 lib/vsprintf.c:689
      CPU: 1 PID: 10753 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       string_nocheck lib/vsprintf.c:608 [inline]
       string+0x522/0x690 lib/vsprintf.c:689
       vsnprintf+0x207d/0x31b0 lib/vsprintf.c:2574
       __request_module+0x2ad/0x11c0 kernel/kmod.c:143
       tcf_proto_lookup_ops+0x241/0x720 net/sched/cls_api.c:139
       tcf_proto_create net/sched/cls_api.c:262 [inline]
       tc_new_tfilter+0x2a4e/0x5010 net/sched/cls_api.c:2058
       rtnetlink_rcv_msg+0xcb7/0x1570 net/core/rtnetlink.c:5415
       netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45b349
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f88b3948c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f88b39496d4 RCX: 000000000045b349
      RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
      RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 000000000000099f R14: 00000000004cb163 R15: 000000000075bfd4
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
       slab_alloc_node mm/slub.c:2774 [inline]
       __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4382
       __kmalloc_reserve net/core/skbuff.c:141 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209
       alloc_skb include/linux/skbuff.h:1049 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline]
       netlink_sendmsg+0x7d3/0x14d0 net/netlink/af_netlink.c:1892
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 6f96c3c6 ("net_sched: fix backward compatibility for TCA_KIND")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 31 Dec, 2019 1 commit
    • net/sched: add delete_empty() to filters and use it in cls_flower · a5b72a08
      Davide Caratti committed
      Revert "net/sched: cls_u32: fix refcount leak in the error path of
      u32_change()", and fix the u32 refcount leak in a more generic way that
      preserves the semantic of rule dumping.
      On tc filters that don't support lockless insertion/removal, there is no
      need to guard against concurrent insertion when a removal is in progress.
      Therefore, for most of them we can avoid a full walk() when deleting, and
      just decrease the refcount, like it was done on older Linux kernels.
      This fixes situations where walk() was wrongly detecting a non-empty
      filter, like it happened with cls_u32 in the error path of change(), thus
      leading to failures in the following tdc selftests:
      
       6aa7: (filter, u32) Add/Replace u32 with source match and invalid indev
       6658: (filter, u32) Add/Replace u32 with custom hash table and invalid handle
       74c2: (filter, u32) Add/Replace u32 filter with invalid hash table id
      
      On cls_flower, and on (future) lockless filters, this check is necessary:
      move all the check_empty() logic in a callback so that each filter
      can have its own implementation. For cls_flower, it's sufficient to check
      if no IDRs have been allocated.
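
The optional-callback arrangement can be sketched as follows. All names here are hypothetical, and the "flower-like" check is reduced to a plain filter counter instead of IDR state; the sketch also assumes the caller has already established that the last filter was removed when no callback is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_tp;

struct sketch_ops {
	/* Optional: only classifiers with lockless insertion need this. */
	bool (*delete_empty)(struct sketch_tp *tp);
};

struct sketch_tp {
	const struct sketch_ops *ops;
	int n_filters;
	bool deleting;
};

/* Flower-like check: mark for deletion only if nothing is installed. */
bool sketch_flower_delete_empty(struct sketch_tp *tp)
{
	tp->deleting = (tp->n_filters == 0);
	return tp->deleting;
}

/* Generic entry point: use the callback when present, else delete directly. */
bool sketch_tp_delete_empty(struct sketch_tp *tp)
{
	assert(tp != NULL && tp->ops != NULL);
	if (tp->ops->delete_empty)
		return tp->ops->delete_empty(tp);
	tp->deleting = true; /* no concurrent insertion possible: no walk needed */
	return true;
}
```

Classifiers that leave the callback unset pay nothing for the check, which is the "avoid a full walk()" benefit the message mentions.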
      
      This reverts commit 275c44aa.
      
      Changes since v1:
       - document the need for delete_empty() when TCF_PROTO_OPS_DOIT_UNLOCKED
         is used, thanks to Vlad Buslov
       - implement delete_empty() without doing fl_walk(), thanks to Vlad Buslov
       - squash revert and new fix in a single patch, to be nice with bisect
         tests that run tdc on u32 filter, thanks to Dave Miller
      
      Fixes: 275c44aa ("net/sched: cls_u32: fix refcount leak in the error path of u32_change()")
      Fixes: 6676d5e4 ("net: sched: set dedicated tcf_walker flag when tp is empty")
      Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Suggested-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
      Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 08 Dec, 2019 1 commit
    • net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add() · 2dd5616e
      Eric Dumazet committed
      Use the new tcf_proto_check_kind() helper to make sure user
      provided value is well formed.
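
What "well formed" can mean for a user-supplied name is sketched below in plain C. This is not the kernel helper: `sketch_check_kind()`, `SKETCH_KIND_MAX` and the choice to reject (not truncate) overlong names are assumptions made for illustration. The idea is to copy the attribute payload into a bounded, NUL-terminated buffer before it is ever used as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_KIND_MAX 16 /* hypothetical size of the validated name buffer */

/*
 * Copy an attribute payload of len bytes, which may lack a terminating
 * NUL, into name. Returns 0 on success, or -1 if the name does not fit.
 */
int sketch_check_kind(const char *payload, size_t len, char name[SKETCH_KIND_MAX])
{
	const char *nul;
	size_t n;

	assert(payload != NULL && name != NULL);
	nul = memchr(payload, '\0', len);        /* never read past len bytes */
	n = nul ? (size_t)(nul - payload) : len; /* length of the name itself */
	if (n >= SKETCH_KIND_MAX)
		return -1;
	memcpy(name, payload, n);
	name[n] = '\0';                          /* always terminated */
	return 0;
}
```

Later code, such as a module request formatted with `%s`, would then read only `name` and never the raw payload, which is where the uninitialized bytes in the report came from.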
      
      BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:606 [inline]
      BUG: KMSAN: uninit-value in string+0x4be/0x600 lib/vsprintf.c:668
      CPU: 0 PID: 12358 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
       __msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245
       string_nocheck lib/vsprintf.c:606 [inline]
       string+0x4be/0x600 lib/vsprintf.c:668
       vsnprintf+0x218f/0x3210 lib/vsprintf.c:2510
       __request_module+0x2b1/0x11c0 kernel/kmod.c:143
       tcf_proto_lookup_ops+0x171/0x700 net/sched/cls_api.c:139
       tc_chain_tmplt_add net/sched/cls_api.c:2730 [inline]
       tc_ctl_chain+0x1904/0x38a0 net/sched/cls_api.c:2850
       rtnetlink_rcv_msg+0x115a/0x1580 net/core/rtnetlink.c:5224
       netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5242
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x110f/0x1330 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg net/socket.c:657 [inline]
       ___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311
       __sys_sendmsg net/socket.c:2356 [inline]
       __do_sys_sendmsg net/socket.c:2365 [inline]
       __se_sys_sendmsg+0x305/0x460 net/socket.c:2363
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363
       do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45a649
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f0790795c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 000000000045a649
      RDX: 0000000000000000 RSI: 0000000020000300 RDI: 0000000000000006
      RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f07907966d4
      R13: 00000000004c8db5 R14: 00000000004df630 R15: 00000000ffffffff
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
       kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
       kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86
       slab_alloc_node mm/slub.c:2773 [inline]
       __kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381
       __kmalloc_reserve net/core/skbuff.c:141 [inline]
       __alloc_skb+0x306/0xa10 net/core/skbuff.c:209
       alloc_skb include/linux/skbuff.h:1049 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline]
       netlink_sendmsg+0x783/0x1330 net/netlink/af_netlink.c:1892
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg net/socket.c:657 [inline]
       ___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311
       __sys_sendmsg net/socket.c:2356 [inline]
       __do_sys_sendmsg net/socket.c:2365 [inline]
       __se_sys_sendmsg+0x305/0x460 net/socket.c:2363
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363
       do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 6f96c3c6 ("net_sched: fix backward compatibility for TCA_KIND")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 07 Dec, 2019 2 commits
    • net: sched: allow indirect blocks to bind to clsact in TC · 25a443f7
      John Hurley committed
      When a device is bound to a clsact qdisc, bind events are triggered to
      registered drivers for both ingress and egress. However, if a driver
      registers to such a device using the indirect block routines then it is
      assumed that it is only interested in ingress offload and so only replays
      ingress bind/unbind messages.
      
      The NFP driver supports the offload of some egress filters when
      registering to a block with qdisc of type clsact. However, on unregister,
      if the block is still active, it will not receive an unbind egress
      notification which can prevent proper cleanup of other registered
      callbacks.
      
      Modify the indirect block callback command in TC to send messages of
      ingress and/or egress bind depending on the qdisc in use. NFP currently
      supports egress offload for TC flower offload so the changes are only
      added to TC.
      
      Fixes: 4d12ba42 ("nfp: flower: allow offloading of matches on 'internal' ports")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: rename indirect block ingress cb function · dbad3408
      John Hurley committed
      With indirect blocks, a driver can register for callbacks from a device
      that it does not 'own', for example, a tunnel device. When registering to
      or unregistering from a new device, a callback is triggered to generate
      a bind/unbind event. This, in turn, allows the driver to receive any
      existing rules or to properly clean up installed rules.
      
      When first added, it was assumed that all indirect block registrations
      would be for ingress offloads. However, the NFP driver can, in some
      instances, support clsact qdisc binds for egress offload.
      
      Change the name of the indirect block callback command in flow_offload to
      remove the 'ingress' identifier from it. While this does not change
      functionality, a follow-up patch will implement a more generic callback
      than those currently supporting only ingress offload.
      
      Fixes: 4d12ba42 ("nfp: flower: allow offloading of matches on 'internal' ports")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 06 Nov, 2019 1 commit
    • net: sched: prevent duplicate flower rules from tcf_proto destroy race · 59eb87cb
      John Hurley committed
      When a new filter is added to cls_api, the function
      tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
      determine if the tcf_proto is duplicated in the chain's hashtable. It then
      creates a new entry or continues with an existing one. In cls_flower, this
      allows the function fl_ht_insert_unique() to determine if a filter is a
      duplicate and reject appropriately, meaning that the duplicate will not be
      passed to drivers via the offload hooks. However, when a tcf_proto is
      destroyed it is removed from its chain before a hardware remove hook is
      hit. This can lead to a race whereby the driver has not received the
      remove message but duplicate flows can be accepted. This, in turn, can
      lead to the offload driver receiving incorrect duplicate flows and out of
      order add/delete messages.
      
      Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
      hash table per block stores each unique chain/protocol/prio being
      destroyed. This entry is only removed when the full destroy (and hardware
      offload) has completed. If a new flow is being added with the same
      identifiers as a tcf_proto being destroyed, then the add request is
      replayed until the destroy is complete.
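
The "being destroyed" bookkeeping can be sketched like this. It swaps the per-block hash table for a small fixed array to stay short, and every name (`sketch_block`, `sketch_destroy_begin()`, and so on) is hypothetical. Returning `-EAGAIN` stands in for "replay the add request".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Identity of a tcf_proto-like object: chain, protocol, priority. */
struct sketch_key {
	uint32_t chain;
	uint16_t protocol;
	uint32_t prio;
};

#define SKETCH_MAX_DESTROYING 8

/* Per-block set of keys whose destroy (and hardware removal) is still running. */
struct sketch_block {
	struct sketch_key destroying[SKETCH_MAX_DESTROYING];
	size_t n_destroying;
};

static bool sketch_key_eq(const struct sketch_key *a, const struct sketch_key *b)
{
	return a->chain == b->chain && a->protocol == b->protocol &&
	       a->prio == b->prio;
}

/* Index of k in the set, or n_destroying when absent. */
static size_t sketch_find(const struct sketch_block *b, const struct sketch_key *k)
{
	size_t i;

	for (i = 0; i < b->n_destroying; i++)
		if (sketch_key_eq(&b->destroying[i], k))
			break;
	return i;
}

/* Record the key before the object is unlinked from its chain. */
int sketch_destroy_begin(struct sketch_block *b, const struct sketch_key *k)
{
	assert(b != NULL && k != NULL);
	if (b->n_destroying == SKETCH_MAX_DESTROYING)
		return -ENOSPC;
	b->destroying[b->n_destroying++] = *k;
	return 0;
}

/* Drop the key only after the full destroy, including offload, is done. */
void sketch_destroy_end(struct sketch_block *b, const struct sketch_key *k)
{
	size_t i = sketch_find(b, k);

	if (i < b->n_destroying)
		b->destroying[i] = b->destroying[--b->n_destroying];
}

/* An add with identifiers that are still being destroyed must be retried. */
int sketch_tp_insert(const struct sketch_block *b, const struct sketch_key *k)
{
	return sketch_find(b, k) < b->n_destroying ? -EAGAIN : 0;
}
```

A caller would loop on `-EAGAIN`, so the driver always sees the delete for the old object before the add for the new one.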
      
      Fixes: 8b64678e ("net: sched: refactor tp insert/delete for concurrent execution")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Reported-by: Louis Peens <louis.peens@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 09 Oct, 2019 1 commit
  25. 24 Sep, 2019 1 commit
    • net: sched: fix possible crash in tcf_action_destroy() · 3d66b89c
      Eric Dumazet committed
      If the allocation done in tcf_exts_init() failed,
      we end up with a NULL pointer in exts->actions.
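
A minimal user-space sketch of the failure mode and a tolerant cleanup path follows. The struct and functions are hypothetical simplifications, and the `simulate_oom` flag exists only to force the allocation-failure case; the sketch does not claim to mirror the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for an extensions object holding an action array. */
struct sketch_exts {
	void **actions;
	size_t nr_actions;
};

/* Returns 0 on success, -1 if the array could not be allocated. */
int sketch_exts_init(struct sketch_exts *exts, size_t max_actions, int simulate_oom)
{
	assert(exts != NULL && max_actions > 0);
	exts->nr_actions = 0;
	exts->actions = simulate_oom ? NULL
				     : calloc(max_actions, sizeof(*exts->actions));
	return exts->actions ? 0 : -1;
}

/* Safe to call even when init failed and actions is NULL. */
void sketch_exts_destroy(struct sketch_exts *exts)
{
	assert(exts != NULL);
	if (exts->actions) {
		for (size_t i = 0; i < exts->nr_actions; i++)
			free(exts->actions[i]);
		free(exts->actions);
		exts->actions = NULL;
	}
	exts->nr_actions = 0;
}
```

Error paths often call the destroy function unconditionally, so the NULL check belongs in the destroy function, not in each caller.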
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 8198 Comm: syz-executor.3 Not tainted 5.3.0-rc8+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:tcf_action_destroy+0x71/0x160 net/sched/act_api.c:705
      Code: c3 08 44 89 ee e8 4f cb bb fb 41 83 fd 20 0f 84 c9 00 00 00 e8 c0 c9 bb fb 48 89 d8 48 b9 00 00 00 00 00 fc ff df 48 c1 e8 03 <80> 3c 08 00 0f 85 c0 00 00 00 4c 8b 33 4d 85 f6 0f 84 9d 00 00 00
      RSP: 0018:ffff888096e16ff0 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: dffffc0000000000
      RDX: 0000000000040000 RSI: ffffffff85b6ab30 RDI: 0000000000000000
      RBP: ffff888096e17020 R08: ffff8880993f6140 R09: fffffbfff11cae67
      R10: fffffbfff11cae66 R11: ffffffff88e57333 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff888096e177a0 R15: 0000000000000001
      FS:  00007f62bc84a700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000758040 CR3: 0000000088b64000 CR4: 00000000001426e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       tcf_exts_destroy+0x38/0xb0 net/sched/cls_api.c:3030
       tcindex_set_parms+0xf7f/0x1e50 net/sched/cls_tcindex.c:488
       tcindex_change+0x230/0x318 net/sched/cls_tcindex.c:519
       tc_new_tfilter+0xa4b/0x1c70 net/sched/cls_api.c:2152
       rtnetlink_rcv_msg+0x838/0xb00 net/core/rtnetlink.c:5214
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:657
       ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
       __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
       __do_sys_sendmmsg net/socket.c:2442 [inline]
      
      Fixes: 90b73b77 ("net: sched: change action API to use array of pointers to actions")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Vlad Buslov <vladbu@mellanox.com>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 16 Sep, 2019 3 commits
    • net: sched: use get_dev() action API in flow_action infra · 470d5060
      Vlad Buslov committed
      When filling in hardware intermediate representation tc_setup_flow_action()
      directly obtains, checks and takes reference to dev used by mirred action,
      instead of using act->ops->get_dev() API created specifically for this
      purpose. In order to remove code duplication, refactor flow_action infra to
      use action API when obtaining mirred action target dev. Extend get_dev()
      with additional argument that is used to provide dev destructor to the
      user.
      
      Fixes: 5a6ff4b1 ("net: sched: take reference to action dev before calling offloads")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: take reference to psample group in flow_action infra · 4a5da47d
      Vlad Buslov committed
      With recent patch set that removed rtnl lock dependency from cls hardware
      offload API rtnl lock is only taken when reading action data and can be
      released after action-specific data is parsed into intermediate
      representation. However, sample action psample group is passed by pointer
      without obtaining reference to it first, which makes it possible to
      concurrently overwrite the action and deallocate object pointed by
      psample_group pointer after rtnl lock is released but before driver
      finished using the pointer.
      
      To prevent such race condition, obtain reference to psample group while it
      is used by flow_action infra. Extend psample API with function
      psample_group_take() that increments psample group reference counter.
      Extend struct tc_action_ops with new get_psample_group() API. Implement the
      API for action sample using psample_group_take() and already existing
      psample_group_put() as a destructor. Use it in tc_setup_flow_action() to
      take reference to psample group pointed to by entry->sample.psample_group
      and release it in tc_cleanup_flow_action().
      
      Disable bh when taking psample_groups_lock. The lock is now taken while
      holding action tcf_lock that is used by data path and requires bh to be
      disabled, so doing the same for psample_groups_lock is necessary to
      preserve SOFTIRQ-irq-safety.
      
      Fixes: 918190f5 ("net: sched: flower: don't take rtnl lock for cls hw offloads API")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: extend flow_action_entry with destructor · 1158958a
      Vlad Buslov committed
      Generalize flow_action_entry cleanup by extending the structure with
      pointer to destructor function. Set the destructor in
      tc_setup_flow_action(). Refactor tc_cleanup_flow_action() to call
      entry->destructor() instead of using switch that dispatches by entry->id
      and manually executes cleanup.
      
      This refactoring is necessary for following patches in this series that
      require destructor to use tc_action->ops callbacks that can't be easily
      obtained in tc_cleanup_flow_action().
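
The destructor-pointer pattern can be sketched in a few lines. The struct below is a hypothetical reduction of an action entry, and `sketch_count_release()` is an example destructor added only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for an action entry that owns an optional resource. */
struct sketch_entry {
	int id;
	void (*destructor)(void *priv); /* NULL when nothing needs releasing */
	void *destructor_priv;
};

/* Example destructor: counts releases through an int passed as priv. */
void sketch_count_release(void *priv)
{
	++*(int *)priv;
}

/* Cleanup no longer switches on id; each entry knows how to release itself. */
void sketch_cleanup(struct sketch_entry *entries, size_t n)
{
	assert(entries != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (entries[i].destructor)
			entries[i].destructor(entries[i].destructor_priv);
	}
}
```

Because the destructor is chosen at setup time, the cleanup path needs no knowledge of action-specific callbacks, which is the motivation given in the message.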
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  27. 06 Sep, 2019 1 commit
    • net: openvswitch: Set OvS recirc_id from tc chain index · 95a7233c
      Paul Blakey committed
      Offloaded OvS datapath rules are translated one to one to tc rules,
      for example the following simplified OvS rule:
      
      recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)
      
      Will be translated to the following tc rule:
      
      $ tc filter add dev dev1 ingress \
      	    prio 1 chain 0 proto ip \
      		flower tcp ct_state -trk \
      		action ct pipe \
      		action goto chain 2
      
      Received packets will first travel through tc, and if they aren't stolen
      by it, like in the above rule, they will continue to OvS datapath.
      Since we already did some actions (action ct in this case) which might
      modify the packets, and updated action stats, we would like to continue
      the processing with the correct recirc_id in OvS (here recirc_id(2))
      where we left off.
      
      To support this, introduce a new skb extension for tc, which
      will be used for translating tc chain to ovs recirc_id to
      handle these miss cases. Last tc chain index will be set
      by tc goto chain action and read by OvS datapath.
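
The hand-off between the two stages can be modelled with a toy packet structure. Everything here is hypothetical: `sketch_skb` is not an skb, and the two functions only illustrate that the writer records the last chain index and the reader falls back to 0 when no extension was attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy packet with an optional "tc extension" carrying the last chain index. */
struct sketch_skb {
	bool has_tc_ext;
	uint32_t tc_chain;
};

/* tc side: a goto-chain action records where processing stopped. */
void sketch_tc_goto_chain(struct sketch_skb *skb, uint32_t chain)
{
	assert(skb != NULL);
	skb->has_tc_ext = true;
	skb->tc_chain = chain;
}

/* OvS side: resume from the recorded chain, or from 0 if tc left no mark. */
uint32_t sketch_ovs_recirc_id(const struct sketch_skb *skb)
{
	assert(skb != NULL);
	return skb->has_tc_ext ? skb->tc_chain : 0;
}
```

For the example rule above, a packet that missed in tc after `goto chain 2` would be looked up in OvS with `recirc_id(2)`.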
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  28. 27 Aug, 2019 3 commits