1. 09 4月, 2021 2 次提交
  2. 31 3月, 2021 1 次提交
    • K
      net: sched: bump refcount for new action in ACT replace mode · 6855e821
      Kumar Kartikeya Dwivedi 提交于
      Currently, action creation using ACT API in replace mode is buggy.
      When invoking for non-existent action index 42,
      
      	tc action replace action bpf obj foo.o sec <xyz> index 42
      
      kernel creates the action, fills up the netlink response, and then just
      deletes the action after notifying userspace.
      
      	tc action show action bpf
      
      doesn't list the action.
      
      This happens due to the following sequence when ovr = 1 (replace mode)
      is enabled:
      
      tcf_idr_check_alloc is used to atomically check and either obtain
      reference for existing action at index, or reserve the index slot using
      a dummy entry (ERR_PTR(-EBUSY)).
      
      This is necessary as pointers to these actions will be held after
      dropping the idrinfo lock, so bumping the reference count is necessary
      as we need to insert the actions, and notify userspace by dumping their
      attributes. Finally, we drop the reference we took using the
      tcf_action_put_many call in tcf_action_add. However, for the case where
      a new action is created due to free index, its refcount remains one.
      This when paired with the put_many call leads to the kernel setting up
      the action, notifying userspace of its creation, and then tearing it
      down. For existing actions, the refcount is still held so they remain
      unaffected.
      
      Fortunately due to rtnl_lock serialization requirement, such an action
      with refcount == 1 will not be concurrently deleted by anything else, at
      best CLS API can move its refcount up and down by binding to it after it
      has been published from tcf_idr_insert_many. Since refcount is atleast
      one until put_many call, CLS API cannot delete it. Also __tcf_action_put
      release path already ensures deterministic outcome (either new action
      will be created or existing action will be reused in case CLS API tries
      to bind to action concurrently) due to idr lock serialization.
      
      We fix this by making refcount of newly created actions as 2 in ACT API
      replace mode. A relaxed store will suffice as visibility is ensured only
      after the tcf_idr_insert_many call.
      
      Note that in case of creation or overwriting using CLS API only (i.e.
      bind = 1), overwriting existing action object is not allowed, and any
      such request is silently ignored (without error).
      
      The refcount bump that occurs in tcf_idr_check_alloc call there for
      existing action will pair with tcf_exts_destroy call made from the
      owner module for the same action. In case of action creation, there
      is no existing action, so no tcf_exts_destroy callback happens.
      
      This means no code changes for CLS API.
      
      Fixes: cae422f3 ("net: sched: use reference counting action init")
      Signed-off-by: NKumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6855e821
  3. 17 2月, 2021 1 次提交
    • V
      net: sched: fix police ext initialization · 396d7f23
      Vlad Buslov 提交于
      When police action is created by cls API tcf_exts_validate() first
      conditional that calls tcf_action_init_1() directly, the action idr is not
      updated according to latest changes in action API that require caller to
      commit newly created action to idr with tcf_idr_insert_many(). This results
      such action not being accessible through act API and causes crash reported
      by syzbot:
      
      ==================================================================
      BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
      BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
      BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline]
      BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
      Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204
      
      CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       __kasan_report mm/kasan/report.c:400 [inline]
       kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413
       check_memory_region_inline mm/kasan/generic.c:179 [inline]
       check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
       instrument_atomic_read include/linux/instrumented.h:71 [inline]
       atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
       __tcf_idr_release net/sched/act_api.c:178 [inline]
       tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
       tc_action_net_exit include/net/act_api.h:151 [inline]
       police_exit_net+0x168/0x360 net/sched/act_police.c:390
       ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
       cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
       process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
       worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
       kthread+0x3b1/0x4a0 kernel/kthread.c:292
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
      ==================================================================
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G    B             5.11.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       panic+0x306/0x73d kernel/panic.c:231
       end_report+0x58/0x5e mm/kasan/report.c:100
       __kasan_report mm/kasan/report.c:403 [inline]
       kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413
       check_memory_region_inline mm/kasan/generic.c:179 [inline]
       check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
       instrument_atomic_read include/linux/instrumented.h:71 [inline]
       atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
       __tcf_idr_release net/sched/act_api.c:178 [inline]
       tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
       tc_action_net_exit include/net/act_api.h:151 [inline]
       police_exit_net+0x168/0x360 net/sched/act_police.c:390
       ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
       cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
       process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
       worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
       kthread+0x3b1/0x4a0 kernel/kthread.c:292
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
      Kernel Offset: disabled
      
      Fix the issue by calling tcf_idr_insert_many() after successful action
      initialization.
      
      Fixes: 0fedc63f ("net_sched: commit action insertions together")
      Reported-by: syzbot+151e3e714d34ae4ce7e8@syzkaller.appspotmail.com
      Signed-off-by: NVlad Buslov <vladbu@nvidia.com>
      Reviewed-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      396d7f23
  4. 19 1月, 2021 1 次提交
    • C
      net_sched: fix RTNL deadlock again caused by request_module() · d349f997
      Cong Wang 提交于
      tcf_action_init_1() loads tc action modules automatically with
      request_module() after parsing the tc action names, and it drops RTNL
      lock and re-holds it before and after request_module(). This causes a
      lot of troubles, as discovered by syzbot, because we can be in the
      middle of batch initializations when we create an array of tc actions.
      
      One of the problem is deadlock:
      
      CPU 0					CPU 1
      rtnl_lock();
      for (...) {
        tcf_action_init_1();
          -> rtnl_unlock();
          -> request_module();
      				rtnl_lock();
      				for (...) {
      				  tcf_action_init_1();
      				    -> tcf_idr_check_alloc();
      				   // Insert one action into idr,
      				   // but it is not committed until
      				   // tcf_idr_insert_many(), then drop
      				   // the RTNL lock in the _next_
      				   // iteration
      				   -> rtnl_unlock();
          -> rtnl_lock();
          -> a_o->init();
            -> tcf_idr_check_alloc();
            // Now waiting for the same index
            // to be committed
      				    -> request_module();
      				    -> rtnl_lock()
      				    // Now waiting for RTNL lock
      				}
      				rtnl_unlock();
      }
      rtnl_unlock();
      
      This is not easy to solve, we can move the request_module() before
      this loop and pre-load all the modules we need for this netlink
      message and then do the rest initializations. So the loop breaks down
      to two now:
      
              for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                      struct tc_action_ops *a_o;
      
                      a_o = tc_action_load_ops(name, tb[i]...);
                      ops[i - 1] = a_o;
              }
      
              for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                      act = tcf_action_init_1(ops[i - 1]...);
              }
      
      Although this looks serious, it only has been reported by syzbot, so it
      seems hard to trigger this by humans. And given the size of this patch,
      I'd suggest to make it to net-next and not to backport to stable.
      
      This patch has been tested by syzbot and tested with tdc.py by me.
      
      Fixes: 0fedc63f ("net_sched: commit action insertions together")
      Reported-and-tested-by: syzbot+82752bc5331601cf4899@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+b3b63b6bff456bd95294@syzkaller.appspotmail.com
      Reported-by: syzbot+ba67b12b1ca729912834@syzkaller.appspotmail.com
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NCong Wang <cong.wang@bytedance.com>
      Tested-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20210117005657.14810-1-xiyou.wangcong@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      d349f997
  5. 28 11月, 2020 1 次提交
  6. 26 11月, 2020 1 次提交
  7. 17 11月, 2020 2 次提交
  8. 11 11月, 2020 1 次提交
  9. 06 11月, 2020 1 次提交
  10. 05 10月, 2020 1 次提交
    • C
      net_sched: check error pointer in tcf_dump_walker() · 580e4273
      Cong Wang 提交于
      Although we take RTNL on dump path, it is possible to
      skip RTNL on insertion path. So the following race condition
      is possible:
      
      rtnl_lock()		// no rtnl lock
      			mutex_lock(&idrinfo->lock);
      			// insert ERR_PTR(-EBUSY)
      			mutex_unlock(&idrinfo->lock);
      tc_dump_action()
      rtnl_unlock()
      
      So we have to skip those temporary -EBUSY entries on dump path
      too.
      
      Reported-and-tested-by: syzbot+b47bc4f247856fb4d9e1@syzkaller.appspotmail.com
      Fixes: 0fedc63f ("net_sched: commit action insertions together")
      Cc: Vlad Buslov <vladbu@mellanox.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      580e4273
  11. 29 9月, 2020 1 次提交
  12. 25 9月, 2020 2 次提交
    • C
      net_sched: commit action insertions together · 0fedc63f
      Cong Wang 提交于
      syzbot is able to trigger a failure case inside the loop in
      tcf_action_init(), and when this happens we clean up with
      tcf_action_destroy(). But, as these actions are already inserted
      into the global IDR, other parallel process could free them
      before tcf_action_destroy(), then we will trigger a use-after-free.
      
      Fix this by deferring the insertions even later, after the loop,
      and committing all the insertions in a separate loop, so we will
      never fail in the middle of the insertions any more.
      
      One side effect is that the window between alloction and final
      insertion becomes larger, now it is more likely that the loop in
      tcf_del_walker() sees the placeholder -EBUSY pointer. So we have
      to check for error pointer in tcf_del_walker().
      
      Reported-and-tested-by: syzbot+2287853d392e4b42374a@syzkaller.appspotmail.com
      Fixes: 0190c1d4 ("net: sched: atomically check-allocate action")
      Cc: Vlad Buslov <vladbu@mellanox.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0fedc63f
    • C
      net_sched: defer tcf_idr_insert() in tcf_action_init_1() · e49d8c22
      Cong Wang 提交于
      All TC actions call tcf_idr_insert() for new action at the end
      of their ->init(), so we can actually move it to a central place
      in tcf_action_init_1().
      
      And once the action is inserted into the global IDR, other parallel
      process could free it immediately as its refcnt is still 1, so we can
      not fail after this, we need to move it after the goto action
      validation to avoid handling the failure case after insertion.
      
      This is found during code review, is not directly triggered by syzbot.
      And this prepares for the next patch.
      
      Cc: Vlad Buslov <vladbu@mellanox.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e49d8c22
  13. 09 9月, 2020 1 次提交
  14. 21 6月, 2020 1 次提交
  15. 20 6月, 2020 1 次提交
  16. 16 5月, 2020 1 次提交
  17. 01 5月, 2020 1 次提交
  18. 31 3月, 2020 2 次提交
  19. 24 3月, 2020 1 次提交
  20. 09 3月, 2020 1 次提交
  21. 27 2月, 2020 1 次提交
  22. 13 11月, 2019 1 次提交
  23. 06 11月, 2019 1 次提交
  24. 31 10月, 2019 4 次提交
  25. 30 10月, 2019 1 次提交
  26. 16 10月, 2019 1 次提交
    • E
      net: avoid potential infinite loop in tc_ctl_action() · 39f13ea2
      Eric Dumazet 提交于
      tc_ctl_action() has the ability to loop forever if tcf_action_add()
      returns -EAGAIN.
      
      This special case has been done in case a module needed to be loaded,
      but it turns out that tcf_add_notify() could also return -EAGAIN
      if the socket sk_rcvbuf limit is hit.
      
      We need to separate the two cases, and only loop for the module
      loading case.
      
      While we are at it, add a limit of 10 attempts since unbounded
      loops are always scary.
      
      syzbot repro was something like :
      
      socket(PF_NETLINK, SOCK_RAW|SOCK_NONBLOCK, NETLINK_ROUTE) = 3
      write(3, ..., 38) = 38
      setsockopt(3, SOL_SOCKET, SO_RCVBUF, [0], 4) = 0
      sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{..., 388}], msg_controllen=0, msg_flags=0x10}, ...)
      
      NMI backtrace for cpu 0
      CPU: 0 PID: 1054 Comm: khungtaskd Not tainted 5.4.0-rc1+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
       nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
       arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
       trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
       check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
       watchdog+0x9d0/0xef0 kernel/hung_task.c:289
       kthread+0x361/0x430 kernel/kthread.c:255
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
      Sending NMI from CPU 0 to CPUs 1:
      NMI backtrace for cpu 1
      CPU: 1 PID: 8859 Comm: syz-executor910 Not tainted 5.4.0-rc1+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:arch_local_save_flags arch/x86/include/asm/paravirt.h:751 [inline]
      RIP: 0010:lockdep_hardirqs_off+0x1df/0x2e0 kernel/locking/lockdep.c:3453
      Code: 5c 08 00 00 5b 41 5c 41 5d 5d c3 48 c7 c0 58 1d f3 88 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 d3 00 00 00 <48> 83 3d 21 9e 99 07 00 0f 84 b9 00 00 00 9c 58 0f 1f 44 00 00 f6
      RSP: 0018:ffff8880a6f3f1b8 EFLAGS: 00000046
      RAX: 1ffffffff11e63ab RBX: ffff88808c9c6080 RCX: 0000000000000000
      RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff88808c9c6914
      RBP: ffff8880a6f3f1d0 R08: ffff88808c9c6080 R09: fffffbfff16be5d1
      R10: fffffbfff16be5d0 R11: 0000000000000003 R12: ffffffff8746591f
      R13: ffff88808c9c6080 R14: ffffffff8746591f R15: 0000000000000003
      FS:  00000000011e4880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffff600400 CR3: 00000000a8920000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       trace_hardirqs_off+0x62/0x240 kernel/trace/trace_preemptirq.c:45
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
       _raw_spin_lock_irqsave+0x6f/0xcd kernel/locking/spinlock.c:159
       __wake_up_common_lock+0xc8/0x150 kernel/sched/wait.c:122
       __wake_up+0xe/0x10 kernel/sched/wait.c:142
       netlink_unlock_table net/netlink/af_netlink.c:466 [inline]
       netlink_unlock_table net/netlink/af_netlink.c:463 [inline]
       netlink_broadcast_filtered+0x705/0xb80 net/netlink/af_netlink.c:1514
       netlink_broadcast+0x3a/0x50 net/netlink/af_netlink.c:1534
       rtnetlink_send+0xdd/0x110 net/core/rtnetlink.c:714
       tcf_add_notify net/sched/act_api.c:1343 [inline]
       tcf_action_add+0x243/0x370 net/sched/act_api.c:1362
       tc_ctl_action+0x3b5/0x4bc net/sched/act_api.c:1410
       rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5386
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5404
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:657
       ___sys_sendmsg+0x803/0x920 net/socket.c:2311
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
       __do_sys_sendmsg net/socket.c:2365 [inline]
       __se_sys_sendmsg net/socket.c:2363 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
       do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x440939
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: syzbot+cf0adbb9c28c8866c788@syzkaller.appspotmail.com
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39f13ea2
  27. 09 10月, 2019 1 次提交
  28. 22 9月, 2019 1 次提交
  29. 02 7月, 2019 1 次提交
    • C
      idr: fix overflow case for idr_for_each_entry_ul() · e33d2b74
      Cong Wang 提交于
      idr_for_each_entry_ul() is buggy as it can't handle overflow
      case correctly. When we have an ID == UINT_MAX, it becomes an
      infinite loop. This happens when running on 32-bit CPU where
      unsigned long has the same size with unsigned int.
      
      There is no better way to fix this than casting it to a larger
      integer, but we can't just 64 bit integer on 32 bit CPU. Instead
      we could just use an additional integer to help us to detect this
      overflow case, that is, adding a new parameter to this macro.
      Fortunately tc action is its only user right now.
      
      Fixes: 65a206c0 ("net/sched: Change act_api and act_xxx modules to use IDR")
      Reported-by: NLi Shuang <shuali@redhat.com>
      Tested-by: NDavide Caratti <dcaratti@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chris Mi <chrism@mellanox.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e33d2b74
  30. 31 5月, 2019 1 次提交
  31. 25 5月, 2019 1 次提交
    • V
      net: sched: don't use tc_action->order during action dump · 4097e9d2
      Vlad Buslov 提交于
      Function tcf_action_dump() relies on tc_action->order field when starting
      nested nla to send action data to userspace. This approach breaks in
      several cases:
      
      - When multiple filters point to same shared action, tc_action->order field
        is overwritten each time it is attached to filter. This causes filter
        dump to output action with incorrect attribute for all filters that have
        the action in different position (different order) from the last set
        tc_action->order value.
      
      - When action data is displayed using tc action API (RTM_GETACTION), action
        order is overwritten by tca_action_gd() according to its position in
        resulting array of nl attributes, which will break filter dump for all
        filters attached to that shared action that expect it to have different
        order value.
      
      Don't rely on tc_action->order when dumping actions. Set nla according to
      action position in resulting array of actions instead.
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4097e9d2
  32. 28 4月, 2019 2 次提交
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg 提交于
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb08174
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Michal Kubecek 提交于
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae0be8de