1. 27 12月, 2019 3 次提交
    • V
      net: sched: verify that q!=NULL before setting q->flags · 603dbbd8
      Vlad Buslov 提交于
      commit 503d81d428bd598430f7f9d02021634e1a8139a0 upstream.
      
      In function int tc_new_tfilter() q pointer can be NULL when adding filter
      on a shared block. With recent change that resets TCQ_F_CAN_BYPASS after
      filter creation, following NULL pointer dereference happens in case parent
      block is shared:
      
      [  212.925060] BUG: kernel NULL pointer dereference, address: 0000000000000010
      [  212.925445] #PF: supervisor write access in kernel mode
      [  212.925709] #PF: error_code(0x0002) - not-present page
      [  212.925965] PGD 8000000827923067 P4D 8000000827923067 PUD 827924067 PMD 0
      [  212.926302] Oops: 0002 [#1] SMP KASAN PTI
      [  212.926539] CPU: 18 PID: 2617 Comm: tc Tainted: G    B             5.2.0+ #512
      [  212.926938] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [  212.927364] RIP: 0010:tc_new_tfilter+0x698/0xd40
      [  212.927633] Code: 74 0d 48 85 c0 74 08 48 89 ef e8 03 aa 62 00 48 8b 84 24 a0 00 00 00 48 8d 78 10 48 89 44 24 18 e8 4d 0c 6b ff 48 8b 44 24 18 <83> 60 10 f
      b 48 85 ed 0f 85 3d fe ff ff e9 4f fe ff ff e8 81 26 f8
      [  212.928607] RSP: 0018:ffff88884fd5f5d8 EFLAGS: 00010296
      [  212.928905] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dffffc0000000000
      [  212.929201] RDX: 0000000000000007 RSI: 0000000000000004 RDI: 0000000000000297
      [  212.929402] RBP: ffff88886bedd600 R08: ffffffffb91d4b51 R09: fffffbfff7616e4d
      [  212.929609] R10: fffffbfff7616e4c R11: ffffffffbb0b7263 R12: ffff88886bc61040
      [  212.929803] R13: ffff88884fd5f950 R14: ffffc900039c5000 R15: ffff88835e927680
      [  212.929999] FS:  00007fe7c50b6480(0000) GS:ffff88886f980000(0000) knlGS:0000000000000000
      [  212.930235] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  212.930394] CR2: 0000000000000010 CR3: 000000085bd04002 CR4: 00000000001606e0
      [  212.930588] Call Trace:
      [  212.930682]  ? tc_del_tfilter+0xa40/0xa40
      [  212.930811]  ? __lock_acquire+0x5b5/0x2460
      [  212.930948]  ? find_held_lock+0x85/0xa0
      [  212.931081]  ? tc_del_tfilter+0xa40/0xa40
      [  212.931201]  rtnetlink_rcv_msg+0x4ab/0x5f0
      [  212.931332]  ? rtnl_dellink+0x490/0x490
      [  212.931454]  ? lockdep_hardirqs_on+0x260/0x260
      [  212.931589]  ? netlink_deliver_tap+0xab/0x5a0
      [  212.931717]  ? match_held_lock+0x1b/0x240
      [  212.931844]  netlink_rcv_skb+0xd0/0x200
      [  212.931958]  ? rtnl_dellink+0x490/0x490
      [  212.932079]  ? netlink_ack+0x440/0x440
      [  212.932205]  ? netlink_deliver_tap+0x161/0x5a0
      [  212.932335]  ? lock_downgrade+0x360/0x360
      [  212.932457]  ? lock_acquire+0xe5/0x210
      [  212.932579]  netlink_unicast+0x296/0x350
      [  212.932705]  ? netlink_attachskb+0x390/0x390
      [  212.932834]  ? _copy_from_iter_full+0xe0/0x3a0
      [  212.932976]  netlink_sendmsg+0x394/0x600
      [  212.937998]  ? netlink_unicast+0x350/0x350
      [  212.943033]  ? move_addr_to_kernel.part.0+0x90/0x90
      [  212.948115]  ? netlink_unicast+0x350/0x350
      [  212.953185]  sock_sendmsg+0x96/0xa0
      [  212.958099]  ___sys_sendmsg+0x482/0x520
      [  212.962881]  ? match_held_lock+0x1b/0x240
      [  212.967618]  ? copy_msghdr_from_user+0x250/0x250
      [  212.972337]  ? lock_downgrade+0x360/0x360
      [  212.976973]  ? rwlock_bug.part.0+0x60/0x60
      [  212.981548]  ? __mod_node_page_state+0x1f/0xa0
      [  212.986060]  ? match_held_lock+0x1b/0x240
      [  212.990567]  ? find_held_lock+0x85/0xa0
      [  212.994989]  ? do_user_addr_fault+0x349/0x5b0
      [  212.999387]  ? lock_downgrade+0x360/0x360
      [  213.003713]  ? find_held_lock+0x85/0xa0
      [  213.007972]  ? __fget_light+0xa1/0xf0
      [  213.012143]  ? sockfd_lookup_light+0x91/0xb0
      [  213.016165]  __sys_sendmsg+0xba/0x130
      [  213.020040]  ? __sys_sendmsg_sock+0xb0/0xb0
      [  213.023870]  ? handle_mm_fault+0x337/0x470
      [  213.027592]  ? page_fault+0x8/0x30
      [  213.031316]  ? lockdep_hardirqs_off+0xbe/0x100
      [  213.034999]  ? mark_held_locks+0x24/0x90
      [  213.038671]  ? do_syscall_64+0x1e/0xe0
      [  213.042297]  do_syscall_64+0x74/0xe0
      [  213.045828]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  213.049354] RIP: 0033:0x7fe7c527c7b8
      [  213.052792] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f
      0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [  213.060269] RSP: 002b:00007ffc3f7908a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  213.064144] RAX: ffffffffffffffda RBX: 000000005d34716f RCX: 00007fe7c527c7b8
      [  213.068094] RDX: 0000000000000000 RSI: 00007ffc3f790910 RDI: 0000000000000003
      [  213.072109] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fe7c5340cc0
      [  213.076113] R10: 0000000000404ec2 R11: 0000000000000246 R12: 0000000000000080
      [  213.080146] R13: 0000000000480640 R14: 0000000000000080 R15: 0000000000000000
      [  213.084147] Modules linked in: act_gact cls_flower sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc intel_rapl_msr intel_rapl_common
      [<1;69;32Msb_edac rdma_ucm rdma_cm x86_pkg_temp_thermal iw_cm intel_powerclamp ib_cm coretemp kvm_intel kvm irqbypass mlx5_ib ib_uverbs ib_core crct10dif_pclmul crc32_pc
      lmul crc32c_intel ghash_clmulni_intel mlx5_core intel_cstate intel_uncore iTCO_wdt igb iTCO_vendor_support mlxfw mei_me ptp ses intel_rapl_perf mei pcspkr ipmi
      _ssif i2c_i801 joydev enclosure pps_core lpc_ich ioatdma wmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_vram_helpe
      r ttm drm_kms_helper drm mpt3sas raid_class scsi_transport_sas
      [  213.112326] CR2: 0000000000000010
      [  213.117429] ---[ end trace adb58eb0a4ee6283 ]---
      
      Verify that q pointer is not NULL before setting the 'flags' field.
      
      Fixes: 3f05e6886a59 ("net_sched: unset TCQ_F_CAN_BYPASS when adding filters")
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Sasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      603dbbd8
    • C
      net_sched: unset TCQ_F_CAN_BYPASS when adding filters · b8620edc
      Cong Wang 提交于
      [ Upstream commit 3f05e6886a595c9a29a309c52f45326be917823c ]
      
      For qdisc's that support TC filters and set TCQ_F_CAN_BYPASS,
      notably fq_codel, it makes no sense to let packets bypass the TC
      filters we setup in any scenario, otherwise our packets steering
      policy could not be enforced.
      
      This can be reproduced easily with the following script:
      
       ip li add dev dummy0 type dummy
       ifconfig dummy0 up
       tc qd add dev dummy0 root fq_codel
       tc filter add dev dummy0 parent 8001: protocol arp basic action mirred egress redirect dev lo
       tc filter add dev dummy0 parent 8001: protocol ip basic action mirred egress redirect dev lo
       ping -I dummy0 192.168.112.1
      
      Without this patch, packets are sent directly to dummy0 without
      hitting any of the filters. With this patch, packets are redirected
      to loopback as expected.
      
      This fix is not perfect, it only unsets the flag but does not set it back
      because we have to save the information somewhere in the qdisc if we
      really want that. Note, both fq_codel and sfq clear this flag in their
      ->bind_tcf() but this is clearly not sufficient when we don't use any
      class ID.
      
      Fixes: 23624935 ("net_sched: TCQ_F_CAN_BYPASS generalization")
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      b8620edc
    • C
      net_sched: refetch skb protocol for each filter · 805ba04f
      Cong Wang 提交于
      [ Upstream commit cd0c4e70fc0ccfa705cdf55efb27519ce9337a26 ]
      
      Martin reported a set of filters don't work after changing
      from reclassify to continue. Looking into the code, it
      looks like skb protocol is not always fetched for each
      iteration of the filters. But, as demonstrated by Martin,
      TC actions could modify skb->protocol, for example act_vlan,
      this means we have to refetch skb protocol in each iteration,
      rather than using the one we fetch in the beginning of the loop.
      
      This bug is _not_ introduced by commit 3b3ae880
      ("net: sched: consolidate tc_classify{,_compat}"), technically,
      if act_vlan is the only action that modifies skb protocol, then
      it is commit c7e2b968 ("sched: introduce vlan action") which
      introduced this bug.
      Reported-by: NMartin Olsson <martin.olsson+netdev@sentorsecurity.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      805ba04f
  2. 16 10月, 2018 1 次提交
    • D
      net/sched: cls_api: add missing validation of netlink attributes · e331473f
      Davide Caratti 提交于
      Similarly to what has been done in 8b4c3cdd ("net: sched: Add policy
      validation for tc attributes"), fix classifier code to add validation of
      TCA_CHAIN and TCA_KIND netlink attributes.
      
      tested with:
       # ./tdc.py -c filter
      
      v2: Let sch_api and cls_api share nla_policy they have in common, thanks
          to David Ahern.
      v3: Avoid EXPORT_SYMBOL(), as validation of those attributes is not done
          by TC modules, thanks to Cong Wang.
          While at it, restore the 'Delete / get qdisc' comment to its orginal
          position, just above tc_get_qdisc() function prototype.
      
      Fixes: 5bc17018 ("net: sched: introduce multichain support for filters")
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e331473f
  3. 14 9月, 2018 1 次提交
  4. 28 8月, 2018 2 次提交
  5. 12 8月, 2018 1 次提交
  6. 10 8月, 2018 1 次提交
  7. 04 8月, 2018 1 次提交
  8. 02 8月, 2018 3 次提交
  9. 28 7月, 2018 1 次提交
  10. 27 7月, 2018 2 次提交
  11. 24 7月, 2018 4 次提交
  12. 22 7月, 2018 1 次提交
  13. 19 7月, 2018 1 次提交
  14. 14 7月, 2018 1 次提交
  15. 08 7月, 2018 3 次提交
  16. 26 6月, 2018 2 次提交
  17. 07 6月, 2018 1 次提交
  18. 05 6月, 2018 2 次提交
  19. 01 6月, 2018 1 次提交
    • V
      net: sched: split tc_ctl_tfilter into three handlers · c431f89b
      Vlad Buslov 提交于
      tc_ctl_tfilter handles three netlink message types: RTM_NEWTFILTER,
      RTM_DELTFILTER, RTM_GETTFILTER. However, implementation of this function
      involves a lot of branching on specific message type because most of the
      code is message-specific. This significantly complicates adding new
      functionality and doesn't provide much benefit of code reuse.
      
      Split tc_ctl_tfilter to three standalone functions that handle filter new,
      delete and get requests.
      
      The only truly protocol independent part of tc_ctl_tfilter is code that
      looks up queue, class, and block. Refactor this code to standalone
      tcf_block_find function that is used by all three new handlers.
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c431f89b
  20. 25 5月, 2018 2 次提交
  21. 12 5月, 2018 1 次提交
  22. 28 3月, 2018 1 次提交
  23. 10 3月, 2018 1 次提交
  24. 28 2月, 2018 1 次提交
  25. 21 2月, 2018 1 次提交
    • R
      net: sched: report if filter is too large to dump · 5ae437ad
      Roman Kapl 提交于
      So far, if the filter was too large to fit in the allocated skb, the
      kernel did not return any error and stopped dumping. Modify the dumper
      so that it returns -EMSGSIZE when a filter fails to dump and it is the
      first filter in the skb. If we are not first, we will get a next chance
      with more room.
      
      I understand this is pretty near to being an API change, but the
      original design (silent truncation) can be considered a bug.
      
      Note: The error case can happen pretty easily if you create a filter
      with 32 actions and have 4kb pages. Also recent versions of iproute try
      to be clever with their buffer allocation size, which in turn leads to
      Signed-off-by: NRoman Kapl <code@rkapl.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ae437ad
  26. 17 2月, 2018 1 次提交