- 07 2月, 2018 3 次提交
-
-
由 Matthew Wilcox 提交于
Use the new helper. Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
-
由 Matthew Wilcox 提交于
Simply changing idr_remove's 'id' argument to 'unsigned long' works for all callers. Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
-
由 Matthew Wilcox 提交于
Simply changing idr_remove's 'id' argument to 'unsigned long' suffices for all callers. Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
-
- 25 1月, 2018 1 次提交
-
-
由 Jakub Kicinski 提交于
Propagate extack to cls->destroy callbacks when called from non-error paths. On error paths pass NULL to avoid overwriting the failure message. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: NSimon Horman <simon.horman@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 1月, 2018 4 次提交
-
-
由 Alexander Aring 提交于
This patch adds extack support for classifier delete callback api. This prepares to handle extack support inside each specific classifier implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
The tcf_exts_validate function calls the act api change callback. For preparing extack support for act api, this patch adds the extack as parameter for this function which is common used in cls implementations. Furthermore the tcf_exts_validate will call action init callback which prepares the TC action subsystem for extack support. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for classifier change callback api. This prepares to handle extack support inside each specific classifier implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for generic cls handling. The extack will be set deeper to each called function which is not part of netdev core api. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 1月, 2018 8 次提交
-
-
由 Jiri Pirko 提交于
When tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK, parent is still passed down but the value is never used. Compiler does not recognize it and issues a warning. Silence it down initializing parent to 0. Fixes: 7960d1da ("net: sched: use block index as a handle instead of qdisc when block is shared") Reported-by: NDavid Miller <davem@davemloft.net> Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
As the tcm_ifindex with value TCM_IFINDEX_MAGIC_BLOCK is invalid ifindex, use it to indicate that we work with block, instead of qdisc. So if tcm_ifindex is set to TCM_IFINDEX_MAGIC_BLOCK, tcm_parent is used to carry block_index. If the block is set to be shared between at least 2 qdiscs, it is forbidden to use the qdisc handle to add/delete filters. In that case, userspace has to pass block_index. Also, for dump of the filters, in case the block is shared in between at least 2 qdiscs, the each filter is dumped with tcm_ifindex value TCM_IFINDEX_MAGIC_BLOCK and tcm_parent set to block_index. That gives the user clear indication, that the filter belongs to a shared block and not only to one qdisc under which it is dumped. Suggested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
During block bind, we need to check tc offload feature. If it is disabled yet still the block contains offloaded filters, forbid the bind. Also forbid to register callback for a block that already contains offloaded filters, as the play back is not supported now. For keeping track of offloaded filters there is a new counter introduced, alongside with couple of helpers called from cls_* code. These helpers set and clear TCA_CLS_FLAGS_IN_HW flag. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Both are no longer used, so remove them. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Couple of classifiers call netif_keep_dst directly on q->dev. That is not possible to do directly for shared blocke where multiple qdiscs are owning the block. So introduce a infrastructure to keep track of the block owners in list and use this list to implement block variant of netif_keep_dst. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Use block index in the messages instead. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Allow qdiscs to share filter blocks among them. Each qdisc type has to use block get/put extended modifications that enable sharing. Shared blocks are tracked within each net namespace and identified by u32 index. This index is passed from user during the qdisc creation. If user passes index that is not used by any other qdisc, new block is created. If user passes index that is already used, the existing block will be re-used. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
So far, there was possible only to register a single filter chain pointer to block->chain[0]. However, when the blocks will get shareable, we need to allow multiple filter chain pointers registration. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 12月, 2017 1 次提交
-
-
由 Jiri Pirko 提交于
We need to check block for being null in both tcf_block_put and tcf_block_put_ext. Fixes: 343723dd ("net: sched: fix clsact init error path") Reported-by: NPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 12月, 2017 2 次提交
-
-
由 Alexander Aring 提交于
This patch adds extack support for the function tcf_block_get which is a common used function in the tc subsystem. Callers which are interested in the receiving error can assign extack to get a more detailed information why tcf_block_get failed. Cc: David Ahern <dsahern@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for block callback to prepare per-qdisc specific changes for extack. Cc: David Ahern <dsahern@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 12月, 2017 1 次提交
-
-
由 Jiri Pirko 提交于
Since in qdisc_create, the destroy op is called when init fails, we don't do cleanup in init and leave it up to destroy. This fixes use-after-free when trying to put already freed block. Fixes: 6e40cf2d ("net: sched: use extended variants of block_get/put in ingress and clsact qdiscs") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 12月, 2017 1 次提交
-
-
由 Pravin Shedge 提交于
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: NPravin Shedge <pravin.shedge4linux@gmail.com> Acked-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 12月, 2017 1 次提交
-
-
由 Jiri Pirko 提交于
Since the block is freed with last chain being put, once we reach the end of iteration of list_for_each_entry_safe, the block may be already freed. I'm hitting this only by creating and deleting clsact: [ 202.171952] ================================================================== [ 202.180182] BUG: KASAN: use-after-free in tcf_block_put_ext+0x240/0x390 [ 202.187590] Read of size 8 at addr ffff880225539a80 by task tc/796 [ 202.194508] [ 202.196185] CPU: 0 PID: 796 Comm: tc Not tainted 4.15.0-rc2jiri+ #5 [ 202.203200] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016 [ 202.213613] Call Trace: [ 202.216369] dump_stack+0xda/0x169 [ 202.220192] ? dma_virt_map_sg+0x147/0x147 [ 202.224790] ? show_regs_print_info+0x54/0x54 [ 202.229691] ? tcf_chain_destroy+0x1dc/0x250 [ 202.234494] print_address_description+0x83/0x3d0 [ 202.239781] ? tcf_block_put_ext+0x240/0x390 [ 202.244575] kasan_report+0x1ba/0x460 [ 202.248707] ? tcf_block_put_ext+0x240/0x390 [ 202.253518] tcf_block_put_ext+0x240/0x390 [ 202.258117] ? tcf_chain_flush+0x290/0x290 [ 202.262708] ? qdisc_hash_del+0x82/0x1a0 [ 202.267111] ? qdisc_hash_add+0x50/0x50 [ 202.271411] ? __lock_is_held+0x5f/0x1a0 [ 202.275843] clsact_destroy+0x3d/0x80 [sch_ingress] [ 202.281323] qdisc_destroy+0xcb/0x240 [ 202.285445] qdisc_graft+0x216/0x7b0 [ 202.289497] tc_get_qdisc+0x260/0x560 Fix this by holding the block also by chain 0 and put chain 0 explicitly, out of the list_for_each_entry_safe loop at the very end of tcf_block_put_ext. Fixes: efbf7897 ("net_sched: get rid of rcu_barrier() in tcf_block_put_ext()") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 12月, 2017 1 次提交
-
-
由 Cong Wang 提交于
Both Eric and Paolo noticed the rcu_barrier() we use in tcf_block_put_ext() could be a performance bottleneck when we have a lot of tc classes. Paolo provided the following to demonstrate the issue: tc qdisc add dev lo root htb for I in `seq 1 1000`; do tc class add dev lo parent 1: classid 1:$I htb rate 100kbit tc qdisc add dev lo parent 1:$I handle $((I + 1)): htb for J in `seq 1 10`; do tc filter add dev lo parent $((I + 1)): u32 match ip src 1.1.1.$J done done time tc qdisc del dev root real 0m54.764s user 0m0.023s sys 0m0.000s The rcu_barrier() there is to ensure we free the block after all chains are gone, that is, to queue tcf_block_put_final() at the tail of workqueue. We can achieve this ordering requirement by refcnt'ing tcf block instead, that is, the tcf block is freed only when the last chain in this block is gone. This also simplifies the code. Paolo reported after this patch we get: real 0m0.017s user 0m0.000s sys 0m0.017s Tested-by: NPaolo Abeni <pabeni@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 11月, 2017 1 次提交
-
-
由 Roman Kapl 提交于
tcf_block_put_ext has assumed that all filters (and thus their goto actions) are destroyed in RCU callback and thus can not race with our list iteration. However, that is not true during netns cleanup (see tcf_exts_get_net comment). Prevent the user after free by holding all chains (except 0, that one is already held). foreach_safe is not enough in this case. To reproduce, run the following in a netns and then delete the ns: ip link add dtest type dummy tc qdisc add dev dtest ingress tc filter add dev dtest chain 1 parent ffff: handle 1 prio 1 flower action goto chain 2 Fixes: 822e86d9 ("net_sched: remove tcf_block_put_deferred()") Signed-off-by: NRoman Kapl <code@rkapl.cz> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 11月, 2017 1 次提交
-
-
由 Roman Kapl 提交于
If you flush (delete) a filter chain other than chain 0 (such as when deleting the device), the kernel may run into a use-after-free. The chain refcount must not be decremented unless we are sure we are done with the chain. To reproduce the bug, run: ip link add dtest type dummy tc qdisc add dev dtest ingress tc filter add dev dtest chain 1 parent ffff: flower ip link del dtest Introduced in: commit f93e1cdc ("net/sched: fix filter flushing"), but unless you have KAsan or luck, you won't notice it until commit 0dadc117 ("cls_flower: use tcf_exts_get_net() before call_rcu()") Fixes: f93e1cdc ("net/sched: fix filter flushing") Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NRoman Kapl <code@rkapl.cz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 11月, 2017 1 次提交
-
-
由 Cong Wang 提交于
Instead of holding netns refcnt in tc actions, we can minimize the holding time by saving it in struct tcf_exts instead. This means we can just hold netns refcnt right before call_rcu() and release it after tcf_exts_destroy() is done. However, because on netns cleanup path we call tcf_proto_destroy() too, obviously we can not hold netns for a zero refcnt, in this case we have to do cleanup synchronously. It is fine for RCU too, the caller cleanup_net() already waits for a grace period. For other cases, refcnt is non-zero and we can safely grab it as normal and release it after we are done. This patch provides two new API for each filter to use: tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can use the following pattern: void __destroy_filter() { tcf_exts_destroy(); tcf_exts_put_net(); // <== release netns refcnt kfree(); } void some_work() { rtnl_lock(); __destroy_filter(); rtnl_unlock(); } void some_rcu_callback() { tcf_queue_work(some_work); } if (tcf_exts_get_net()) // <== hold netns refcnt call_rcu(some_rcu_callback); else __destroy_filter(); Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 11月, 2017 3 次提交
-
-
由 Jiri Pirko 提交于
Add a callback that is to be called whenever head of the chain changes. Also provide a callback for the default case when the caller gets a block using non-extended getter. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Callers of tcf_block_put() could pass NULL so we can't use block->q before checking if block is NULL or not. tcf_block_put_ext() callers are fine, it is always non-NULL. Fixes: 8c4083b3 ("net: sched: add block bind/unbind notif. and extended block_get/put") Reported-by: NDave Taht <dave.taht@gmail.com> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Currently, the offload unbind is done before the chains are flushed. That causes driver to unregister block callback before it can get all the callback calls done during flush, leaving the offloaded tps inside the HW. So fix the order to prevent this situation and restore the original behaviour. Reported-by: NAlexander Duyck <alexander.duyck@gmail.com> Reported-by: NJakub Kicinski <kubakici@wp.pl> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 11月, 2017 2 次提交
-
-
由 Jiri Pirko 提交于
Since the only user, mlx5 driver does the check in mlx5e_setup_tc_block_cb, no need to check here. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
This restores the original behaviour before the block callbacks were introduced. Allow the drivers to do binding of block always, no matter if the NETIF_F_HW_TC feature is on or off. Move the check to the block callback which is called for rule insertion. Reported-by: NAlexander Duyck <alexander.duyck@gmail.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 10月, 2017 1 次提交
-
-
由 Cong Wang 提交于
In commit 7aa0045d ("net_sched: introduce a workqueue for RCU callbacks of tc filter") I defer tcf_chain_flush() to a workqueue, this causes a use-after-free because qdisc is already destroyed after we queue this work. The tcf_block_put_deferred() is no longer necessary after we get RTNL for each tc filter destroy work, no others could jump in at this point. Same for tcf_chain_hold(), we are fully serialized now. This also reduces one indirection therefore makes the code more readable. Note this brings back a rcu_barrier(), however comparing to the code prior to commit 7aa0045d we still reduced one rcu_barrier(). For net-next, we can consider to refcnt tcf block to avoid it. Fixes: 7aa0045d ("net_sched: introduce a workqueue for RCU callbacks of tc filter") Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2017 2 次提交
-
-
由 Cong Wang 提交于
After previous patches, it is now safe to claim that tcf_exts_destroy() is always called with RTNL lock. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
This patch introduces a dedicated workqueue for tc filters so that each tc filter's RCU callback could defer their action destroy work to this workqueue. The helper tcf_queue_work() is introduced for them to use. Because we hold RTNL lock when calling tcf_block_put(), we can not simply flush works inside it, therefore we have to defer it again to this workqueue and make sure all flying RCU callbacks have already queued their work before this one, in other words, to ensure this is the last one to execute to prevent any use-after-free. On the other hand, this makes tcf_block_put() ugly and harder to understand. Since David and Eric strongly dislike adding synchronize_rcu(), this is probably the only solution that could make everyone happy. Please also see the code comments below. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 10月, 2017 1 次提交
-
-
由 Or Gerlitz 提交于
Prior to commit b3f55bdd, the networking core doesn't wire an in-place actions list the when the low level driver is called to offload the flow, but all low level drivers do that (call tcf_exts_to_list()) in their offloading "add" logic. Now, the in-place list is set in the core which goes over the list in a loop, but also by the hw driver when their offloading code is invoked indirectly: cls_xxx add flow -> tc_setup_cb_call -> tc_exts_setup_cb_egdev_call -> hw driver which messes up the core list instance upon driver return. Fix that by avoiding in-place list on the net core code that deals with adding flows. Fixes: b3f55bdd ('net: sched: introduce per-egress action device callbacks') Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 10月, 2017 3 次提交
-
-
由 Jiri Pirko 提交于
Extend the tc_setup_cb_call entrypoint function originally used only for action egress devices callbacks to call per-block callbacks as well. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Introduce infrastructure that allows drivers to register callbacks that are called whenever tc would offload inserted rule for a specific block. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Introduce new type of ndo_setup_tc message to propage binding/unbinding of a block to driver. Call this ndo whenever qdisc gets/puts a block. Alongside with this, there's need to propagate binder type from qdisc code down to the notifier. So introduce extended variants of block_get/put in order to pass this info. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 10月, 2017 2 次提交
-
-
由 Jiri Pirko 提交于
The callers have this info, they will pass it down to tcf_fill_node. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Store net pointer in the block structure. Along the way, introduce qdisc_net helper which allows to easily obtain net pointer for qdisc instance. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-