- 20 8月, 2016 1 次提交
-
-
由 Herbert Xu 提交于
The commit 8f6fd83c ("rhashtable: accept GFP flags in rhashtable_walk_init") added a GFP flag argument to rhashtable_walk_init because some users wish to use the walker in an unsleepable context. In fact we don't need to allocate memory in rhashtable_walk_init at all. The walker is always paired with an iterator so we could just stash ourselves there. This patch does that by introducing a new enter function to replace the existing init function. This way we don't have to churn all the existing users again. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 8月, 2016 39 次提交
-
-
由 David S. Miller 提交于
Hariprasad Shenai says: ==================== crypto/chcr: Add support for Chelsio Crypto Driver This patch series adds support for Chelsio Crypto driver. The patch series has been created against net-next tree and includes patches for Chelsio Low Level Driver(cxgb4) and adds the new crypto Upper Layer Driver(chcr) under a new directory drivers/crypto/chelsio. Patch 1/4 ("cxgb4: Add support for dynamic allocation of resources for ULD") adds support for dynamic allocation of resources for ULD. The objective of this patch is to provide generic interface for upper layer drivers to allocate and initialize hardware resources. The present cxgb4 (network driver) apart from network functionality, also initializes hardware and thus acts as lower layer driver for other drivers to use hardware resources. Thus it acts as both a Low level driver for Upper layer driver's like iw_cxgb4, cxgb4i and cxgb4it and a Network Driver. Right now the allocation of resources for Upper layer driver's is done statically. Patch 1/4 adds a new infrastructure for dynamic allocation of resources. cxgb4 will read the hardware capability through firmware and allocate/free the queues for Upper layer drivers when the respective driver's are loaded and freed when unloaded. Patch 2/3, 3/4 and 4/4 adds support for Chelsio Crypto Driver. The Crypto driver will act as another ULD on top of cxgb4. In this patch series, the ULD API framework is used only by crypto and other ULD's will make use of it in the next series. This patch series is only for review, if this looks ok we will test it thoroughly and send request for merge. We have included all the maintainers of respective drivers. Kindly review the changes and provide feedback on the same. V3: - Removed crypto queues from cxgb4 and added support for dynamic allocation of resources for Upper layer drivers - Dependency fix in Kconfig. V2: - Some residual code cleanup - Adds pr_fmt with chcr (KBUILD_MODNAME) added - Changes var name to accomodate them <80 columns in the chcr_register_alg - Support for printing the crypto queue stats - Fix compile warnings reported by kbuild bot for certain architectures - Dependency fix in Kconfig. - If the request has the MAY_BACKLOG bit set and hardware queue is full the request is queued up else -EBUSY is returned to throttle the user. The queue when executed and processed returns -EINPROGRESS in completion. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
Adds the config entry for the Chelsio Crypto Driver, Makefile changes for the same. Signed-off-by: NAtul Gupta <atul.gupta@chelsio.com> Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
The Chelsio's Crypto Hardware can perform the following operations: SHA1, SHA224, SHA256, SHA384 and SHA512, HMAC(SHA1), HMAC(SHA224), HMAC(SHA256), HMAC(SHA384), HAMC(SHA512), AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-XTS, AES-256-XTS This patch implements the driver for above mentioned features. This driver is an Upper Layer Driver which is attached to Chelsio's LLD (cxgb4) and uses the queue allocated by the LLD for sending the crypto requests to the Hardware and receiving the responses from it. The crypto operations can be performed by Chelsio's hardware from the userspace applications and/or from within the kernel space using the kernel's crypto API. The above mentioned crypto features have been tested using kernel's tests mentioned in testmgr.h. They also have been tested from user space using libkcapi and Openssl. Signed-off-by: NAtul Gupta <atul.gupta@chelsio.com> Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
Signed-off-by: NAtul Gupta <atul.gupta@chelsio.com> Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
Add a new commmon infrastructure to allocate reosurces dynamically to Upper layer driver's(ULD) when they register with cxgb4 driver and free them during unregistering. All the queues and the interrupts for them will be allocated during ULD probe only and freed during remove. Signed-off-by: NAtul Gupta <atul.gupta@chelsio.com> Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 LABBE Corentin 提交于
The data member of structure firmware is const and this constness is dropped by some cast. This patch add some const for keeping the const information. Signed-off-by: NLABBE Corentin <clabbe.montjoie@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Daniel Borkmann says: ==================== BPF helper improvements and cleanups This set adds various improvements to BPF helpers, a cleanup to use skb_pkt_type_ok() helper, addition of bpf_skb_change_tail(), a follow up for event output helper and removing ifdefs around the cgroupv2 helper bits. For details please see individual patches. The set is based against net-next tree, but requires a merge of net into net-next first. Thanks a lot! ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
As recently discussed during the task_under_cgroup_hierarchy() addition, we should get rid of the ifdefs surrounding the bpf_skb_under_cgroup() helper. If related functionality is not built-in, the helper cannot be used anyway, which is also in line with what we do for all other helpers. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Follow-up to 555c8a86 ("bpf: avoid stack copy and use skb ctx for event output") for also adding the event output helper for XDP typed programs. The event output helper has been very useful in particular for debugging or event notification purposes, since it's much faster and flexible than regular trace printk due to programmatically being able to attach meta data. Same flags structure applies as with tc BPF programs. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This work adds a bpf_skb_change_tail() helper for tc BPF programs. The basic idea is to expand or shrink the skb in a controlled manner. The eBPF program can then rewrite the rest via helpers like bpf_skb_store_bytes(), bpf_lX_csum_replace() and others rather than passing a raw buffer for writing here. bpf_skb_change_tail() is really a slow path helper and intended for replies with f.e. ICMP control messages. Concept is similar to other helpers like bpf_skb_change_proto() helper to keep the helper without protocol specifics and let the BPF program mangle the remaining parts. A flags field has been added and is reserved for now should we extend the helper in future. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Since we have a skb_pkt_type_ok() helper for checking the type before mangling, make use of it instead of open coding. Follow-up to commit 8b10cab6 ("net: simplify and make pkt_type_ok() available for other users") that came in after d2485c42 ("bpf: add bpf_skb_change_type helper"). Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Add TIPC_NL_PEER_REMOVE netlink command. This command can remove an offline peer node from the internal data structures. This will be supported by the tipc user space tool in iproute2. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Over the years, TCP BDP has increased a lot, and is typically in the order of ~10 Mbytes with help of clever Congestion Control modules. In presence of packet losses, TCP stores incoming packets into an out of order queue, and number of skbs sitting there waiting for the missing packets to be received can match the BDP (~10 Mbytes) In some cases, TCP needs to make room for incoming skbs, and current strategy can simply remove all skbs in the out of order queue as a last resort, incurring a huge penalty, both for receiver and sender. Unfortunately these 'last resort events' are quite frequent, forcing sender to send all packets again, stalling the flow and wasting a lot of resources. This patch cleans only a part of the out of order queue in order to meet the memory constraints. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: C. Stephen Gun <csg@google.com> Cc: Van Jacobson <vanj@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rafał Miłecki 提交于
It doesn't really change anything as BGMAC_CHIPCTL_1_IF_TYPE_RMII is equal to 0. It make code a bit clener, so far when reading it one could think we forgot to set a proper mode. It also keeps this mode code in sync with other ones. Signed-off-by: NRafał Miłecki <rafal@milecki.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rafał Miłecki 提交于
BCM53573 is a new series of Broadcom's SoCs. It's based on ARM and can be found in two packages (versions): BCM53573 and BCM47189. It shares some code with the Northstar family, but also requires some new quirks. First of all there can be up to 2 Ethernet cores on this SoC. If that is the case, they are connected to two different switch ports allowing some more complex/optimized setups. It seems the second unit doesn't come fully configured and requires some IRQ quirk. Other than that only the first core is connected to the PHY. For the second one we have to register fixed PHY (similarly to the Northstar), otherwise generic PHY driver would get some invalid info. This has been successfully tested on Tenda AC9 (BCM47189B0). Signed-off-by: NRafał Miłecki <rafal@milecki.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Colin Ian King 提交于
trivial fix to spelling mistake in dev_err message Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paul Durrant 提交于
It is useful to be able to see the hash configuration when running tests. This patch adds a debugfs node for that purpose. Signed-off-by: NPaul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: NWei Liu <wei.liu2@citrix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
The division is already being done properly in efx_ef10_get_timer_config which returns zero-on-success, unlike the old efx_ef10_get_sysclk_freq. Fixes: d95e329a ("sfc: get timer configuration from adapter") Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
While chasing tcp_xmit_retransmit_queue() kasan issue, I found that we could avoid reading sacked field of skb that we wont send, possibly removing one cache line miss. Very minor change in slow path, but why not ? ;) Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Nikolay Aleksandrov says: ==================== net: bridge: export vlan stats per-port with flags This set adds the ability to export vlan stats per-port. Patch 01 makes that possible by consolidating the bridge and port linkxstats calls. Then patch 02 allows to dump the vlan entry flags in order to be able to distinguish between bridge and port vlan entries when dumping the master device vlan stats. That is needed because that call was implemented when the stats API didn't have slave dumping capabilities and it dumps all vlan stats (for both bridge and port entries). We also need it in order to print the vlan flags when dumping the stats. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
Use one of the vlan xstats padding fields to export the vlan flags. This is needed in order to be able to distinguish between master (bridge) and port vlan entries in user-space when dumping the bridge vlan stats. Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
In the bridge driver we usually have the same function working for both port and bridge. In order to follow that logic and also avoid code duplication, consolidate the bridge_ and brport_ linkxstats calls into one since they share most of their code. As a side effect this allows us to dump the vlan stats also via the slave call which is in preparation for the upcoming per-port vlan stats and vlan flag dumping. Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Hadar Hen Zion says: ==================== net_sched, flow_dissector, flower: Introduce vlan tag support This patchset introduce vlan tag support to the flower classifier and the flow dissector. In addition to adding vlan priority to act vlan. The first 2 patches are dealing with flow-dissector: - The first patch is a fix, in case the vlan was already stripped from the skb, take it from skb->vlan_tci. - The second patch adds support for vlan priority. The next 2 patches are dealing with flower: - The first patch is a fix, sets flow dissector 'used_keys' according to the mask value of each key. - The secound patch adds vlan tag support to the flower classifier, user space patches will be sent later to complete it. The last patch adds vlan priority to act vlan since only vlan id is currently supported. Changes from V1: - A new patch was added to this series "net_sched: flower: Avoid dissection of unmasked keys" - Adding u16 padding to struct flow_dissector_key_vlan - change flow_label field in struct flow_dissector_key_tags form 20 bits field to u32 - Remove 'if (v->tcfv_push_prio)' check from tcf_vlan_dump function - Add support to un-stripped vlan skb and skb with multipale vlans in __skb_flow_dissect ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hadar Hen Zion 提交于
The current vlan push action supports only vid and protocol options. Add priority option. Example script that adds vlan push action with vid and priority: tc filter add dev veth0 protocol ip parent ffff: \ flower \ indev veth0 \ action vlan push id 100 priority 5 Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hadar Hen Zion 提交于
Enhance flower to support 802.1Q vlan protocol classification. Currently, the supported fields are vlan_id and vlan_priority. Example: # add a flower filter with vlan id and priority classification tc filter add dev ens4f0 protocol 802.1Q parent ffff: \ flower \ indev ens4f0 \ vlan_ethtype ipv4 \ vlan_id 100 \ vlan_prio 3 \ action vlan pop Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hadar Hen Zion 提交于
The current flower implementation checks the mask range and set all the keys included in that range as "used_keys", even if a specific key in the range has a zero mask. This behavior can cause a false positive return value of dissector_uses_key function and unnecessary dissection in __skb_flow_dissect. This patch checks explicitly the mask of each key and "used_keys" will be set accordingly. Fixes: 77b9900e ('tc: introduce Flower classifier') Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hadar Hen Zion 提交于
Add vlan priority check to the flow dissector by adding new flow dissector struct, flow_dissector_key_vlan which includes vlan tag fields. vlan_id and flow_label fields were under the same struct (flow_dissector_key_tags). It was a convenient setting since struct flow_dissector_key_tags is used by struct flow_keys and by setting vlan_id and flow_label under the same struct, we get precisely 24 or 48 bytes in flow_keys from flow_dissector_key_basic. Now, when adding vlan priority support, the code will be cleaner if flow_label and vlan tag won't be under the same struct anymore. Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hadar Hen Zion 提交于
Early in the datapath skb_vlan_untag function is called, stripped the vlan from the skb and set skb->vlan_tci and skb->vlan_proto fields. The current dissection doesn't handle stripped vlan packets correctly. In some flows, vlan doesn't exist in skb->data anymore when applying flow dissection on the skb, fix that. In case vlan info wasn't stripped before applying flow_dissector (RPS flow for example), or in case of skb with multiple vlans (e.g. 802.1ad), get the vlan info from skb->data. The flow_dissector correctly skips any number of vlans and stores only the first level vlan. Fixes: 0744dd00 ('net: introduce skb_flow_dissect()') Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Yuval Mintz says: ==================== qed*: Fix ethtool issues relating to link This series addresses two issues that were introduced when adding support for ethtool's link_ksettings support - the first fixes a regression and second incorrect functionallity in the submission. Although these are fixes, as the feature currently exists only in 'next-next' I'm aiming them for it. Dave, please consider applying this series to 'net-next'. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
While '0xdead' and '0xbeef' are "great" values, we should use the correct SPEED_* values instead. Fixes: 054c67d1 ("qed*: Add support for ethtool link_ksettings callbacks") Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuval Mintz 提交于
When moving into using ethtool's link_ksetting, qed started supplying its own bitmask of speed/capabilities, but qede is still checking for the SUPPORTED value to determine whether it supports pause. Fixes: 054c67d1 ("qed*: Add support for ethtool link_ksettings callbacks") Signed-off-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue由 David S. Miller 提交于
Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2016-08-18 This series contains updates to igb only. Gangfeng Huang provides all the changes in the series to update the igb driver to support advanced receive side filters that direct receive packets by flows to different hardware queues. This enables a tight control on routing a flow in the platform. First patch allows for receive network flow classification to insert and remove receive filters by ethtool. Second and third patches add the ability to insert and remove ethertype and VLAN priority filters by ethtool. Last patch just fixes an error message to return "Not supported" versus "Unknown error 524". ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gangfeng Huang 提交于
Use error "rmgr: Cannot insert RX class rule: Operation not supported" is more meaningful than "rmgr: Cannot insert RX class rule: Unknown error 524" Signed-off-by: NGangfeng Huang <gangfeng.huang@ni.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Gangfeng Huang 提交于
This patch is meant to allow for RX network flow classification to insert and remove VLAN priority filter by ethtool Example: Add an VLAN priority filter: $ ethtool -N eth0 flow-type ether vlan 0x6000 vlan-mask 0x1FFF action 2 loc 1 Show all filters: $ ethtool -n eth0 4 RX rings available Total 1 rules Filter: 1 Flow Type: Raw Ethernet Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF Ethertype: 0x0 mask: 0xFFFF VLAN EtherType: 0x0 mask: 0xffff VLAN: 0x6000 mask: 0x1fff User-defined: 0x0 mask: 0xffffffffffffffff Action: Direct to queue 2 Delete the filter by location: $ ethtool -N delete 1 Signed-off-by: NRuhao Gao <ruhao.gao@ni.com> Signed-off-by: NGangfeng Huang <gangfeng.huang@ni.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Gangfeng Huang 提交于
This patch is meant to allow for RX network flow classification to insert and remove ethertype filter by ethtool Example: Add an ethertype filter: $ ethtool -N eth0 flow-type ether proto 0x88F8 action 2 Show all filters: $ ethtool -n eth0 4 RX rings available Total 1 rules Filter: 15 Flow Type: Raw Ethernet Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF Ethertype: 0x88F8 mask: 0x0 Action: Direct to queue 2 Delete the filter by location: $ ethtool -N delete 15 Signed-off-by: NRuhao Gao <ruhao.gao@ni.com> Signed-off-by: NGangfeng Huang <gangfeng.huang@ni.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Gangfeng Huang 提交于
This patch is meant to allow for RX network flow classification to insert and remove Rx filter by ethtool. Ethtool interface has it's own rules manager Show all filters: $ ethtool -n eth0 4 RX rings available Total 2 rules Signed-off-by: NRuhao Gao <ruhao.gao@ni.com> Signed-off-by: NGangfeng Huang <gangfeng.huang@ni.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 David S. Miller 提交于
Jiri Kosina says: ==================== qdisc-hashtable fixes The following two patches fix all the issues that have been reported against the conversion of qdisc linked list to hashtable (currently in net-next) so far. First patch adjusts handling of singleton qdiscs to the new semantics, and is rather straightforward. The second patch, which fixes "cosmetic" issue of duplicate entries in the qdisc dump for ingress qdiscs, is a little bit more hairy; I personally would love to see all the already existing "if (ingress)"-like hacks go away (by, let's say, introducing a general TCQ_F_? flag), but that's way out of scope of this patchset (but already on my todo). Thanks a lot to Daniel Borkmann and David Ahern for reporting the issues and testing the patches promptly. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Kosina 提交于
tc_dump_qdisc() performs dumping of the per-device qdiscs in two phases; first, the "standard" dev->qdisc is being dumped. Second, if there is/are ingress queue(s), they are being dumped as well. After conversion of netdevice's qdisc linked-list into hashtable, these two sets are not in two disjunctive sets/lists any more, but are both "reachable" directly from netdevice's hashtable. As a consequence, the "full-depth" dump of the ingress qdiscs results in immediately hitting the netdevice hashtable again, and duplicating the dump that has already been performed for dev->qdisc. What in fact needs to be dumped in case of ingress queue is "just" the top-level ingress qdisc, as everything else has been dumped already. Fix this by extending tc_dump_qdisc_root() in a way that it can be instructed whether it should (while performing the "full" per-netdev qdisc dump) perform the whole recursion, or just dump "additional" top-level (ingress) qdiscs without performing any kind of recursion. This fixes duplicate dumps such as qdisc mq 0: root qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc clsact ffff: parent ffff:fff1 qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Fixes: 59cc1f61 ("net: sched: convert qdisc linked list to hashtable") Reported-by: NDaniel Borkmann <daniel@iogearbox.net> Tested-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Kosina 提交于
qdisc_match_from_root() is now iterating over per-netdevice qdisc hashtable instead of going through a linked-list of qdiscs (independently on the actual underlying netdev), which was the case before the switch to hashtable for qdiscs. For singleton qdiscs, there is no underlying netdev associated though, and therefore dumping a singleton qdisc will panic, as qdisc_dev(root) will always be NULL. BUG: unable to handle kernel NULL pointer dereference at 0000000000000410 IP: [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70 PGD 1aceba067 PUD 1aceb7067 PMD 0 Oops: 0000 [#1] PREEMPT SMP [ ... ] task: ffff8801ec996e00 task.stack: ffff8801ec934000 RIP: 0010:[<ffffffff8167efac>] [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70 RSP: 0018:ffff8801ec937ab0 EFLAGS: 00010203 RAX: 0000000000000408 RBX: ffff88025e612000 RCX: ffffffffffffffd8 RDX: 0000000000000000 RSI: 00000000ffff0000 RDI: ffffffff81cf8100 RBP: ffff8801ec937ab0 R08: 000000000001c160 R09: ffff8802668032c0 R10: ffffffff81cf8100 R11: 0000000000000030 R12: 00000000ffff0000 R13: ffff88025e612000 R14: ffffffff81cf3140 R15: 0000000000000000 FS: 00007f24b9af6740(0000) GS:ffff88026f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000410 CR3: 00000001aceec000 CR4: 00000000001406e0 Stack: ffff8801ec937ad0 ffffffff81681210 ffff88025dd51a00 00000000fffffff1 ffff8801ec937b88 ffffffff81681e4e ffffffff81c42bc0 ffff880262431500 ffffffff81cf3140 ffff88025dd51a10 ffff88025dd51a24 00000000ec937b38 Call Trace: [<ffffffff81681210>] qdisc_lookup+0x40/0x50 [<ffffffff81681e4e>] tc_modify_qdisc+0x21e/0x550 [<ffffffff8166ae25>] rtnetlink_rcv_msg+0x95/0x220 [<ffffffff81209602>] ? __kmalloc_track_caller+0x172/0x230 [<ffffffff8166ad90>] ? rtnl_newlink+0x870/0x870 [<ffffffff816897b7>] netlink_rcv_skb+0xa7/0xc0 [<ffffffff816657c8>] rtnetlink_rcv+0x28/0x30 [<ffffffff8168919b>] netlink_unicast+0x15b/0x210 [<ffffffff81689569>] netlink_sendmsg+0x319/0x390 [<ffffffff816379f8>] sock_sendmsg+0x38/0x50 [<ffffffff81638296>] ___sys_sendmsg+0x256/0x260 [<ffffffff811b1275>] ? __pagevec_lru_add_fn+0x135/0x280 [<ffffffff811b1a90>] ? pagevec_lru_move_fn+0xd0/0xf0 [<ffffffff811b1140>] ? trace_event_raw_event_mm_lru_insertion+0x180/0x180 [<ffffffff811b1b85>] ? __lru_cache_add+0x75/0xb0 [<ffffffff817708a6>] ? _raw_spin_unlock+0x16/0x40 [<ffffffff811d8dff>] ? handle_mm_fault+0x39f/0x1160 [<ffffffff81638b15>] __sys_sendmsg+0x45/0x80 [<ffffffff81638b62>] SyS_sendmsg+0x12/0x20 [<ffffffff810038e7>] do_syscall_64+0x57/0xb0 Fix this by special-casing singleton qdiscs (those that don't have underlying netdevice) and introduce immediate handling of those rather than trying to go over an underlying netdevice. We're in the same situation in tc_dump_qdisc_root() and tc_dump_tclass_root(). Ultimately, this will have to be slightly reworked so that we are actually able to show singleton qdiscs (noop) in the dump properly; but we're not currently doing that anyway, so no regression there, and better do this in a gradual manner. Fixes: 59cc1f61 ("net: sched: convert qdisc linked list to hashtable") Reported-by: NDaniel Borkmann <daniel@iogearbox.net> Tested-by: NDaniel Borkmann <daniel@iogearbox.net> Reported-by: NDavid Ahern <dsa@cumulusnetworks.com> Tested-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-