- 14 2月, 2017 1 次提交
-
-
由 Tobias Klauser 提交于
garp_port is only used in net/802/garp.c which is only compiled with CONFIG_GARP enabled. Same goes for mrp_port which is only used in net/802/mrp.c with CONFIG_MRP enabled. Only include the two members in struct net_device if their respective CONFIG_* is enabled. This saves a few bytes in struct net_device in case CONFIG_GARP or CONFIG_MRP are not enabled. Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 2月, 2017 1 次提交
-
-
由 Willem de Bruijn 提交于
The stack must not pass packets to device drivers that are shorter than the minimum link layer header length. Previously, packet sockets would drop packets smaller than or equal to dev->hard_header_len, but this has false positives. Zero length payload is used over Ethernet. Other link layer protocols support variable length headers. Support for validation of these protocols removed the min length check for all protocols. Introduce an explicit dev->min_header_len parameter and drop all packets below this value. Initially, set it to non-zero only for Ethernet and loopback. Other protocols can follow in a patch to net-next. Fixes: 9ed988cd ("packet: validate variable length ll headers") Reported-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NWillem de Bruijn <willemb@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 2月, 2017 1 次提交
-
-
由 Ido Schimmel 提交于
In commit 18bfb924 ("net: introduce default neigh_construct/destroy ndo calls for L2 upper devices") we added these ndos to stacked devices such as team and bond, so that calls will be propagated to mlxsw. However, previous commit removed the reliance on these ndos and no new users of these ndos have appeared since above mentioned commit. We can therefore safely remove this dead code. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 2月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
All __napi_complete() callers have been converted to use the more standard napi_complete_done(), we can now remove this NAPI method for good. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
We added generic support for busy polling in NAPI layer in linux-4.5 No network driver uses ndo_busy_poll() anymore, we can get rid of the pointer in struct net_device_ops, and its use in sk_busy_loop() Saves NETIF_F_BUSY_POLL features bit. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 2月, 2017 1 次提交
-
-
由 Dimitris Michailidis 提交于
Commit cdba756f ("net: move ndo_features_check() close to ndo_start_xmit()") inadvertently moved the doc comment for .ndo_fix_features instead of .ndo_features_check. Fix the comment ordering. Fixes: cdba756f ("net: move ndo_features_check() close to ndo_start_xmit()") Signed-off-by: NDimitris Michailidis <dmichail@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 1月, 2017 1 次提交
-
-
由 Edward Cree 提交于
For reporting things that may or may not be serious, depending on some condition, netif_cond_dbg will check the condition and print the report at either dbg (if the condition is true) or the specified level. Suggested-by: NJon Cooper <jcooper@solarflare.com> Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 1月, 2017 1 次提交
-
-
由 Florian Fainelli 提交于
Commit 16e5cc64 ("net: rework setup_tc ndo op to consume general tc operand") changed the ndo_setup_tc() signature, but did not update the comments in netdevice.h, so do that now. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Acked-by: NJohn Fastabend <john.r.fastabend@intel.com> Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 1月, 2017 1 次提交
-
-
由 Tobias Klauser 提交于
The network stack no longer uses the last_rx member of struct net_device since the bonding driver switched to use its own private last_rx in commit 9f242738 ("bonding: use last_arp_rx in slave_last_rx()"). However, some drivers still (ab)use the field for their own purposes and some driver just update it without actually using it. Previously, there was an accompanying comment for the last_rx member added in commit 4dc89133 ("net: add a comment on netdev->last_rx") which asked drivers not to update is, unless really needed. However, this commend was removed in commit f8ff080d ("bonding: remove useless updating of slave->dev->last_rx"), so some drivers added later on still did update last_rx. Remove all usage of last_rx and switch three drivers (sky2, atp and smc91c92_cs) which actually read and write it to use their own private copy in netdev_priv. Compile-tested with allyesconfig and allmodconfig on x86 and arm. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Mirko Lindner <mlindner@marvell.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Acked-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NJay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 1月, 2017 1 次提交
-
-
由 Florian Fainelli 提交于
netif_wake_subqueue() is duplicating the same thing that netif_tx_wake_queue() does, so make it call it directly after looking up the queue from the index. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 1月, 2017 1 次提交
-
-
由 Herbert Xu 提交于
The GRO fast path caches the frag0 address. This address becomes invalid if frag0 is modified by pskb_may_pull or its variants. So whenever that happens we must disable the frag0 optimization. This is usually done through the combination of gro_header_hard and gro_header_slow, however, the IPv6 extension header path did the pulling directly and would continue to use the GRO fast path incorrectly. This patch fixes it by disabling the fast path when we enter the IPv6 extension header path. Fixes: 78a478d0 ("gro: Inline skb_gro_header and cache frag0 virtual address") Reported-by: NSlava Shwartsman <slavash@mellanox.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 1月, 2017 1 次提交
-
-
由 stephen hemminger 提交于
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: NStephen Hemminger <sthemmin@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 12月, 2016 1 次提交
-
-
由 Matthias Tafelmeier 提交于
Oftenly, introducing side effects on packet processing on the other half of the stack by adjusting one of TX/RX via sysctl is not desirable. There are cases of demand for asymmetric, orthogonal configurability. This holds true especially for nodes where RPS for RFS usage on top is configured and therefore use the 'old dev_weight'. This is quite a common base configuration setup nowadays, even with NICs of superior processing support (e.g. aRFS). A good example use case are nodes acting as noSQL data bases with a large number of tiny requests and rather fewer but large packets as responses. It's affordable to have large budget and rx dev_weights for the requests. But as a side effect having this large a number on TX processed in one run can overwhelm drivers. This patch therefore introduces an independent configurability via sysctl to userland. Signed-off-by: NMatthias Tafelmeier <matthias.tafelmeier@gmx.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 12月, 2016 1 次提交
-
-
由 Eric Dumazet 提交于
RFS is not commonly used, so add a jump label to avoid some conditionals in fast path. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 12月, 2016 1 次提交
-
-
由 Hadar Hen Zion 提交于
In order to support hardware offloading when the device given by the tc rule is different from the Hardware underline device, extract the mirred (egress) device from the tc action when a filter is added, using the new tc_action_ops, get_dev(). Flower caches the information about the mirred device and use it for calling ndo_setup_tc in filter change, update stats and delete. Calling ndo_setup_tc of the mirred (egress) device instead of the ingress device will allow a resolution between the software ingress device and the underline hardware device. The resolution will take place inside the offloading driver using 'egress_device' flag added to tc_to_netdev struct which is provided to the offloading driver. Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 11月, 2016 1 次提交
-
-
由 Daniel Borkmann 提交于
Add an IFLA_XDP_FLAGS attribute that can be passed for setting up XDP along with IFLA_XDP_FD, which eventually allows user space to implement typical add/replace/delete logic for programs. Right now, calling into dev_change_xdp_fd() will always replace previous programs. When passed XDP_FLAGS_UPDATE_IF_NOEXIST, we can handle this more graceful when requested by returning -EBUSY in case we try to attach a new program, but we find that another one is already attached. This will be used by upcoming front-end for iproute2 as well. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 11月, 2016 1 次提交
-
-
由 Michael S. Tsirkin 提交于
sparse warns about context imbalance in any code that uses HARD_TX_LOCK/UNLOCK - this is because it's unable to determine that flags don't change so lock and unlock are paired. Seems easy enough to fix by adding __acquire/__release calls. With this patch af_packet.c is now sparse-clean, Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 11月, 2016 1 次提交
-
-
由 Or Gerlitz 提交于
Some drivers would need to check few internal matters for that. To be used in downstream mlx5 commit. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 11月, 2016 1 次提交
-
-
由 Randy Dunlap 提交于
Fix kernel-doc warning in <linux/netdevice.h> (missing ':'): ..//include/linux/netdevice.h:1904: warning: No description found for parameter 'prio_tc_map[TC_BITMASK + 1]' Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 11月, 2016 3 次提交
-
-
由 Eric Dumazet 提交于
Callers of netpoll_poll_lock() own NAPI_STATE_SCHED Callers of netpoll_poll_unlock() have BH blocked between the NAPI_STATE_SCHED being cleared and poll_lock is released. We can avoid the spinlock which has no contention, and use cmpxchg() on poll_owner which we need to set anyway. This removes a possible lockdep violation after the cited commit, since sk_busy_loop() re-enables BH before calling busy_poll_stop() Fixes: 217f6974 ("net: busy-poll: allow preemption in sk_busy_loop()") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
NAPI drivers use napi_complete_done() or napi_complete() when they drained RX ring and right before re-enabling device interrupts. In busy polling, we can avoid interrupts being delivered since we are polling RX ring in a controlled loop. Drivers can chose to use napi_complete_done() return value to reduce interrupts overhead while busy polling is active. This is optional, legacy drivers should work fine even if not updated. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Adam Belay <abelay@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Yuval Mintz <Yuval.Mintz@cavium.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
After commit 4cd13c21 ("softirq: Let ksoftirqd do its job"), sk_busy_loop() needs a bit of care : softirqs might be delayed since we do not allow preemption yet. This patch adds preemptiom points in sk_busy_loop(), and makes sure no unnecessary cache line dirtying or atomic operations are done while looping. A new flag is added into napi->state : NAPI_STATE_IN_BUSY_POLL This prevents napi_complete_done() from clearing NAPIF_STATE_SCHED, so that sk_busy_loop() does not have to grab it again. Similarly, netpoll_poll_lock() is done one time. This gives about 10 to 20 % improvement in various busy polling tests, especially when many threads are busy polling in configurations with large number of NIC queues. This should allow experimenting with bigger delays without hurting overall latencies. Tested: On a 40Gb mlx4 NIC, 32 RX/TX queues. echo 70 >/proc/sys/net/core/busy_read for i in `seq 1 40`; do echo -n $i: ; ./super_netperf $i -H lpaa24 -t UDP_RR -- -N -n; done Before: After: 1: 90072 92819 2: 157289 184007 3: 235772 213504 4: 344074 357513 5: 394755 458267 6: 461151 487819 7: 549116 625963 8: 544423 716219 9: 720460 738446 10: 794686 837612 11: 915998 923960 12: 937507 925107 13: 1019677 971506 14: 1046831 1113650 15: 1114154 1148902 16: 1105221 1179263 17: 1266552 1299585 18: 1258454 1383817 19: 1341453 1312194 20: 1363557 1488487 21: 1387979 1501004 22: 1417552 1601683 23: 1550049 1642002 24: 1568876 1601915 25: 1560239 1683607 26: 1640207 1745211 27: 1706540 1723574 28: 1638518 1722036 29: 1734309 1757447 30: 1782007 1855436 31: 1724806 1888539 32: 1717716 1944297 33: 1778716 1869118 34: 1805738 1983466 35: 1815694 2020758 36: 1893059 2035632 37: 1843406 2034653 38: 1888830 2086580 39: 1972827 2143567 40: 1877729 2181851 Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Adam Belay <abelay@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Yuval Mintz <Yuval.Mintz@cavium.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 11月, 2016 1 次提交
-
-
由 Martin KaFai Lau 提交于
If the bpf program calls bpf_redirect(dev, 0) and dev is an ipip/ip6tnl, it currently includes the mac header. e.g. If dev is ipip, the end result is IP-EthHdr-IP instead of IP-IP. The fix is to pull the mac header. At ingress, skb_postpull_rcsum() is not needed because the ethhdr should have been pulled once already and then got pushed back just before calling the bpf_prog. At egress, this patch calls skb_postpull_rcsum(). If bpf_redirect(dev, BPF_F_INGRESS) is called, it also fails now because it calls dev_forward_skb() which eventually calls eth_type_trans(skb, dev). The eth_type_trans() will set skb->type = PACKET_OTHERHOST because the mac address does not match the redirecting dev->dev_addr. The PACKET_OTHERHOST will eventually cause the ip_rcv() errors out. To fix this, ____dev_forward_skb() is added. Joint work with Daniel Borkmann. Fixes: cfc7381b ("ip_tunnel: add collect_md mode to IPIP tunnel") Fixes: 8d79266b ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@fb.com> Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 11月, 2016 1 次提交
-
-
由 Eric Dumazet 提交于
There are no more users except from net/core/dev.c napi_hash_add() can now be static. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 11月, 2016 3 次提交
-
-
由 Alexander Duyck 提交于
This patch adds support for setting and using XPS when QoS via traffic classes is enabled. With this change we will factor in the priority and traffic class mapping of the packet and use that information to correctly select the queue. This allows us to define a set of queues for a given traffic class via mqprio and then configure the XPS mapping for those queues so that the traffic flows can avoid head-of-line blocking between the individual CPUs if so desired. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
Add a sysfs attribute for a Tx queue that allows us to determine the traffic class for a given queue. This will allow us to more easily determine this in the future. It is needed as XPS will take the traffic class for a group of queues into account in order to avoid pulling traffic from one traffic class into another. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
The functions for configuring the traffic class to queue mappings have other effects that need to be addressed. Instead of trying to export a bunch of new functions just relocate the functions so that we can instrument them directly with the functionality they will need. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 10月, 2016 1 次提交
-
-
由 Sabrina Dubroca 提交于
Currently, GRO can do unlimited recursion through the gro_receive handlers. This was fixed for tunneling protocols by limiting tunnel GRO to one level with encap_mark, but both VLAN and TEB still have this problem. Thus, the kernel is vulnerable to a stack overflow, if we receive a packet composed entirely of VLAN headers. This patch adds a recursion counter to the GRO layer to prevent stack overflow. When a gro_receive function hits the recursion limit, GRO is aborted for this skb and it is processed normally. This recursion counter is put in the GRO CB, but could be turned into a percpu counter if we run out of space in the CB. Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report. Fixes: CVE-2016-7039 Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.") Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan") Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Reviewed-by: NJiri Benc <jbenc@redhat.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 10月, 2016 1 次提交
-
-
由 Ido Schimmel 提交于
Tamir reported the following trace when processing ARP requests received via a vlan device on top of a VLAN-aware bridge: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0] [...] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 4.8.0-rc7 #1 Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016 task: ffff88017edfea40 task.stack: ffff88017ee10000 RIP: 0010:[<ffffffff815dcc73>] [<ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60 [...] Call Trace: <IRQ> [<ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum] [<ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum] [<ffffffff810ad07a>] notifier_call_chain+0x4a/0x70 [<ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20 [<ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20 [<ffffffff815f2eb6>] neigh_update+0x306/0x740 [<ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0 [<ffffffff8165ea3f>] arp_process+0x66f/0x700 [<ffffffff8170214c>] ? common_interrupt+0x8c/0x8c [<ffffffff8165ec29>] arp_rcv+0x139/0x1d0 [<ffffffff816e505a>] ? vlan_do_receive+0xda/0x320 [<ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0 [<ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20 [<ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge] [<ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge] [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60 [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0 [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70 [<ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge] [<ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge] [<ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge] [<ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge] [<ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge] [<ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0 [<ffffffff81374815>] ? find_next_bit+0x15/0x20 [<ffffffff8135e802>] ? cpumask_next_and+0x32/0x50 [<ffffffff810c9968>] ? load_balance+0x178/0x9b0 [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60 [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0 [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70 [<ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum] [<ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core] [<ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci] [<ffffffff81091986>] tasklet_action+0xf6/0x110 [<ffffffff81704556>] __do_softirq+0xf6/0x280 [<ffffffff8109213f>] irq_exit+0xdf/0xf0 [<ffffffff817042b4>] do_IRQ+0x54/0xd0 [<ffffffff8170214c>] common_interrupt+0x8c/0x8c The problem is that netdev_all_lower_get_next_rcu() never advances the iterator, thereby causing the loop over the lower adjacency list to run forever. Fix this by advancing the iterator and avoid the infinite loop. Fixes: 7ce856aa ("mlxsw: spectrum: Add couple of lower device helper functions") Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Reported-by: NTamir Winetroub <tamirw@mellanox.com> Reviewed-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 10月, 2016 2 次提交
-
-
由 David Ahern 提交于
Only direct adjacencies are maintained. All upper or lower devices can be learned via the new walk API which recursively walks the adj_list for upper devices or lower devices. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
This patch introduces netdev_walk_all_upper_dev_rcu, netdev_walk_all_lower_dev and netdev_walk_all_lower_dev_rcu. These functions recursively walk the adj_list of devices to determine all upper and lower devices. The functions take a callback function that is invoked for each device in the list. If the callback returns non-0, the walk is terminated and the functions return that code back to callers. v3 - simplified netdev_has_upper_dev_all_rcu and __netdev_has_upper_dev and removed typecast as suggested by Stephen v2 - fixed definition of netdev_next_lower_dev_rcu to mirror the upper_dev version. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 10月, 2016 1 次提交
-
-
由 stephen hemminger 提交于
This reverts commit 6ae23ad3. The code has been in kernel since 4.4 but there are no in tree code that uses. Unused code is broken code, remove it. Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 10月, 2016 1 次提交
-
-
由 Jarod Wilson 提交于
While looking into an MTU issue with sfc, I started noticing that almost every NIC driver with an ndo_change_mtu function implemented almost exactly the same range checks, and in many cases, that was the only practical thing their ndo_change_mtu function was doing. Quite a few drivers have either 68, 64, 60 or 46 as their minimum MTU value checked, and then various sizes from 1500 to 65535 for their maximum MTU value. We can remove a whole lot of redundant code here if we simple store min_mtu and max_mtu in net_device, and check against those in net/core/dev.c's dev_set_mtu(). In theory, there should be zero functional change with this patch, it just puts the infrastructure in place. Subsequent patches will attempt to start using said infrastructure, with theoretically zero change in functionality. CC: netdev@vger.kernel.org Signed-off-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 9月, 2016 1 次提交
-
-
由 Aaron Conole 提交于
The netfilter hook list never uses the prev pointer, and so can be trimmed to be a simple singly-linked list. In addition to having a more light weight structure for hook traversal, struct net becomes 5568 bytes (down from 6400) and struct net_device becomes 2176 bytes (down from 2240). Signed-off-by: NAaron Conole <aconole@bytheb.org> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 24 9月, 2016 1 次提交
-
-
由 Moshe Shemesh 提交于
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving the ability for user-space application to specify it for the VF, as an option to support 802.1ad. We adjusted IP Link tool to support this option. For future use cases, the new UAPI supports multiple vlans. For now we limit the list size to a single vlan in kernel. Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward compatibility with older versions of IP Link tool. Add a vlan protocol parameter to the ndo_set_vf_vlan callback. We kept 802.1Q as the drivers' default vlan protocol. Suitable ip link tool command examples: Set vf vlan protocol 802.1ad: ip link set eth0 vf 1 vlan 100 proto 802.1ad Set vf to VST (802.1Q) mode: ip link set eth0 vf 1 vlan 100 proto 802.1Q Or by omitting the new parameter ip link set eth0 vf 1 vlan 100 Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NTariq Toukan <tariqt@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 9月, 2016 1 次提交
-
-
由 Jakub Kicinski 提交于
This patch adds hardware offload capability to cls_bpf classifier, similar to what have been done with U32 and flower. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 9月, 2016 1 次提交
-
-
由 Nogah Frankel 提交于
Add a new ndo to return statistics for offloaded operation. Since there can be many different offloaded operation with many stats types, the ndo gets an attribute id by which it knows which stats are wanted. The ndo also gets a void pointer to be cast according to the attribute id. Signed-off-by: NNogah Frankel <nogahf@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 9月, 2016 1 次提交
-
-
由 Mahesh Bandewar 提交于
Following few steps will crash kernel - (a) Create bonding master > modprobe bonding miimon=50 (b) Create macvlan bridge on eth2 > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \ type macvlan (c) Now try adding eth2 into the bond > echo +eth2 > /sys/class/net/bond0/bonding/slaves <crash> Bonding does lots of things before checking if the device enslaved is busy or not. In this case when the notifier call-chain sends notifications, the bond_netdev_event() assumes that the rx_handler /rx_handler_data is registered while the bond_enslave() hasn't progressed far enough to register rx_handler for the new slave. This patch adds a rx_handler check that can be performed right at the beginning of the enslave code to avoid getting into this situation. Signed-off-by: NMahesh Bandewar <maheshb@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 9月, 2016 1 次提交
-
-
由 Roopa Prabhu 提交于
fdb dumps spanning multiple skb's currently restart from the first interface again for every skb. This results in unnecessary iterations on the already visited interfaces and their fdb entries. In large scale setups, we have seen this to slow down fdb dumps considerably. On a system with 30k macs we see fdb dumps spanning across more than 300 skbs. To fix the problem, this patch replaces the existing single fdb marker with three markers: netdev hash entries, netdevs and fdb index to continue where we left off instead of restarting from the first netdev. This is consistent with link dumps. In the process of fixing the performance issue, this patch also re-implements fix done by commit 472681d5 ("net: ndo_fdb_dump should report -EMSGSIZE to rtnl_fdb_dump") (with an internal fix from Wilson Kok) in the following ways: - change ndo_fdb_dump handlers to return error code instead of the last fdb index - use cb->args strictly for dump frag markers and not error codes. This is consistent with other dump functions. Below results were taken on a system with 1000 netdevs and 35085 fdb entries: before patch: $time bridge fdb show | wc -l 15065 real 1m11.791s user 0m0.070s sys 1m8.395s (existing code does not return all macs) after patch: $time bridge fdb show | wc -l 35085 real 0m2.017s user 0m0.113s sys 0m1.942s Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NWilson Kok <wkok@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 8月, 2016 1 次提交
-
-
由 Ido Schimmel 提交于
switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of port netdevs so that packets being flooded by the device won't be flooded twice. It works by assigning a unique identifier (the ifindex of the first bridge port) to bridge ports sharing the same parent ID. This prevents packets from being flooded twice by the same switch, but will flood packets through bridge ports belonging to a different switch. This method is problematic when stacked devices are taken into account, such as VLANs. In such cases, a physical port netdev can have upper devices being members in two different bridges, thus requiring two different 'offload_fwd_mark's to be configured on the port netdev, which is impossible. The main problem is that packet and netdev marking is performed at the physical netdev level, whereas flooding occurs between bridge ports, which are not necessarily port netdevs. Instead, packet and netdev marking should really be done in the bridge driver with the switch driver only telling it which packets it already forwarded. The bridge driver will mark such packets using the mark assigned to the ingress bridge port and will prevent the packet from being forwarded through any bridge port sharing the same mark (i.e. having the same parent ID). Remove the current switchdev 'offload_fwd_mark' implementation and instead implement the proposed method. In addition, make rocker - the sole user of the mark - use the proposed method. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-