- 02 6月, 2015 12 次提交
-
-
由 David S. Miller 提交于
Scott Feldman says: ==================== rocker: enable by default untagged VLAN support This patch set is a followup to Simon Horman's RFC patch: [PATCH/RFC net-next] rocker: by default accept untagged packets Now, on port probe, we install untagged VLAN (vid=0) support for each port as the default. This is equivalent to the command: bridge vlan add vid 0 dev DEV self Accepting untagged VLAN pkts is a reasonable default, but the user could override this with: bridge vlan del vid 0 dev DEV self With this, we no longer need 8021q module to install vid=0 when port interface opens. In fact, we don't need support for legacy VLAN ndo ops at all since they're superseded by bridge_setlink/dellink. So remove legacy VLAN ndo ops support in driver. (The legacy VLAN ndo ops are supported by bonding/team drivers, but don't fit into the transaction model offered by switchdev, so switching all VLAN functions to bridge_setlink/dellink switchdev support gets us stacked driver + transaction model support). ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Scott Feldman 提交于
Remove support for legacy ndo ops .ndo_vlan_rx_add_vid/.ndo_vlan_rx_kill_vid. Rocker will use bridge_setlink/dellink exclusively for VLAN add/del operations. The legacy ops are needed if using 8021q driver module to setup VLANs on the port. But an alternative exists in using bridge_setlink/delink to setup VLANs, which doesn't depend on 8021q module. So rocker will switch to the newer setlink/dellink ops. VLANs can added/delete from the port, regardless if port is bridged or not, using the bridge commands: bridge vlan [add|del] vid VID dev DEV self (Yes, I agree it's confusing to use the "bridge" command to set a VLAN on a non-bridged port). Using setlink/dellink over legacy ops let's us handle the stacked driver case automatically. It's built-in. setlink also pass additional flags (PVID, egress untagged) that aren't available with the legacy ops. Signed-off-by: NScott Feldman <sfeldma@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Scott Feldman 提交于
When the port joins a bridge, the port's internal VLAN ID needs to change to the bridge's internal VLAN ID. Likewise, when leaving the bridge, the internal VLAN ID reverts back the port's original internal VLAN ID. (The internal VLAN ID is used by device to internally mark untagged pkts with some VLAN, which will eventually be removed on egress...think PVID). When the internal VLAN ID changes, we need to update the VLAN table entries and the router MAC entries for IP/IPv6 to reflect the new internal VLAN ID. This patch makes use of the common rocker_port_vlan_add/del functions to make sure the tables are updated for the current internal VLAN ID. Signed-off-by: NScott Feldman <sfeldma@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Scott Feldman 提交于
On port probe, install by default untagged VLAN support. This is equivalent to running the command: bridge vlan add vid 0 dev DEV self A user could, if they wanted, manaully removing untagged support from the port by running the command: bridge vlan del vid 0 dev DEV self But installing it by default on port initialization gives the normal expected behavior. Signed-off-by: NScott Feldman <sfeldma@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Scott Feldman 提交于
Basic house keeping: If there is an error adding the router MAC for this vlan, removing the just installed VLAN table entry to leave device in same state as before failure. Signed-off-by: NScott Feldman <sfeldma@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Scott Feldman 提交于
When allocating the array of rocker port pointers, zero the array values so we can test for !NULL to see if port is allocated/registered. We'll need this later when installing untagged VLAN support for each port, during port probe. It's a long story, but to install a VLAN (vid=0 for untagged, in this case) on a port, we'll need to scan other ports to see if the VLAN group for that VLAN has been setup. To scan the other ports, we need to walk the port array. Signed-off-by: NScott Feldman <sfeldma@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Toshiaki Makita 提交于
Currently packets with non-hardware-accelerated vlan cannot be handled by GRO. This causes low performance for 802.1ad and stacked vlan, as their vlan tags are currently not stripped by hardware. This patch adds GRO support for non-hardware-accelerated vlan and improves receive performance of them. Test Environment: vlan device (.1Q) on vlan device (.1ad) on ixgbe (82599) Result: - Before $ netperf -t TCP_STREAM -H 192.168.20.2 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.00 5233.17 Rx side CPU usage: %usr %sys %irq %soft %idle 0.27 58.03 0.00 41.70 0.00 - After $ netperf -t TCP_STREAM -H 192.168.20.2 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.00 7586.85 Rx side CPU usage: %usr %sys %irq %soft %idle 0.50 25.83 0.00 59.53 14.14 [ Register VLAN offloads with priority 10 -DaveM ] Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
Remove unused function cxgb4_enable_db_coalescing() and cxgb4_disable_db_coalescing() Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Simon Horman 提交于
The rocker (switch) of a rocker_port may be trivially obtained from the latter it seems cleaner not to pass the former to a function when the latter is being passed anyway. rocker_port_rx_proc() is omitted from this change as it is a hot path case. Signed-off-by: NSimon Horman <simon.horman@netronome.com> Acked-by: NScott Feldman <sfeldma@gmail.com> Acked-by: NAndy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gabriel Krisman Bertazi 提交于
The driver allocates one page for each buffer on the rx ring, which is too much on architectures like ppc64 and can cause unexpected allocation failures when the system is under stress. Now, we keep a memory pool per queue, and if the architecture's PAGE_SIZE is greater than 4k, we fragment pages and assign each 4k segment to a ring element, which reduces the overall memory consumption on such architectures. This helps avoiding errors like the example below: [bnx2x_alloc_rx_sge:435(eth1)]Can't alloc sge [c00000037ffeb900] [d000000075eddeb4] .bnx2x_alloc_rx_sge+0x44/0x200 [bnx2x] [c00000037ffeb9b0] [d000000075ee0b34] .bnx2x_fill_frag_skb+0x1ac/0x460 [bnx2x] [c00000037ffebac0] [d000000075ee11f0] .bnx2x_tpa_stop+0x160/0x2e8 [bnx2x] [c00000037ffebb90] [d000000075ee1560] .bnx2x_rx_int+0x1e8/0xc30 [bnx2x] [c00000037ffebcd0] [d000000075ee2084] .bnx2x_poll+0xdc/0x3d8 [bnx2x] (unreliable) Signed-off-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Acked-by: NYuval Mintz <Yuval.Mintz@qlogic.com> Reviewed-by: NLino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neil McKee 提交于
If new optional attribute OVS_USERSPACE_ATTR_ACTIONS is added to an OVS_ACTION_ATTR_USERSPACE action, then include the datapath actions in the upcall. This Directly associates the sampled packet with the path it takes through the virtual switch. Path information currently includes mangling, encapsulation and decapsulation actions for tunneling protocols GRE, VXLAN, Geneve, MPLS and QinQ, but this extension requires no further changes to accommodate datapath actions that may be added in the future. Adding path information enhances visibility into complex virtual networks. Signed-off-by: NNeil McKee <neil.mckee@inmon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
When we scan a packet for GRO processing, we want to see the most common packet types in the front of the offload_base list. So add a priority field so we can handle this properly. IPv4/IPv6 get the highest priority with the implicit zero priority field. Next comes ethernet with a priority of 10, and then we have the MPLS types with a priority of 15. Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Suggested-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 6月, 2015 12 次提交
-
-
由 Vaishali Thakkar 提交于
Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e, func, da; @@ -init_timer (&e); +setup_timer (&e, func, da); -e.data = da; -e.function = func; Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Sowmini Varadhan says: ==================== net/rds: SOL_RDS socket option to explicitly select transport Today the underlying transport (TCP or IB) for a PF_RDS socket is implicitly selected based on the local address used to bind(2) the PF_RDS socket. This results in some non-deterministic behavior when there are un-numbered and IPoIB interfaces sharing the same IP address. It also places the constraint that the IB interface must have an IP address (and thus, IPoIB) configured on it. The non-determinism may be avoided by providing the user-space application a socket option that allows it to explicitly select the transport prior to bind(2). Patch 1 of this series provides the constant definitions needed by the application via <linux/rds.h>. Patch 2 provides the setsockopt support, and Patch 3 provides the getsockopt support. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
The currently attached transport for a PF_RDS socket may be obtained from user space by invoking getsockopt(2) using the SO_RDS_TRANSPORT option at the SOL_RDS level. The integer optval returned will be one of the RDS_TRANS_* constants defined in linux/rds.h. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
An application may deterministically attach the underlying transport for a PF_RDS socket by invoking setsockopt(2) with the SO_RDS_TRANSPORT option at the SOL_RDS level. The integer argument to setsockopt must be one of the RDS_TRANS_* transport types, e.g., RDS_TRANS_TCP. The option must be specified before invoking bind(2) on the socket, and may only be used once on the socket. An attempt to set the option on a bound socket, or to invoke the option after a successful SO_RDS_TRANSPORT attachment, will return EOPNOTSUPP. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
User space applications that desire to explicitly select the underlying transport for a PF_RDS socket may do so by using the SO_RDS_TRANSPORT socket option at the SOL_RDS level before bind(). The integer argument provided to the socket option would be one of the RDS_TRANS_* values, e.g., RDS_TRANS_TCP. This commit exports the constant values need by such applications via <linux/rds.h> Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vaishali Thakkar 提交于
Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e1, e2, e3, e4, a, b; @@ -init_timer(&e1); +setup_timer(&e1, a, b); ... when != a = e2 when != b = e3 -e1.function = a; ... when != b = e4 -e1.data = b; Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Besides others, move bpf_tail_call_proto to the remaining definitions of other protos, improve comments a bit (i.e. remove some obvious ones, where the code is already self-documenting, add objectives for others), simplify bpf_prog_array_compatible() a bit. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
As this is already exported from tracing side via commit d9847d31 ("tracing: Allow BPF programs to call bpf_ktime_get_ns()"), we might as well want to move it to the core, so also networking users can make use of it, e.g. to measure diffs for certain flows from ingress/egress. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vaishali Thakkar 提交于
Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e1, e2, e3, e4, a, b; @@ -init_timer(&e1); +setup_timer(&e1, a, b); ... when != a = e2 when != b = e3 -e1.data = b; ... when != a = e4 -e1.function = a; Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vaishali Thakkar 提交于
Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e1, e2, e3, e4, a, b; @@ -init_timer(&e1); +setup_timer(&e1, a, b); ... when != a = e2 when != b = e3 -e1.data = b; ... when != a = e4 -e1.function = a; Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vaishali Thakkar 提交于
Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e, func, da; @@ -init_timer (&e); +setup_timer (&e, func, da); -e.data = da; -e.function = func; Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Merge tag 'mac80211-next-for-davem-2015-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== As we get closer to the merge window, here are a few more things for -next: * disconnect TDLS stations on CSA to avoid issues * fix a memory leak introduced in a recent commit * switch rfkill and cfg80211 to PM ops * in an unlikely scenario, prevent a bookkeeping value to get corrupted leading to dropped packets * fix a crash in VLAN assignment * switch rfkill-gpio to more modern gpiod API * send disconnected event to userspace with proper local/remote indication ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 5月, 2015 16 次提交
-
-
git://git.open-mesh.org/linux-merge由 David S. Miller 提交于
Antonio Quartulli says: ==================== Included changes: - checkpatch fixes - code cleanup - debugfs component is now compiled only if DEBUG_FS is selected - update copyright years - disable by default not-so-user-safe features ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexei Starovoitov 提交于
Normally the program attachment place (like sockets, qdiscs) takes care of rcu protection and calls bpf_prog_put() after a grace period. The programs stored inside prog_array may not be attached anywhere, so prog_array needs to take care of preserving rcu protection. Otherwise bpf_tail_call() will race with bpf_prog_put(). To solve that introduce bpf_prog_put_rcu() helper function and use it in 3 places where unattached program can decrement refcnt: closing program fd, deleting/replacing program in prog_array. Fixes: 04fd61ab ("bpf: allow bpf programs to tail-call other bpf programs") Reported-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
K. Y. Srinivasan says: ==================== hv_netvsc: Implement NUMA aware memory allocation Allocate both receive buffer and send buffer from the NUMA node assigned to the primary channel. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 K. Y. Srinivasan 提交于
Allocate the send buffer in a NUMA aware way. Signed-off-by: NK. Y. Srinivasan <kys@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 K. Y. Srinivasan 提交于
Allocate the receive bufer from the NUMA node assigned to the primary channel. Signed-off-by: NK. Y. Srinivasan <kys@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wang Long 提交于
Remove automatic variable 'err' in register_netevent_notifier() and return the result of atomic_notifier_chain_register() directly. Signed-off-by: NWang Long <long.wanglong@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next由 David S. Miller 提交于
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next, they are: 1) default CONFIG_NETFILTER_INGRESS to y for easier compile-testing of all options. 2) Allow to bind a table to net_device. This introduces the internal NFT_AF_NEEDS_DEV flag to perform a mandatory check for this binding. This is required by the next patch. 3) Add the 'netdev' table family, this new table allows you to create ingress filter basechains. This provides access to the existing nf_tables features from ingress. 4) Kill unused argument from compat_find_calc_{match,target} in ip_tables and ip6_tables, from Florian Westphal. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Florian Fainelli says: ==================== net: systemport: misc improvements These patches are highly inspired by changes from Petri on bcmgenet, last patch is a misc fix that I had pending for a while, but is not a candidate for 'net' at this point. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
Occasionnaly we may get oversized packets from the hardware which exceed the nomimal 2KiB buffer size we allocate SKBs with. Add an early check which drops the packet to avoid invoking skb_over_panic() and move on to processing the next packet. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
Currently, bcm_sysport_desc_rx() calls bcm_sysport_rx_refill() at the end of Rx packet processing loop, after the current Rx packet has already been passed to napi_gro_receive(). However, bcm_sysport_rx_refill() might fail to allocate a new Rx skb, thus leaving a hole on the Rx queue where no valid Rx buffer exists. To eliminate this situation: 1. Rewrite bcm_sysport_rx_refill() to retain the current Rx skb on the Rx queue if a new replacement Rx skb can't be allocated and DMA-mapped. In this case, the data on the current Rx skb is effectively dropped. 2. Modify bcm_sysport_desc_rx() to call bcm_sysport_rx_refill() at the top of Rx packet processing loop, so that the new replacement Rx skb is already in place before the current Rx skb is processed. This is loosely inspired from d6707bec ("net: bcmgenet: rewrite bcmgenet_rx_refill()") Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
There is a 1:1 mapping between the software maintained control block in priv->rx_cbs and the buffer address in priv->rx_bds, such that there is no need to keep computing the buffer address when refiling a control block. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julia Lawall 提交于
Delete jump to a label on the next line, when that label is not used elsewhere. A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ identifier l; @@ -if (...) goto l; -l: // </smpl> Also remove the unnecessary ret variable. Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
The thunderx ethernet driver fails to build on architectures that do not have an atomic readq() and writeq() function for 64-bit PCI bus access: drivers/net/ethernet/cavium/thunder/thunder_bgx.c: In function 'bgx_reg_read': include/asm-generic/io.h:195:23: error: implicit declaration of function 'readq' [-Werror=implicit-function-declaration] It seems impossible to get this driver to work on most 32-bit hardware, so it's better to add an explicit dependency, in order to let us keep building 'allmodconfig' kernels on all architectures. As the driver is meant for the internal hardware on an arm64 SoC, this is not a problem for usability. Allowing the build on all 64-bit architectures rather than just CONFIG_ARM64 on the other hand means that we get the benefit of build testing on x86. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Or Gerlitz says: ==================== mlx4 driver update, May 28, 2015 The 1st patch fixes an issue with a function running DPDK overriding broadcast steering rules set by other functions. Please add this one to your -stable queue. The rest of the series from Matan and Ido deals with scaling the number of IRQs that serve RoCE applications to be in par with the Ethernet driver. changes from V0: - addressed feedback from Sergei, removed extra blank line in patch #4 ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matan Barak 提交于
When freeing a CQ, we need to make sure there are no asynchronous events (on the ASYNC EQ) that could relate to this CQ before freeing it. This is done by introducing synchronize_irq. Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NIdo Shamay <idos@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Shamay 提交于
Now that EQs management is in the sole responsibility of mlx4_core, the IRQ affinity hints configuration should be in its hands as well. request_irq is called only once by the first consumer (maybe mlx4_ib), so mlx4_en passes the affinity mask too late. We also need to request vectors according to the cores we want to run on. mlx4_core distribution of IRQs to cores is straight forward, EQ(i)->IRQ will set affinity hint to core i. Consumers need to request EQ vectors, according to their cores considerations (NUMA). Signed-off-by: NIdo Shamay <idos@mellanox.com> Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-