- 20 7月, 2017 7 次提交
-
-
由 Tonghao Zhang 提交于
When calling the flow_free() to free the flow, we call many times (cpu_possible_mask, eg. 128 as default) cpumask_next(). That will take up our CPU usage if we call the flow_free() frequently. When we put all packets to userspace via upcall, and OvS will send them back via netlink to ovs_packet_cmd_execute(will call flow_free). The test topo is shown as below. VM01 sends TCP packets to VM02, and OvS forward packtets. When testing, we use perf to report the system performance. VM01 --- OvS-VM --- VM02 Without this patch, perf-top show as below: The flow_free() is 3.02% CPU usage. 4.23% [kernel] [k] _raw_spin_unlock_irqrestore 3.62% [kernel] [k] __do_softirq 3.16% [kernel] [k] __memcpy 3.02% [kernel] [k] flow_free 2.42% libc-2.17.so [.] __memcpy_ssse3_back 2.18% [kernel] [k] copy_user_generic_unrolled 2.17% [kernel] [k] find_next_bit When applied this patch, perf-top show as below: Not shown on the list anymore. 4.11% [kernel] [k] _raw_spin_unlock_irqrestore 3.79% [kernel] [k] __do_softirq 3.46% [kernel] [k] __memcpy 2.73% libc-2.17.so [.] __memcpy_ssse3_back 2.25% [kernel] [k] copy_user_generic_unrolled 1.89% libc-2.17.so [.] _int_malloc 1.53% ovs-vswitchd [.] xlate_actions With this patch, the TCP throughput(we dont use Megaflow Cache + Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec (maybe ~10% performance imporve). This patch adds cpumask struct, the cpu_used_mask stores the cpu_id that the flow used. And we only check the flow_stats on the cpu we used, and it is unncessary to check all possible cpu when getting, cleaning, and updating the flow_stats. Adding the cpu_used_mask to sw_flow struct does’t increase the cacheline number. Signed-off-by: NTonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: NPravin B Shelar <pshelar@ovn.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tonghao Zhang 提交于
In the ovs_flow_stats_update(), we only use the node var to alloc flow_stats struct. But this is not a common case, it is unnecessary to call the numa_node_id() everytime. This patch is not a bugfix, but there maybe a small increase. Signed-off-by: NTonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Rick Farrington says: ==================== liquidio: avoid vm low memory crashes This patchset addresses issues brought about by low memory conditions in a VM. These conditions were not seen when the driver was exercised normally. Rather, they were brought about through manual fault injection. They are being included in the interest of hardening the driver against unforeseen circumstances. 1. Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'. This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries. 2. Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers(). 3. For defensive programming, zero the allocated block 'oct->droq' in octeon_setup_output_queues() and 'oct->instr_queue' in octeon_setup_instr_queues(). change log: V1 -> V2: 1. Corrected syntax in 'Subject' lines; no functional or code changes. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rick Farrington 提交于
For defensive programming, zero the allocated block 'oct->droq[0]' in octeon_setup_output_queues() and 'oct->instr_queue[0]' in octeon_setup_instr_queues(). Signed-off-by: NRick Farrington <ricardo.farrington@cavium.com> Signed-off-by: NSatanand Burla <satananda.burla@cavium.com> Signed-off-by: NRaghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: NFelix Manlunas <felix.manlunas@cavium.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rick Farrington 提交于
Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers(). Signed-off-by: NRick Farrington <ricardo.farrington@cavium.com> Signed-off-by: NSatanand Burla <satananda.burla@cavium.com> Signed-off-by: NRaghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: NFelix Manlunas <felix.manlunas@cavium.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rick Farrington 提交于
Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'. This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries. Signed-off-by: NRick Farrington <ricardo.farrington@cavium.com> Signed-off-by: NSatanand Burla <satananda.burla@cavium.com> Signed-off-by: NRaghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: NFelix Manlunas <felix.manlunas@cavium.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rick Farrington 提交于
Added support for new firmware statistic 'tx_err_pki'. Signed-off-by: NRick Farrington <ricardo.farrington@cavium.com> Signed-off-by: NDerek Chickles <derek.chickles@cavium.com> Signed-off-by: NFelix Manlunas <felix.manlunas@cavium.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 7月, 2017 33 次提交
-
-
由 David S. Miller 提交于
Arvind Yadav says: ==================== constify net attribute_group structures. attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 28720 985 12 29717 7415 net/.../cxgb3/cxgb3_main.o File size After adding 'const': text data bss dec hex filename 28848 857 12 29717 7415 net/.../cxgb3/cxgb3_main.o Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 4512 1472 0 5984 1760 drivers/net/bonding/bond_sysfs.o File size After adding 'const': text data bss dec hex filename 4576 1408 0 5984 1760 drivers/net/bonding/bond_sysfs.o Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 3409 948 28 4385 1121 drivers/net/arcnet/com20020-pci.o File size After adding 'const': text data bss dec hex filename 3473 884 28 4385 1121 drivers/net/arcnet/com20020-pci.o Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 11800 368 0 12168 2f88 drivers/net/can/janz-ican3.o File size After adding 'const': text data bss dec hex filename 11864 304 0 12168 2f88 drivers/net/can/janz-ican3.o Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 6164 304 0 6468 1944 drivers/net/can/at91_can.o File size After adding 'const': text data bss dec hex filename 6228 240 0 6468 1944 drivers/net/can/at91_can.o Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arvind Yadav 提交于
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 13275 928 1 14204 377c drivers/net/usb/cdc_ncm.o File size After adding 'const': text data bss dec hex filename 13339 864 1 14204 377c drivers/net/usb/cdc_ncm.o Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Jiri Pirko says: ==================== mlxsw: Preparations for IPv6 UC router Ido says: The purpose of this set is to prepare the driver for the introduction of IPv6 FIB offload. It's mainly composed of small and non-functional changes, that either add the IPv6 equivalent of existing IPv4 code or aimed at making the introduction of IPv6-specific code easier. The first five patches enable IPv6 forwarding in the device and allow us to configure router interfaces (RIFs) based on inet6addr notifications. The next six patches add support for programming IPv6 neighbours into the device's table as well as dumping their activity and updating the kernel accordingly. The last 11 patches extend current infrastructure to allow us to program IPv6 routes, set catch-all IPv6 trap in case of abort and make the code more receptive towards up-coming changes. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
The number of possible prefix lengths for IPv6 is 129 and not 128. Fixes following warning from UBSAN when /128 routes are offloaded: UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:2510:27 index 128 is out of range for type 'long unsigned int [128]' Fixes: 5e9c16cc ("mlxsw: spectrum_router: Implement private fib") Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
These functions aren't specific to IPv4 and can be re-used for IPv6. Drop the '4' designation from their name. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
Functions that take as argument a FIB entry don't need to take FIB node as well, as it can be extracted from the entry. Remove unnecessary FIB node parameter. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
The functions to create and destroy a nexthop group are IPv4 specific and should be renamed accordingly, so that they won't be confused with the IPv6 specific functions in follow-up patches. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
Some of the parameters stored in the FIB entry structure are specific to IPv4 and therefore better placed in an IPv4 specific structure. Create an IPv4 specific structure that encapsulates the common FIB entry structure and contains IPv4 specific parameters. In a follow-up patchset an IPv6 specific structure will be introduced. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
When we fail to insert a route we invoke the abort mechanism which flushes all the tables and inserts a default route in each, so that all packets incoming to the router will be trapped to the CPU. Upon abort, add an IPv6 default route to the IPv6 tables. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
Take advantage of previous patch and allow the RALUE register to be called with IPv6 routes. In order to re-use as much code as possible between IPv4 and IPv6, only the lowest-level function that actually does the register packing is demuxed based on the passed protocol. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
Update the register so that IPv6 LPM entries could be programmed to the device's table. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
A Virtual Router (VR) is an entity which corresponds to a VRF and performs FIB lookup in an LPM tree according to the {VR, IP Proto} -> Tree binding. Extend the virtual router data structure towards IPv6 FIB offload. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
A FIB node is an entity which stores routes sharing the same prefix and length. The data structure itself is already family agnostic, but we make some of its operations agnostic as well and thus re-use them for IPv6 offload. Instead of passing an IPv4-specific structure to fib4_node_get(), pass general routing parameters and rename the function accordingly. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
When looking up a FIB entry we shouldn't create the FIB node where it's supposed to be linked in case the node doesn't already exist. Instead, lookup the node and fail if it doesn't exist. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
Thankfully, the neighbour subsystem is agnostic to the upper protocol and used by both IPv4 and IPv6. By removing assumptions regarding the neighbour type we can thus re-use much of the neighbour-related code for both IPv4 and IPv6. For each nexthop, store its gateway IP and for nexthop group store the neighbour table used by its nexthops. Use this information throughout the code and remove assumption about the neighbour type. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
The neighbours' activity is currently dumped according to the ARP table's DELAY_PROBE time, but with the introduction of IPv6 offload we should set the interval according to the minimum between the ARP and ndisc tables. Signed-off-by: NArkadi Sharshvesky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
In addition to IPv4, periodically dump IPv6 neighbours and update the kernel about them. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
Update the register so that the active IPv6 neighbours could be dumped from the device's neighbour table. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
As with IPv4, listen to NEIGH_UPDATE events from the ndisc table and program relevant neighbours to the device's neighbour table. Note that neighbours with a link-local IP address aren't programmed, as packets with a link-local destination IP are trapped after LPM lookup and never reach the neighbour table. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
Update the register, so the IPv6 neighbours could be programmed to the device's neighbour table. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
When a netdev is configured with an IP address a router interface (RIF) should be configured for it in the device. Allow configuration of RIFs based on IPv6 address notifications as well as IPv4. Note that the RIF exists as long as an IP address is configured on the netdev, regardless of the address family. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
Up until now we only flooded broadcast packets to the router when an L3 interface was configured on top of a bridge. However, IPv6 Neighbour Discovery packets are trapped to the CPU inside the router and these can be sent with a multicast address. Flood unregistered multicast packets to the router port, so that relevant packets could be trapped there. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
Before we can start using IPv6, we need to trap certain control packets to the CPU. Among others, these include Neighbour Discovery, DHCP and neighbour misses. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
Enable IPv6 and IPv6 forwarding on router interfaces (RIFs), so that they will be able to receive and forward IPv6 traffic. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-