- 17 5月, 2014 2 次提交
-
-
由 Jarno Rajahalme 提交于
Keep kernel flow stats for each NUMA node rather than each (logical) CPU. This avoids using the per-CPU allocator and removes most of the kernel-side OVS locking overhead otherwise on the top of perf reports and allows OVS to scale better with higher number of threads. With 9 handlers and 4 revalidators netperf TCP_CRR test flow setup rate doubles on a server with two hyper-threaded physical CPUs (16 logical cores each) compared to the current OVS master. Tested with non-trivial flow table with a TCP port match rule forcing all new connections with unique port numbers to OVS userspace. The IP addresses are still wildcarded, so the kernel flows are not considered as exact match 5-tuple flows. This type of flows can be expected to appear in large numbers as the result of more effective wildcarding made possible by improvements in OVS userspace flow classifier. Perf results for this test (master): Events: 305K cycles + 8.43% ovs-vswitchd [kernel.kallsyms] [k] mutex_spin_on_owner + 5.64% ovs-vswitchd [kernel.kallsyms] [k] __ticket_spin_lock + 4.75% ovs-vswitchd ovs-vswitchd [.] find_match_wc + 3.32% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_lock + 2.61% ovs-vswitchd [kernel.kallsyms] [k] pcpu_alloc_area + 2.19% ovs-vswitchd ovs-vswitchd [.] flow_hash_in_minimask_range + 2.03% swapper [kernel.kallsyms] [k] intel_idle + 1.84% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_unlock + 1.64% ovs-vswitchd ovs-vswitchd [.] classifier_lookup + 1.58% ovs-vswitchd libc-2.15.so [.] 0x7f4e6 + 1.07% ovs-vswitchd [kernel.kallsyms] [k] memset + 1.03% netperf [kernel.kallsyms] [k] __ticket_spin_lock + 0.92% swapper [kernel.kallsyms] [k] __ticket_spin_lock ... And after this patch: Events: 356K cycles + 6.85% ovs-vswitchd ovs-vswitchd [.] find_match_wc + 4.63% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_lock + 3.06% ovs-vswitchd [kernel.kallsyms] [k] __ticket_spin_lock + 2.81% ovs-vswitchd ovs-vswitchd [.] flow_hash_in_minimask_range + 2.51% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_unlock + 2.27% ovs-vswitchd ovs-vswitchd [.] classifier_lookup + 1.84% ovs-vswitchd libc-2.15.so [.] 0x15d30f + 1.74% ovs-vswitchd [kernel.kallsyms] [k] mutex_spin_on_owner + 1.47% swapper [kernel.kallsyms] [k] intel_idle + 1.34% ovs-vswitchd ovs-vswitchd [.] flow_hash_in_minimask + 1.33% ovs-vswitchd ovs-vswitchd [.] rule_actions_unref + 1.16% ovs-vswitchd ovs-vswitchd [.] hindex_node_with_hash + 1.16% ovs-vswitchd ovs-vswitchd [.] do_xlate_actions + 1.09% ovs-vswitchd ovs-vswitchd [.] ofproto_rule_ref + 1.01% netperf [kernel.kallsyms] [k] __ticket_spin_lock ... There is a small increase in kernel spinlock overhead due to the same spinlock being shared between multiple cores of the same physical CPU, but that is barely visible in the netperf TCP_CRR test performance (maybe ~1% performance drop, hard to tell exactly due to variance in the test results), when testing for kernel module throughput (with no userspace activity, handful of kernel flows). On flow setup, a single stats instance is allocated (for the NUMA node 0). As CPUs from multiple NUMA nodes start updating stats, new NUMA-node specific stats instances are allocated. This allocation on the packet processing code path is made to never block or look for emergency memory pools, minimizing the allocation latency. If the allocation fails, the existing preallocated stats instance is used. Also, if only CPUs from one NUMA-node are updating the preallocated stats instance, no additional stats instances are allocated. This eliminates the need to pre-allocate stats instances that will not be used, also relieving the stats reader from the burden of reading stats that are never used. Signed-off-by: NJarno Rajahalme <jrajahalme@nicira.com> Acked-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
由 Jarno Rajahalme 提交于
The 5-tuple optimization becomes unnecessary with a later per-NUMA node stats patch. Remove it first to make the changes easier to grasp. Signed-off-by: NJarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
- 05 2月, 2014 1 次提交
-
-
由 Andy Zhou 提交于
Both mega flow mask's reference counter and per flow table mask list should only be accessed when holding ovs_mutex() lock. However this is not true with ovs_flow_table_flush(). The patch fixes this bug. Reported-by: NJoe Stringer <joestringer@nicira.com> Signed-off-by: NAndy Zhou <azhou@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
- 07 1月, 2014 2 次提交
-
-
由 Pravin B Shelar 提交于
With mega flow implementation ovs flow can be shared between multiple CPUs which makes stats updates highly contended operation. This patch uses per-CPU stats in cases where a flow is likely to be shared (if there is a wildcard in the 5-tuple and therefore likely to be spread by RSS). In other situations, it uses the current strategy, saving memory and allocation time. Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
由 Andy Zhou 提交于
API changes only for code readability. No functional chnages. This patch removes the underscored version. Added a new API ovs_flow_tbl_lookup_stats() that returns the n_mask_hits. Reported by: Ben Pfaff <blp@nicira.com> Reviewed-by: NThomas Graf <tgraf@redhat.com> Signed-off-by: NAndy Zhou <azhou@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
- 23 10月, 2013 1 次提交
-
-
由 Andy Zhou 提交于
Collect mega flow mask stats. ovs-dpctl show command can be used to display them for debugging and performance tuning. Signed-off-by: NAndy Zhou <azhou@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
- 04 10月, 2013 3 次提交
-
-
由 Pravin B Shelar 提交于
Hides mega-flow implementation in flow_table.c rather than datapath.c. Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
由 Pravin B Shelar 提交于
ovs-flow rehash does not touch mega flow list. Following patch moves it dp struct datapath. Avoid one extra indirection for accessing mega-flow list head on every packet receive. Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
由 Pravin B Shelar 提交于
Over the time datapath.c and flow.c has became pretty large files. Following patch restructures functionality of component into three different components: flow.c: contains flow extract. flow_netlink.c: netlink flow api. flow_table.c: flow table api. This patch restructures code without changing logic. Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-