1. 11 Dec 2014, 1 commit
    • net: replace remaining users of arch_fast_hash with jhash · 87545899
      Daniel Borkmann authored
      This patch effectively reverts commit 500f8087 ("net: ovs: use CRC32
      accelerated flow hash if available") and also converts the other
      remaining arch_fast_hash() users, such as nfsd via commit 6282cd56
      ("NFSD: Don't hand out delegations for 30 seconds after recalling
      them."), where it has been used as a hash function for Bloom filtering.
      
      While we think that these users are actually not much of a concern, it
      has been requested to remove the arch_fast_hash() library bits that
      arose from [1] entirely, as per the recent discussion in [2]. The main
      argument is that using it as a hash may introduce bias due to its
      linearity (see the avalanche criterion), which makes it less clear
      (though we tried to document that) when this security/performance
      trade-off is actually acceptable for a general-purpose library function.
      
      Let's therefore avoid any further confusion on this matter and remove it
      to prevent any future accidental misuse. For the time being, this makes
      hashing of flow keys a bit more expensive in the ovs case, but future
      work could reevaluate a different hashing discipline; a sketch of the
      resulting jhash2()-based call site follows this commit entry.
      
        [1] https://patchwork.ozlabs.org/patch/299369/
        [2] https://patchwork.ozlabs.org/patch/418756/
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: Francesco Fusco <fusco@ntop.org>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
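
      For illustration, a minimal sketch of the kind of call site this change
      leaves behind, assuming a simplified flow key. The struct and function
      names are hypothetical, not the actual OVS ones:

        #include <linux/jhash.h>
        #include <linux/kernel.h>

        /* Hypothetical flow-key container; the real OVS sw_flow_key differs. */
        struct demo_flow_key {
                u32 data[8];    /* key material, u32-aligned as jhash2() expects */
        };

        static u32 demo_flow_hash(const struct demo_flow_key *key, u32 seed)
        {
                /*
                 * This used to be arch_fast_hash2(), which could map to a
                 * CRC32-based hash.  CRC is linear over GF(2) and thus lacks
                 * the avalanche behavior expected of a general-purpose hash;
                 * jhash2() spends a few more cycles for better mixing.
                 */
                return jhash2(key->data, ARRAY_SIZE(key->data), seed);
        }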
  2. 10 Nov 2014, 1 commit
  3. 06 Nov 2014, 1 commit
  4. 01 Jul 2014, 1 commit
  5. 23 May 2014, 2 commits
  6. 17 May 2014, 3 commits
    • openvswitch: Per NUMA node flow stats. · 63e7959c
      Jarno Rajahalme authored
      Keep kernel flow stats for each NUMA node rather than for each
      (logical) CPU.  This avoids using the per-CPU allocator, removes most
      of the kernel-side OVS locking overhead otherwise at the top of perf
      reports, and allows OVS to scale better with a higher number of
      threads.
      
      With 9 handlers and 4 revalidators, the netperf TCP_CRR flow setup
      rate doubles on a server with two hyper-threaded physical CPUs (16
      logical cores each) compared to the current OVS master.  Tested with a
      non-trivial flow table with a TCP port match rule forcing all new
      connections with unique port numbers to OVS userspace.  The IP
      addresses are still wildcarded, so the kernel flows are not considered
      exact-match 5-tuple flows.  Flows of this type can be expected to
      appear in large numbers as the result of the more effective wildcarding
      made possible by improvements in the OVS userspace flow classifier.
      
      Perf results for this test (master):
      
      Events: 305K cycles
      +   8.43%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
      +   5.64%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   4.75%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
      +   3.32%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
      +   2.61%     ovs-vswitchd  [kernel.kallsyms]   [k] pcpu_alloc_area
      +   2.19%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
      +   2.03%          swapper  [kernel.kallsyms]   [k] intel_idle
      +   1.84%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
      +   1.64%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
      +   1.58%     ovs-vswitchd  libc-2.15.so        [.] 0x7f4e6
      +   1.07%     ovs-vswitchd  [kernel.kallsyms]   [k] memset
      +   1.03%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   0.92%          swapper  [kernel.kallsyms]   [k] __ticket_spin_lock
      ...
      
      And after this patch:
      
      Events: 356K cycles
      +   6.85%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
      +   4.63%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
      +   3.06%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   2.81%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
      +   2.51%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
      +   2.27%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
      +   1.84%     ovs-vswitchd  libc-2.15.so        [.] 0x15d30f
      +   1.74%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
      +   1.47%          swapper  [kernel.kallsyms]   [k] intel_idle
      +   1.34%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask
      +   1.33%     ovs-vswitchd  ovs-vswitchd        [.] rule_actions_unref
      +   1.16%     ovs-vswitchd  ovs-vswitchd        [.] hindex_node_with_hash
      +   1.16%     ovs-vswitchd  ovs-vswitchd        [.] do_xlate_actions
      +   1.09%     ovs-vswitchd  ovs-vswitchd        [.] ofproto_rule_ref
      +   1.01%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
      ...
      
      There is a small increase in kernel spinlock overhead due to the same
      spinlock being shared between multiple cores of the same physical CPU,
      but that is barely visible in the netperf TCP_CRR test performance
      (maybe a ~1% drop, hard to tell exactly due to variance in the test
      results) when testing kernel module throughput (no userspace activity,
      a handful of kernel flows).
      
      On flow setup, a single stats instance is allocated (for NUMA node 0).
      As CPUs from multiple NUMA nodes start updating stats, new
      NUMA-node-specific stats instances are allocated.  This allocation on
      the packet processing code path never blocks or looks for emergency
      memory pools, minimizing the allocation latency.  If the allocation
      fails, the existing preallocated stats instance is used.  Also, if only
      CPUs from one NUMA node are updating the preallocated stats instance,
      no additional stats instances are allocated.  This eliminates the need
      to pre-allocate stats instances that will not be used, and also
      relieves the stats reader from the burden of reading stats that are
      never used.  A sketch of this allocation scheme follows this commit
      entry.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
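
      For illustration, a rough sketch of the allocation discipline described
      above. All names are hypothetical, and the RCU publishing and
      concurrent-allocation handling of the real code are only hinted at in
      comments:

        #include <linux/gfp.h>
        #include <linux/numa.h>
        #include <linux/slab.h>
        #include <linux/spinlock.h>
        #include <linux/topology.h>

        /* Hypothetical per-node stats slot; the real OVS struct differs. */
        struct demo_flow_stats {
                spinlock_t lock;
                u64 packet_count;
                u64 byte_count;
        };

        struct demo_flow {
                /* One slot per possible NUMA node; slot 0 is preallocated
                 * at flow setup time, all others start out NULL. */
                struct demo_flow_stats *stats[MAX_NUMNODES];
        };

        static void demo_flow_stats_update(struct demo_flow *flow,
                                           unsigned int len)
        {
                int node = numa_node_id();
                struct demo_flow_stats *stats = flow->stats[node];

                if (unlikely(!stats)) {
                        /*
                         * Allocation on the packet path: GFP_NOWAIT never
                         * blocks, and __GFP_NOMEMALLOC stays away from the
                         * emergency reserves, keeping the latency bounded.
                         */
                        stats = kmalloc_node(sizeof(*stats),
                                             GFP_NOWAIT | __GFP_NOMEMALLOC,
                                             node);
                        if (stats) {
                                spin_lock_init(&stats->lock);
                                stats->packet_count = 0;
                                stats->byte_count = 0;
                                /* The real code publishes this via RCU and
                                 * copes with two CPUs racing to allocate. */
                                flow->stats[node] = stats;
                        } else {
                                /* Fall back to the preallocated slot 0. */
                                stats = flow->stats[0];
                        }
                }

                spin_lock(&stats->lock);
                stats->packet_count++;
                stats->byte_count += len;
                spin_unlock(&stats->lock);
        }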
    • openvswitch: Remove 5-tuple optimization. · 23dabf88
      Jarno Rajahalme authored
      The 5-tuple optimization becomes unnecessary with the later
      per-NUMA-node stats patch.  Remove it first to make those changes
      easier to grasp.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: use const in some local vars and casts · 7085130b
      Daniele Di Proietto authored
      In a few functions, const formal parameters are assigned or cast to
      non-const.  These changes suppress the warnings emitted when compiling
      with -Wcast-qual; an illustrative example follows this commit entry.
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
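
      For illustration, the general shape of such a fix; the functions below
      are hypothetical and not taken from the patch itself:

        /* Before: the cast silently drops the qualifier, and building
         * with gcc -Wcast-qual warns about it. */
        static u32 key_word_bad(const struct sw_flow_key *key, int i)
        {
                u32 *data = (u32 *)key; /* warning: cast discards 'const' */

                return data[i];
        }

        /* After: the qualifier is carried through the cast, so the
         * compiler stays quiet and the constness contract is kept. */
        static u32 key_word(const struct sw_flow_key *key, int i)
        {
                const u32 *data = (const u32 *)key;

                return data[i];
        }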
  7. 05 Feb 2014, 2 commits
  8. 10 Jan 2014, 1 commit
  9. 07 Jan 2014, 6 commits
  10. 18 Dec 2013, 1 commit
    • net: ovs: use CRC32 accelerated flow hash if available · 500f8087
      Francesco Fusco authored
      Currently OVS uses jhash2() for calculating flow hashes in its
      internal flow_hash() function. The performance of the flow_hash()
      function is critical, as the input data can be hundreds of bytes
      long.
      
      OVS is largely deployed in x86_64-based datacenters.  Therefore, we
      argue that the performance-critical fast path of OVS should exploit
      underlying CPU features in order to reduce the per-packet processing
      costs.  We replace jhash2 with the hash implementation provided by the
      kernel hash lib, which exploits the crc32l instruction to achieve high
      performance.
      
      Our patch greatly reduces the hash footprint from ~200 cycles of
      jhash2() to around ~90 cycles in the case of ovs_flow_hash_crc()
      (measured with rdtsc over maximum-length flow keys on an Intel i7
      CPU).
      
      Additionally, we wrote a microbenchmark to stress the flow table
      performance.  The benchmark inserts random flows into the flow hash
      and then performs lookups.  Deployed on a CRC32-capable CPU, our hash
      reduces the lookup time for 1000 flows and 100 masks from ~10,100us to
      ~6,700us, for example.
      
      Thus, simply use the newly introduced arch_fast_hash2() as a drop-in
      replacement; a sketch of the idea follows this commit entry.
      Signed-off-by: Francesco Fusco <ffusco@redhat.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Thomas Graf <tgraf@redhat.com>
      Acked-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
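
      For illustration, a userspace sketch of the same technique; this is
      not the kernel's arch_fast_hash2() implementation, just the idea of
      folding a u32-word key through the SSE 4.2 crc32 instruction:

        #include <stddef.h>
        #include <stdint.h>

        #ifdef __SSE4_2__
        #include <nmmintrin.h>          /* _mm_crc32_u32() */

        /* Hash a flow key, viewed as u32 words, with the hardware crc32
         * instruction; non-SSE 4.2 builds would fall back to jhash2(). */
        static uint32_t flow_hash_crc32(const uint32_t *key, size_t words,
                                        uint32_t seed)
        {
                uint32_t crc = seed;
                size_t i;

                for (i = 0; i < words; i++)
                        crc = _mm_crc32_u32(crc, key[i]);
                return crc;
        }
        #endif /* __SSE4_2__ */

      Roughly one crc32 instruction per 32-bit word is what the cycle counts
      quoted above reflect. The flip side is the linearity that commit
      87545899 at the top of this log later objects to: for equal-length
      inputs, the hash of (a ^ b) is predictable from the hashes of a and b,
      something a well-mixed hash such as jhash2() avoids.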
  11. 02 Nov 2013, 1 commit
  12. 23 Oct 2013, 1 commit
  13. 04 Oct 2013, 3 commits