1. 17 May 2014, 9 commits
    • openvswitch: Per NUMA node flow stats. · 63e7959c
      Authored by Jarno Rajahalme
      Keep kernel flow stats for each NUMA node rather than for each
      (logical) CPU.  This avoids using the per-CPU allocator, removes most
      of the kernel-side OVS locking overhead that otherwise sits at the
      top of perf reports, and allows OVS to scale better with a higher
      number of threads.
      
      With 9 handlers and 4 revalidators, the netperf TCP_CRR flow setup
      rate doubles on a server with two hyper-threaded physical CPUs (16
      logical cores each) compared to the current OVS master.  Tested with
      a non-trivial flow table containing a TCP port match rule that forces
      all new connections with unique port numbers to OVS userspace.  The
      IP addresses are still wildcarded, so the kernel flows are not
      considered exact-match 5-tuple flows.  Flows of this type can be
      expected to appear in large numbers as a result of the more effective
      wildcarding made possible by improvements in the OVS userspace flow
      classifier.
      
      Perf results for this test (master):
      
      Events: 305K cycles
      +   8.43%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
      +   5.64%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   4.75%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
      +   3.32%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
      +   2.61%     ovs-vswitchd  [kernel.kallsyms]   [k] pcpu_alloc_area
      +   2.19%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
      +   2.03%          swapper  [kernel.kallsyms]   [k] intel_idle
      +   1.84%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
      +   1.64%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
      +   1.58%     ovs-vswitchd  libc-2.15.so        [.] 0x7f4e6
      +   1.07%     ovs-vswitchd  [kernel.kallsyms]   [k] memset
      +   1.03%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   0.92%          swapper  [kernel.kallsyms]   [k] __ticket_spin_lock
      ...
      
      And after this patch:
      
      Events: 356K cycles
      +   6.85%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
      +   4.63%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
      +   3.06%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
      +   2.81%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
      +   2.51%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
      +   2.27%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
      +   1.84%     ovs-vswitchd  libc-2.15.so        [.] 0x15d30f
      +   1.74%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
      +   1.47%          swapper  [kernel.kallsyms]   [k] intel_idle
      +   1.34%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask
      +   1.33%     ovs-vswitchd  ovs-vswitchd        [.] rule_actions_unref
      +   1.16%     ovs-vswitchd  ovs-vswitchd        [.] hindex_node_with_hash
      +   1.16%     ovs-vswitchd  ovs-vswitchd        [.] do_xlate_actions
      +   1.09%     ovs-vswitchd  ovs-vswitchd        [.] ofproto_rule_ref
      +   1.01%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
      ...
      
      There is a small increase in kernel spinlock overhead due to the same
      spinlock being shared between multiple cores of the same physical
      CPU, but it is barely visible in netperf TCP_CRR performance (perhaps
      a ~1% drop, though this is hard to tell exactly due to variance in
      the test results) when testing kernel-module throughput with no
      userspace activity and only a handful of kernel flows.
      
      On flow setup, a single stats instance is allocated (for NUMA node
      0).  As CPUs from multiple NUMA nodes start updating stats, new
      NUMA-node-specific stats instances are allocated.  This allocation on
      the packet processing code path never blocks and never dips into
      emergency memory pools, minimizing allocation latency.  If the
      allocation fails, the existing preallocated stats instance is used.
      Also, if only CPUs from one NUMA node update the preallocated stats
      instance, no additional stats instances are allocated.  This
      eliminates the need to preallocate stats instances that will not be
      used, and also relieves the stats reader from the burden of reading
      stats that are never used.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: Remove 5-tuple optimization. · 23dabf88
      Authored by Jarno Rajahalme
      The 5-tuple optimization becomes unnecessary with a later per-NUMA
      node stats patch.  Remove it first to make the changes easier to
      grasp.
      Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: Use ether_addr_copy · 8c63ff09
      Authored by Joe Perches
      It's slightly smaller/faster for some architectures.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: flow_netlink: Use pr_fmt to OVS_NLERR output · 2235ad1c
      Authored by Joe Perches
      Add "openvswitch: " prefix to OVS_NLERR output
      to match the other OVS_NLERR output of datapath.c
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: Use net_ratelimit in OVS_NLERR · 1815a883
      Authored by Joe Perches
      Each use of pr_<level>_once has a per-site flag.
      
      Some of the OVS_NLERR messages look as if seeing them
      multiple times could be useful, so use net_ratelimit()
      instead of pr_info_once.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: Added (unsigned long long) cast in printf · cc23ebf3
      Authored by Daniele Di Proietto
      This is necessary because u64 is not unsigned long long on all
      architectures: u64 could also be uint64_t.
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: avoid cast-qual warning in vport_priv · 07dc0602
      Authored by Daniele Di Proietto
      This function must cast a const value to a non-const value.  Adding a
      uintptr_t cast suppresses the warning.  Avoiding the cast entirely
      (the proper solution) would require changing several function
      signatures.
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: avoid warnings in vport_from_priv · d0b4da13
      Authored by Daniele Di Proietto
      First, this change avoids declaring the formal parameter const, since
      it is treated as non-const (avoiding -Wcast-qual).  Second, it casts
      the pointer from void * to u8 *, since the pointer is used in
      arithmetic (avoiding -Wpointer-arith).
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • openvswitch: use const in some local vars and casts · 7085130b
      Authored by Daniele Di Proietto
      In a few functions, const formal parameters are assigned or cast to
      non-const.  These changes suppress the warnings produced when
      compiling with -Wcast-qual.
      Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
  2. 16 May 2014, 25 commits
  3. 15 May 2014, 6 commits