1. 02 7月, 2014 8 次提交
    • E
      inet: move ipv6only in sock_common · 9fe516ba
      Eric Dumazet 提交于
      When an UDP application switches from AF_INET to AF_INET6 sockets, we
      have a small performance degradation for IPv4 communications because of
      extra cache line misses to access ipv6only information.
      
      This can also be noticed for TCP listeners, as ipv6_only_sock() is also
      used from __inet_lookup_listener()->compute_score()
      
      This is magnified when SO_REUSEPORT is used.
      
      Move ipv6only into struct sock_common so that it is available at
      no extra cost in lookups.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fe516ba
    • J
      pktgen: RCU-ify "if_list" to remove lock in next_to_run() · 8788370a
      Jesper Dangaard Brouer 提交于
      The if_lock()/if_unlock() in next_to_run() adds a significant
      overhead, because its called for every packet in busy loop of
      pktgen_thread_worker().  (Thomas Graf originally pointed me
      at this lock problem).
      
      Removing these two "LOCK" operations should in theory save us approx
      16ns (8ns x 2), as illustrated below we do save 16ns when removing
      the locks and introducing RCU protection.
      
      Performance data with CLONE_SKB==100000, TX-size=512, rx-usecs=30:
       (single CPU performance, ixgbe 10Gbit/s, E5-2630)
       * Prev   : 5684009 pps --> 175.93ns (1/5684009*10^9)
       * RCU-fix: 6272204 pps --> 159.43ns (1/6272204*10^9)
       * Diff   : +588195 pps --> -16.50ns
      
      To understand this RCU patch, I describe the pktgen thread model
      below.
      
      In pktgen there is several kernel threads, but there is only one CPU
      running each kernel thread.  Communication with the kernel threads are
      done through some thread control flags.  This allow the thread to
      change data structures at a know synchronization point, see main
      thread func pktgen_thread_worker().
      
      Userspace changes are communicated through proc-file writes.  There
      are three types of changes, general control changes "pgctrl"
      (func:pgctrl_write), thread changes "kpktgend_X"
      (func:pktgen_thread_write), and interface config changes "etcX@N"
      (func:pktgen_if_write).
      
      Userspace "pgctrl" and "thread" changes are synchronized via the mutex
      pktgen_thread_lock, thus only a single userspace instance can run.
      The mutex is taken while the packet generator is running, by pgctrl
      "start".  Thus e.g. "add_device" cannot be invoked when pktgen is
      running/started.
      
      All "pgctrl" and all "thread" changes, except thread "add_device",
      communicate via the thread control flags.  The main problem is the
      exception "add_device", that modifies threads "if_list" directly.
      
      Fortunately "add_device" cannot be invoked while pktgen is running.
      But there exists a race between "rem_device_all" and "add_device"
      (which normally don't occur, because "rem_device_all" waits 125ms
      before returning). Background'ing "rem_device_all" and running
      "add_device" immediately allow the race to occur.
      
      The race affects the threads (list of devices) "if_list".  The if_lock
      is used for protecting this "if_list".  Other readers are given
      lock-free access to the list under RCU read sections.
      
      Note, interface config changes (via proc) can occur while pktgen is
      running, which worries me a bit.  I'm assuming proc_remove() takes
      appropriate locks, to assure no writers exists after proc_remove()
      finish.
      
      I've been running a script exercising the race condition (leading me
      to fix the proc_remove order), without any issues.  The script also
      exercises concurrent proc writes, while the interface config is
      getting removed.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8788370a
    • J
      pktgen: avoid expensive set_current_state() call in loop · baac167b
      Jesper Dangaard Brouer 提交于
      Avoid calling set_current_state() inside the busy-loop in
      pktgen_thread_worker().  In case of pkt_dev->delay, then it is still
      used/enabled in pktgen_xmit() via the spin() call.
      
      The set_current_state(TASK_INTERRUPTIBLE) uses a xchg, which implicit
      is LOCK prefixed.  I've measured the asm LOCK operation to take approx
      8ns on this E5-2630 CPU.  Performance increase corrolate with this
      measurement.
      
      Performance data with CLONE_SKB==100000, rx-usecs=30:
       (single CPU performance, ixgbe 10Gbit/s, E5-2630)
       * Prev:  5454050 pps --> 183.35ns (1/5454050*10^9)
       * Now:   5684009 pps --> 175.93ns (1/5684009*10^9)
       * Diff:  +229959 pps -->  -7.42ns
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      baac167b
    • J
      openvswitch: introduce rtnl ops stub · 5b9e7e16
      Jiri Pirko 提交于
      This stub now allows userspace to see IFLA_INFO_KIND for ovs master and
      IFLA_INFO_SLAVE_KIND for slave.
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b9e7e16
    • J
      rtnetlink: allow to register ops without ops->setup set · b0ab2fab
      Jiri Pirko 提交于
      So far, it is assumed that ops->setup is filled up. But there might be
      case that ops might make sense even without ->setup. In that case,
      forbid to newlink and dellink.
      
      This allows to register simple rtnl link ops containing only ->kind.
      That allows consistent way of passing device kind (either device-kind or
      slave-kind) to userspace.
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b0ab2fab
    • Y
      net: fix some typos in comment · 9bf2b8c2
      Ying Xue 提交于
      In commit 37112105("net:
      QDISC_STATE_RUNNING dont need atomic bit ops") the
      __QDISC_STATE_RUNNING is renamed to __QDISC___STATE_RUNNING,
      but the old names existing in comment are not replaced with
      the new name completely.
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9bf2b8c2
    • B
      ipv6: Allow accepting RA from local IP addresses. · d9333196
      Ben Greear 提交于
      This can be used in virtual networking applications, and
      may have other uses as well.  The option is disabled by
      default.
      
      A specific use case is setting up virtual routers, bridges, and
      hosts on a single OS without the use of network namespaces or
      virtual machines.  With proper use of ip rules, routing tables,
      veth interface pairs and/or other virtual interfaces,
      and applications that can bind to interfaces and/or IP addresses,
      it is possibly to create one or more virtual routers with multiple
      hosts attached.  The host interfaces can act as IPv6 systems,
      with radvd running on the ports in the virtual routers.  With the
      option provided in this patch enabled, those hosts can now properly
      obtain IPv6 addresses from the radvd.
      Signed-off-by: NBen Greear <greearb@candelatech.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9333196
    • B
      ipv6: Add more debugging around accept-ra logic. · f2a762d8
      Ben Greear 提交于
      This is disabled by default, just like similar debug info
      already in this module.  But, makes it easier to find out
      why RA is not being accepted when debugging strange behaviour.
      Signed-off-by: NBen Greear <greearb@candelatech.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f2a762d8
  2. 30 6月, 2014 1 次提交
  3. 28 6月, 2014 26 次提交
  4. 26 6月, 2014 5 次提交