1. 17 Nov, 2008 7 commits
    • dccp: Deprecate old setsockopt framework · 49aebc66
      Committed by Gerrit Renker
      The previous setsockopt interface, which passed socket options via struct
      dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
      ugly code since the old approach did not distinguish between NN and SP values.
      
      This patch removes the old setsockopt interface and replaces it with two new
      functions to register NN/SP values for feature negotiation. 
      These are essentially wrappers around the internal __feat_register functions,
      with checking added to avoid
      
       * wrong usage (type);
       * changing values while the connection is in progress.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: Mechanism to resolve CCID dependencies · 0c116839
      Committed by Gerrit Renker
      This adds a hook to resolve features whose value depends on the choice of
      CCID. It is done at the server since it can only be done after the CCID
      values have been negotiated; i.e. the client will add its CCID preference
      list on the Change options sent in the Request, which will be reconciled
      with the local preference list of the server.
      
      The concept is documented at
      http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/implementation_notes.html#ccid_dependencies
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: use %pF for /proc/net/ptype · 908cd2da
      Committed by Alexey Dobriyan
      Technically, this patch changes the output format for modules, but I think nobody cares.
      
      	-86dd          :ipv6:ipv6_rcv+0x0
      	+86dd          ipv6_rcv+0x0/0x400 [ipv6]
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls · 3ab5aee7
      Committed by Eric Dumazet
      RCU was added to UDP lookups, using a fast infrastructure:
      - The sockets kmem_cache uses SLAB_DESTROY_BY_RCU and doesn't pay the
        price of call_rcu() at freeing time.
      - hlist_nulls permits the use of fewer memory barriers.
      
      This patch uses the same infrastructure for TCP/DCCP established
      and timewait sockets.
      
      Thanks to SLAB_DESTROY_BY_RCU, there is no slowdown for applications
      using short-lived TCP connections. A follow-up patch, converting
      rwlocks to spinlocks, will even speed up this case.
      
      __inet_lookup_established() is pretty fast now that we don't have to
      dirty a contended cache line (read_lock/read_unlock).
      
      Only the established and timewait hash tables are converted to RCU
      (the bind table and listen table still use traditional locking).
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
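      Because SLAB_DESTROY_BY_RCU lets a freed socket be reused immediately, a
      lockless lookup has to take its reference conditionally and then re-check
      the key. The following is a minimal userspace sketch of that idea using C11
      atomics. The names (struct conn, conn_get_unless_zero, conn_validate) are
      hypothetical and are not the kernel's own functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket living in a type-stable slab cache. */
struct conn {
    atomic_int refcnt;   /* 0 means the object is free or being recycled */
    unsigned int key;    /* stands in for the connection 4-tuple */
};

/* Take a reference only if the object is still live (refcnt > 0). */
static bool conn_get_unless_zero(struct conn *c)
{
    int old = atomic_load(&c->refcnt);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&c->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

static void conn_put(struct conn *c)
{
    atomic_fetch_sub(&c->refcnt, 1);
}

/*
 * Returns c with a reference held, or NULL when the caller should restart
 * its lookup because the object died or was recycled under it.
 */
static struct conn *conn_validate(struct conn *c, unsigned int key)
{
    if (!conn_get_unless_zero(c))
        return NULL;
    if (c->key != key) {      /* memory was reused for another connection */
        conn_put(c);
        return NULL;
    }
    return c;
}
```

      A caller would run conn_validate() on each candidate found during the
      chain walk and restart the walk when it returns NULL.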
    • udp: Use hlist_nulls in UDP RCU code · 88ab1932
      Committed by Eric Dumazet
      This is a straightforward patch, using hlist_nulls infrastructure.
      
      RCUification was already done on UDP two weeks ago.
      
      Using hlist_nulls permits us to avoid some memory barriers, both
      at lookup time and delete time.
      
      Patch is large because it adds new macros to include/net/sock.h.
      These macros will be used by TCP & DCCP in the next patch.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
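      The idea behind a "nulls" list can be sketched in userspace: each chain
      ends in an odd-valued marker that encodes the bucket index rather than in
      NULL. A lockless reader that was carried onto another chain by a recycled
      node can then notice it and restart. All names below are illustrative
      assumptions, not the macros this patch adds to include/net/sock.h, and the
      sketch is single-threaded (no RCU, no barriers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 4u

struct node {
    struct node *next;   /* a real node, or an encoded end marker */
    unsigned int key;
};

/* An end marker is an odd "pointer" whose upper bits hold the bucket index. */
static struct node *make_nulls(uintptr_t bucket)
{
    return (struct node *)((bucket << 1) | 1u);
}

static int is_nulls(const struct node *p)
{
    return (int)((uintptr_t)p & 1u);
}

static uintptr_t nulls_value(const struct node *p)
{
    return (uintptr_t)p >> 1;
}

static struct node *table[NBUCKETS];

static void table_init(void)
{
    for (uintptr_t i = 0; i < NBUCKETS; i++)
        table[i] = make_nulls(i);
}

static void table_insert(struct node *n)
{
    unsigned int b = n->key % NBUCKETS;

    n->next = table[b];
    table[b] = n;
}

/* Walk one chain; on a miss, report which bucket's marker ended the walk. */
static struct node *chain_find(struct node *head, unsigned int key,
                               uintptr_t *end_bucket)
{
    struct node *p = head;

    while (!is_nulls(p)) {
        if (p->key == key)
            return p;
        p = p->next;
    }
    *end_bucket = nulls_value(p);
    return NULL;
}

static struct node *table_lookup(unsigned int key)
{
    unsigned int b = key % NBUCKETS;
    uintptr_t end_bucket;
    struct node *found;

    do {
        end_bucket = b;
        found = chain_find(table[b], key, &end_bucket);
        /* A miss that ended on a foreign marker is unreliable: retry. */
    } while (found == NULL && end_bucket != b);
    return found;
}
```

      In this single-threaded form the retry never triggers. It matters only
      when a concurrent writer moves a node to another bucket during the walk.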
    • TPROXY: implemented IP_RECVORIGDSTADDR socket option · e8b2dfe9
      Committed by Balazs Scheidler
      In case UDP traffic is redirected to a local UDP socket,
      the originally addressed destination address/port
      cannot be recovered with the in-kernel tproxy.
      
      This patch adds an IP_RECVORIGDSTADDR sockopt that enables
      an IP_ORIGDSTADDR ancillary message in recvmsg(). This
      ancillary message contains the original destination address/port
      of the packet being received.
      Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
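      From userspace, the option described above could be used roughly as
      follows. This is a sketch under assumptions: the helper names
      enable_origdst and extract_origdst are hypothetical, the ancillary payload
      is assumed to be a struct sockaddr_in, and the fallback define is only for
      headers that lack the constant.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IP_ORIGDSTADDR
#define IP_ORIGDSTADDR 20                    /* value from <linux/in.h> */
#endif
#ifndef IP_RECVORIGDSTADDR
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#endif

/* Ask the kernel to attach the original destination to each datagram. */
static int enable_origdst(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVORIGDSTADDR, &on, sizeof(on));
}

/*
 * Scan the ancillary data of a message filled in by recvmsg().
 * Returns 0 and fills *dst on success, -1 if no such message is present.
 */
static int extract_origdst(struct msghdr *msg, struct sockaddr_in *dst)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP &&
            c->cmsg_type == IP_ORIGDSTADDR &&
            c->cmsg_len >= CMSG_LEN(sizeof(*dst))) {
            memcpy(dst, CMSG_DATA(c), sizeof(*dst));
            return 0;
        }
    }
    return -1;
}
```

      A receiver would call enable_origdst() once on the UDP socket, pass a
      control buffer to recvmsg(), and then call extract_origdst() on the
      returned msghdr.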
    • ipv4: Fix ARP behavior with many mac-vlans · 8164f1b7
      Committed by Ben Greear
      Ben Greear wrote:
      > I have 500 mac-vlans on a system talking to 500 other
      > mac-vlans.  My problem is that the arp-table gets extremely
      > huge because every time an arp-request comes in on all mac-vlans,
      > a stale arp entry is added for each mac-vlan.  I have filtering
      > turned on, but that doesn't help because the neigh_event_ns call
      > below will cause a stale neighbor entry to be created regardless
      > of whether a reply will be sent or not.
      > Maybe the neigh_event code should be below the checks for dont_send,
      > and only create check neigh_event_ns if we are !dont_send?
      
      The attached patch makes it work much better for me.  The patch
      will cause the code to NOT create a stale neighbor entry if we
      are not going to respond to the ARP request.  The old code
      *would* create a stale entry even if we are not going to respond.
      Signed-off-by: Ben Greear <greearb@candelatech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 Nov, 2008 2 commits
    • net: speedup dst_release() · ef711cf1
      Committed by Eric Dumazet
      During tbench/oprofile sessions, I found that dst_release() was in third position.
      
      CPU: Core 2, speed 2999.68 MHz (estimated)
      Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
      samples  %        symbol name
      483726    9.0185  __copy_user_zeroing_intel
      191466    3.5697  __copy_user_intel
      185475    3.4580  dst_release
      175114    3.2648  ip_queue_xmit
      153447    2.8608  tcp_sendmsg
      108775    2.0280  tcp_recvmsg
      102659    1.9140  sysenter_past_esp
      101450    1.8914  tcp_current_mss
      95067     1.7724  __copy_from_user_ll
      86531     1.6133  tcp_transmit_skb
      
      Of course, all CPUs fight over the dst_entry associated with 127.0.0.1.
      
      Instead of first checking the refcount value and then decrementing it,
      we use atomic_dec_return() to help the CPU make the right memory transaction
      (i.e. getting the cache line in exclusive mode).
      
      dst_release() is now in fifth position, and tbench is a little bit faster ;)
      
      CPU: Core 2, speed 3000.1 MHz (estimated)
      Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
      samples  %        symbol name
      647107    8.8072  __copy_user_zeroing_intel
      258840    3.5229  ip_queue_xmit
      258302    3.5155  __copy_user_intel
      209629    2.8531  tcp_sendmsg
      165632    2.2543  dst_release
      149232    2.0311  tcp_current_mss
      147821    2.0119  tcp_recvmsg
      137893    1.8767  sysenter_past_esp
      127473    1.7349  __copy_from_user_ll
      121308    1.6510  ip_finish_output
      118510    1.6129  tcp_transmit_skb
      109295    1.4875  tcp_v4_rcv
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
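      The change in shape can be illustrated with C11 atomics in userspace. This
      is only an analogue: struct entry and both function names are
      hypothetical, atomic_fetch_sub() stands in for the kernel's
      atomic_dec_return(), and the cache-line effect itself is the commit's
      claim, not something the sketch measures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a dst_entry. */
struct entry {
    atomic_int refcnt;
};

/*
 * Old shape: a plain load for the sanity check, followed by a separate
 * read-modify-write. The load may first fetch the cache line shared.
 * Returns false if the count was already below 1 before the release.
 */
static bool release_check_then_dec(struct entry *e)
{
    bool ok = atomic_load(&e->refcnt) >= 1;

    atomic_fetch_sub(&e->refcnt, 1);
    return ok;
}

/*
 * New shape: one read-modify-write, with the sanity check done on the
 * value it returns. Returns false if the count dropped below 0.
 */
static bool release_dec_return(struct entry *e)
{
    int newcnt = atomic_fetch_sub(&e->refcnt, 1) - 1;

    return newcnt >= 0;
}
```

      Both variants flag the same underflow. The second touches the counter
      once instead of twice.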
    • pkt_sched: Remove qdisc->ops->requeue() etc. · f30ab418
      Committed by Jarek Poplawski
      After implementing qdisc->ops->peek() and changing sch_netem into a
      classless qdisc, there are no more qdisc->ops->requeue() users. This
      patch removes this method with its wrappers (qdisc_requeue()), and
      also the unused qdisc->requeue structure. It also includes a few minor
      fixes of warnings (htb_enqueue()) and comments.
      
      The idea to kill ->requeue() and a similar patch were first developed
      by David S. Miller.
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
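      One way to see why peek() leaves requeue() without users: a parent that
      inspects the head packet before removing it never has to put a packet
      back. The toy shaper below sketches that pattern in userspace. The fifo
      type, the function names, and the byte-budget policy are all assumptions
      made for illustration and do not mirror the qdisc API.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN 8

/* Toy FIFO that stores packet lengths only (a length of 0 means "none"). */
struct fifo {
    int len[QLEN];
    size_t head, tail;
};

static int fifo_enqueue(struct fifo *q, int pkt_len)
{
    if (q->tail - q->head == QLEN)
        return -1;                       /* queue full */
    q->len[q->tail++ % QLEN] = pkt_len;
    return 0;
}

/* Look at the head packet without removing it; 0 if the queue is empty. */
static int fifo_peek(const struct fifo *q)
{
    return q->head == q->tail ? 0 : q->len[q->head % QLEN];
}

static int fifo_dequeue(struct fifo *q)
{
    return q->head == q->tail ? 0 : q->len[q->head++ % QLEN];
}

/*
 * Send the head packet only if it fits the remaining byte budget.
 * Returns the length sent, or 0 if nothing could be sent.
 */
static int shaper_dequeue(struct fifo *q, int *budget)
{
    int len = fifo_peek(q);

    if (len == 0 || len > *budget)
        return 0;            /* packet stays queued: nothing to requeue */
    *budget -= len;
    return fifo_dequeue(q);
}
```

      With a dequeue-first design, the over-budget case would need a requeue
      step to restore the packet at the head of the queue.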
  3. 13 Nov, 2008 2 commits
  4. 12 Nov, 2008 8 commits
  5. 11 Nov, 2008 21 commits