1. 11 6月, 2012 4 次提交
    • D
      8e773277
    • D
      inet: Add family scope inetpeer flushes. · b48c80ec
      David S. Miller 提交于
      This implementation can deal with having many inetpeer roots, which is
      a necessary prerequisite for per-FIB table rooted peer tables.
      
      Each family (AF_INET, AF_INET6) has a sequence number which we bump
      when we get a family invalidation request.
      
      Each peer lookup cheaply checks whether the flush sequence of the
      root we are using is out of date, and if so flushes it and updates
      the sequence number.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b48c80ec
    • D
      ipv4: Kill ip_rt_frag_needed(). · 46517008
      David S. Miller 提交于
      There is zero point to this function.
      
      It's only real substance is to perform an extremely outdated BSD4.2
      ICMP check, which we can safely remove.  If you really have a MTU
      limited link being routed by a BSD4.2 derived system, here's a nickel
      go buy yourself a real router.
      
      The other actions of ip_rt_frag_needed(), checking and conditionally
      updating the peer, are done by the per-protocol handlers of the ICMP
      event.
      
      TCP, UDP, et al. have a handler which will receive this event and
      transmit it back into the associated route via dst_ops->update_pmtu().
      
      This simplification is important, because it eliminates the one place
      where we do not have a proper route context in which to make an
      inetpeer lookup.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46517008
    • D
      inet: Hide route peer accesses behind helpers. · 97bab73f
      David S. Miller 提交于
      We encode the pointer(s) into an unsigned long with one state bit.
      
      The state bit is used so we can store the inetpeer tree root to use
      when resolving the peer later.
      
      Later the peer roots will be per-FIB table, and this change works to
      facilitate that.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      97bab73f
  2. 10 6月, 2012 6 次提交
  3. 09 6月, 2012 5 次提交
    • D
      tcp: Get rid of inetpeer special cases. · 4670fd81
      David S. Miller 提交于
      The get_peer method TCP uses is full of special cases that make no
      sense accommodating, and it also gets in the way of doing more
      reasonable things here.
      
      First of all, if the socket doesn't have a usable cached route, there
      is no sense in trying to optimize timewait recycling.
      
      Likewise for the case where we have IP options, such as SRR enabled,
      that make the IP header destination address (and thus the destination
      address of the route key) differ from that of the connection's
      destination address.
      
      Just return a NULL peer in these cases, and thus we're also able to
      get rid of the clumsy inetpeer release logic.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4670fd81
    • D
      inet: Create and use rt{,6}_get_peer_create(). · fbfe95a4
      David S. Miller 提交于
      There's a lot of places that open-code rt{,6}_get_peer() only because
      they want to set 'create' to one.  So add an rt{,6}_get_peer_create()
      for their sake.
      
      There were also a few spots open-coding plain rt{,6}_get_peer() and
      those are transformed here as well.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fbfe95a4
    • E
      af_unix: speedup /proc/net/unix · 7123aaa3
      Eric Dumazet 提交于
      /proc/net/unix has quadratic behavior, and can hold unix_table_lock for
      a while if high number of unix sockets are alive. (90 ms for 200k
      sockets...)
      
      We already have a hash table, so its quite easy to use it.
      
      Problem is unbound sockets are still hashed in a single hash slot
      (unix_socket_table[UNIX_HASH_TABLE])
      
      This patch also spreads unbound sockets to 256 hash slots, to speedup
      both /proc/net/unix and unix_diag.
      
      Time to read /proc/net/unix with 200k unix sockets :
      (time dd if=/proc/net/unix of=/dev/null bs=4k)
      
      before : 520 secs
      after : 2 secs
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7123aaa3
    • G
      inetpeer: add parameter net for inet_getpeer_v4,v6 · 54db0cc2
      Gao feng 提交于
      add struct net as a parameter of inet_getpeer_v[4,6],
      use net to replace &init_net.
      
      and modify some places to provide net for inet_getpeer_v[4,6]
      Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      54db0cc2
    • G
      inetpeer: add namespace support for inetpeer · c8a627ed
      Gao feng 提交于
      now inetpeer doesn't support namespace,the information will
      be leaking across namespace.
      
      this patch move the global vars v4_peers and v6_peers to
      netns_ipv4 and netns_ipv6 as a field peers.
      
      add struct pernet_operations inetpeer_ops to initial pernet
      inetpeer data.
      
      and change family_to_base and inet_getpeer to support namespace.
      Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8a627ed
  4. 08 6月, 2012 2 次提交
  5. 07 6月, 2012 1 次提交
  6. 06 6月, 2012 1 次提交
  7. 05 6月, 2012 10 次提交
  8. 04 6月, 2012 6 次提交
  9. 03 6月, 2012 1 次提交
    • L
      tty: Revert the tty locking series, it needs more work · f309532b
      Linus Torvalds 提交于
      This reverts the tty layer change to use per-tty locking, because it's
      not correct yet, and fixing it will require some more deep surgery.
      
      The main revert is d29f3ef3 ("tty_lock: Localise the lock"), but
      there are several smaller commits that built upon it, they also get
      reverted here. The list of reverted commits is:
      
        fde86d31 - tty: add lockdep annotations
        8f6576ad - tty: fix ldisc lock inversion trace
        d3ca8b64 - pty: Fix lock inversion
        b1d679af - tty: drop the pty lock during hangup
        abcefe5f - tty/amiserial: Add missing argument for tty_unlock()
        fd11b42e - cris: fix missing tty arg in wait_event_interruptible_tty call
        d29f3ef3 - tty_lock: Localise the lock
      
      The revert had a trivial conflict in the 68360serial.c staging driver
      that got removed in the meantime.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f309532b
  10. 02 6月, 2012 2 次提交
    • E
      tcp: reflect SYN queue_mapping into SYNACK packets · fff32699
      Eric Dumazet 提交于
      While testing how linux behaves on SYNFLOOD attack on multiqueue device
      (ixgbe), I found that SYNACK messages were dropped at Qdisc level
      because we send them all on a single queue.
      
      Obvious choice is to reflect incoming SYN packet @queue_mapping to
      SYNACK packet.
      
      Under stress, my machine could only send 25.000 SYNACK per second (for
      200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues.
      
      After patch, not a single SYNACK is dropped.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fff32699
    • E
      tcp: do not create inetpeer on SYNACK message · 7433819a
      Eric Dumazet 提交于
      Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting
      larger and larger, using lots of memory and cpu time.
      
      tcp_v4_send_synack()
      ->inet_csk_route_req()
       ->ip_route_output_flow()
        ->rt_set_nexthop()
         ->rt_init_metrics()
          ->inet_getpeer( create = true)
      
      This is a side effect of commit a4daad6b (net: Pre-COW metrics for
      TCP) added in 2.6.39
      
      Possible solution :
      
      Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS
      
      Before patch :
      
      # grep peer /proc/slabinfo
      inet_peer_cache   4175430 4175430    192   42    2 : tunables    0    0    0 : slabdata  99415  99415      0
      
      Samples: 41K of event 'cycles', Event count (approx.): 30716565122
      +  20,24%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_getpeer
      +   8,19%      ksoftirqd/0  [kernel.kallsyms]           [k] peer_avl_rebalance.isra.1
      +   4,81%      ksoftirqd/0  [kernel.kallsyms]           [k] sha_transform
      +   3,64%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_table_lookup
      +   2,36%      ksoftirqd/0  [ixgbe]                     [k] ixgbe_poll
      +   2,16%      ksoftirqd/0  [kernel.kallsyms]           [k] __ip_route_output_key
      +   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] kernel_map_pages
      +   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] ip_route_input_common
      +   2,01%      ksoftirqd/0  [kernel.kallsyms]           [k] __inet_lookup_established
      +   1,83%      ksoftirqd/0  [kernel.kallsyms]           [k] md5_transform
      +   1,75%      ksoftirqd/0  [kernel.kallsyms]           [k] check_leaf.isra.9
      +   1,49%      ksoftirqd/0  [kernel.kallsyms]           [k] ipt_do_table
      +   1,46%      ksoftirqd/0  [kernel.kallsyms]           [k] hrtimer_interrupt
      +   1,45%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_alloc
      +   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_csk_search_req
      +   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] __netif_receive_skb
      +   1,16%      ksoftirqd/0  [kernel.kallsyms]           [k] copy_user_generic_string
      +   1,15%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_free
      +   1,02%      ksoftirqd/0  [kernel.kallsyms]           [k] tcp_make_synack
      +   0,93%      ksoftirqd/0  [kernel.kallsyms]           [k] _raw_spin_lock_bh
      +   0,87%      ksoftirqd/0  [kernel.kallsyms]           [k] __call_rcu
      +   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] rt_garbage_collect
      +   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_rules_lookup
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7433819a
  11. 01 6月, 2012 2 次提交