1. 20 Jul, 2012 2 commits
  2. 17 Jul, 2012 1 commit
    • E
      tcp: implement RFC 5961 3.2 · 282f23c6
      Committed by Eric Dumazet
      Implement the RFC 5961 mitigation against the Blind
      Reset attack using the RST bit.
      
      The idea is to validate the incoming RST sequence number so that
      it must match the RCV.NXT value exactly, instead of falling anywhere
      in the previously accepted window: (RCV.NXT <= SEG.SEQ < RCV.NXT+RCV.WND)
      
      If the sequence number is in the window but not an exact match, send
      a "challenge ACK", so that the other party can resend an
      RST with the appropriate sequence number.
      
      Add a new sysctl, tcp_challenge_ack_limit, to limit the
      number of challenge ACKs sent per second.
      
      Add a new SNMP counter to count the number of challenge ACKs sent.
      (netstat -s | grep TCPChallengeACK)
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Kiran Kumar Kella <kkiran@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      282f23c6
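The RST check described in this commit can be sketched as a small decision function. This is an illustrative Python model, not the kernel code: the names `classify_rst`, `in_receive_window` and `ChallengeAckLimiter` are hypothetical, the silent drop of out-of-window RSTs follows RFC 5961 section 3.2 rather than the commit text, and the per-second limiter is only a guess at how a limit like tcp_challenge_ack_limit might be enforced.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def in_receive_window(seg_seq, rcv_nxt, rcv_wnd):
    """True if RCV.NXT <= SEG.SEQ < RCV.NXT + RCV.WND, modulo 2**32."""
    return (seg_seq - rcv_nxt) % SEQ_MOD < rcv_wnd


def classify_rst(seg_seq, rcv_nxt, rcv_wnd):
    """Decide how to react to an incoming RST segment (illustrative only)."""
    if (seg_seq - rcv_nxt) % SEQ_MOD == 0:
        return "reset"          # exact match with RCV.NXT: accept the RST
    if in_receive_window(seg_seq, rcv_nxt, rcv_wnd):
        return "challenge_ack"  # in window but not exact: ask the peer to confirm
    return "drop"               # outside the window: discard silently


class ChallengeAckLimiter:
    """Hypothetical limiter: at most `limit` challenge ACKs per wall-clock second."""

    def __init__(self, limit_per_second):
        self.limit = limit_per_second
        self.current_second = None
        self.count = 0

    def allow(self, now):
        second = int(now)
        if second != self.current_second:
            # A new one-second interval starts: reset the budget.
            self.current_second = second
            self.count = 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True
```

A blind attacker who only guesses a sequence number somewhere inside the window now triggers a challenge ACK instead of tearing down the connection.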
  3. 12 Jul, 2012 1 commit
    • E
      tcp: TCP Small Queues · 46d3ceab
      Committed by Eric Dumazet
      This introduces TSQ (TCP Small Queues).
      
      The goal of TSQ is to reduce the number of TCP packets in xmit queues
      (qdisc & device queues), to reduce RTT and cwnd bias, part of the
      bufferbloat problem.
      
      sk->sk_wmem_alloc is not allowed to grow above a given limit,
      allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
      given time.
      
      TSO packets are sized/capped to half the limit, so that we have two
      TSO packets in flight, allowing better bandwidth use.
      
      As a side effect, setting the limit to 40000 automatically reduces the
      standard gso max limit (65536) to 40000/2 : It can help to reduce
      latencies of high prio packets, having smaller TSO packets.
      
      This means we divert sock_wfree() to a tcp_wfree() handler, to
      queue/send following frames when skb_orphan() [2] is called for the
      already queued skbs.
      
      Results on my dev machines (tg3/ixgbe nics) are really impressive,
      using standard pfifo_fast, and with or without TSO/GSO.
      
      Without reduction of nominal bandwidth, we have reduction of buffering
      per bulk sender :
      < 1ms on Gbit (instead of 50ms with TSO)
      < 8ms on 100Mbit (instead of 132 ms)
      
      I no longer have 4 MBytes backlogged in qdisc by a single netperf
      session, and socket autotuning on both sides no longer uses 4 MBytes.
      
      As the skb destructor cannot restart xmit itself (as the qdisc lock
      might be taken at this point), we delegate the work to a tasklet. We use
      one tasklet per cpu for performance reasons.
      
      If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag.
      This flag is tested in a new protocol method called from release_sock(),
      to eventually send new segments.
      
      [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
      [2] skb_orphan() is usually called at TX completion time,
        but some drivers call it in their start_xmit() handler.
        These drivers should at least use BQL, or else a single TCP
        session can still fill the whole NIC TX ring, since TSQ will
        have no effect.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Dave Taht <dave.taht@bufferbloat.net>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Matt Mathis <mattmathis@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      46d3ceab
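The throttling idea behind TSQ can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel implementation: `ToySocket`, `try_send`, `tx_complete` and `tso_size_cap` are hypothetical names, the tasklet and TSQ_OWNED handling are left out, and the only facts taken from the commit message are the byte limit on sk_wmem_alloc, the TSO cap of half that limit, and the 65536 GSO maximum.

```python
GSO_MAX_SIZE = 65536  # "standard gso max limit" quoted in the commit message


def tso_size_cap(limit_output_bytes, gso_max=GSO_MAX_SIZE):
    """TSO packets are capped to half the limit, so two can be in flight."""
    return min(gso_max, limit_output_bytes // 2)


class ToySocket:
    """Toy sender that stops queueing once its queued bytes reach the limit."""

    def __init__(self, limit_output_bytes):
        self.limit = limit_output_bytes
        self.wmem_alloc = 0     # bytes currently held in qdisc/device queues
        self.throttled = False

    def try_send(self, nbytes):
        if self.wmem_alloc >= self.limit:
            # Too much already queued below TCP: wait for a TX completion.
            self.throttled = True
            return False
        self.wmem_alloc += nbytes
        return True

    def tx_complete(self, nbytes):
        """Called when a queued packet is freed; returns True if the sender
        had been throttled and should now try to send again."""
        self.wmem_alloc -= nbytes
        was_throttled = self.throttled
        self.throttled = False
        return was_throttled
```

With a 128 KB limit and 64 KB packets, the model lets two packets sit in the lower layers and defers the third until one of them completes.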
  4. 11 Jul, 2012 1 commit
  5. 01 Jul, 2012 2 commits
  6. 26 Jun, 2012 1 commit
  7. 20 Jun, 2012 1 commit
  8. 19 Jun, 2012 1 commit
    • M
      batman-adv: Add get_ethtool_stats() support · f8214865
      Committed by Martin Hundebøll
      Added additional counters in a bat_stats structure, which are exported
      through the ethtool API. The counters are specific to batman-adv and
      include:
       forwarded packets and bytes
       management packets and bytes (aggregated OGMs at this point)
       translation table packets
      
      New counters are added by extending "enum bat_counters" in types.h and
      adding corresponding descriptive string(s) to bat_counters_strings in
      soft-iface.c.
      
      Counters are increased by calling batadv_add_counter() and incremented
      by one by calling batadv_inc_counter().
      Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      f8214865
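The pattern of an enum indexing a counter array, with a parallel table of descriptive strings, can be sketched as follows. This is a Python illustration only: the counter names and strings below are hypothetical placeholders rather than the actual entries in types.h or soft-iface.c, and `add_counter`, `inc_counter` and `export_stats` merely mimic the roles the commit message gives to batadv_add_counter(), batadv_inc_counter() and the ethtool export.

```python
from enum import IntEnum


class BatCounter(IntEnum):
    # Hypothetical counter indices; a new counter is added before NUM.
    FORWARD = 0
    FORWARD_BYTES = 1
    MGMT_TX = 2
    NUM = 3  # number of counters, used to size the array


# One descriptive string per counter, in the same order as the enum.
COUNTER_STRINGS = ["forward", "forward_bytes", "mgmt_tx"]
assert len(COUNTER_STRINGS) == BatCounter.NUM


def new_stats():
    return [0] * BatCounter.NUM


def add_counter(stats, idx, count):
    """Increase a counter by an arbitrary amount (e.g. a byte count)."""
    stats[idx] += count


def inc_counter(stats, idx):
    """Increase a counter by exactly one (e.g. a packet count)."""
    add_counter(stats, idx, 1)


def export_stats(stats):
    """Pair each value with its string, as a statistics dump would."""
    return dict(zip(COUNTER_STRINGS, stats))
```

Keeping the strings in a separate table means the export code never changes when a counter is added; only the enum and the table grow.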
  9. 13 Jun, 2012 1 commit
    • T
      ipv4: Add interface option to enable routing of 127.0.0.0/8 · d0daebc3
      Committed by Thomas Graf
      Routing of 127/8 is traditionally forbidden: we consider
      packets from that address block martian when routing and do
      not process corresponding ARP requests.
      
      This is a sane default but renders a huge address space
      practically unusable.
      
      The RFC states that no address within the 127/8 block should
      ever appear on any network anywhere, but it does not specifically
      forbid the use of such addresses outside of the loopback device,
      for example to address a pool of virtual guests
      behind a load balancer.
      
      This patch adds a new interface option 'route_localnet'
      enabling routing of the 127/8 address block and processing
      of ARP requests on a specific interface.
      
      Note that for the feature to work, the default local route
      covering 127/8 dev lo needs to be removed.
      
      Example:
        $ sysctl -w net.ipv4.conf.eth0.route_localnet=1
        $ ip route del 127.0.0.0/8 dev lo table local
        $ ip addr add 127.1.0.1/16 dev eth0
        $ ip route flush cache
      
      V2: Fix invalid check to auto flush cache (thanks davem)
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0daebc3
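The effect of the per-interface option on the martian check can be modeled in a few lines. This is a simplified Python sketch, assuming the decision depends only on whether the address lies in 127.0.0.0/8 and on the interface's route_localnet setting; `is_martian_loopback` is a hypothetical helper, and the kernel's real martian checks cover more cases than this.

```python
import ipaddress

LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def is_martian_loopback(addr, route_localnet=False):
    """Return True if `addr` is rejected because it is in 127/8.

    With route_localnet disabled (the default), any 127/8 address seen
    while routing is treated as martian. With it enabled on the receiving
    interface, 127/8 addresses are allowed through.
    """
    in_loopback = ipaddress.ip_address(addr) in LOOPBACK_NET
    return in_loopback and not route_localnet
```

For the example above, 127.1.0.1 arriving on eth0 would be rejected by default and accepted once route_localnet is set to 1 on eth0.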
  10. 07 Jun, 2012 1 commit
  11. 24 May, 2012 1 commit
  12. 18 May, 2012 1 commit
    • P
      drivers/net: delete all code/drivers depending on CONFIG_MCA · a5e371f6
      Committed by Paul Gortmaker
      The support for CONFIG_MCA is being removed, since the 20
      year old hardware simply isn't capable of meeting today's
      software demands on CPU and memory resources.
      
      This commit removes any MCA specific net drivers, and removes
      any MCA specific probe/support code from drivers that were
      doing a dual ISA/MCA role.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      a5e371f6
  13. 17 May, 2012 1 commit
  14. 16 May, 2012 1 commit
    • P
      tokenring: delete all remaining driver support · ee446fd5
      Committed by Paul Gortmaker
      This represents the mass deletion of the tokenring support.
      
      It gets rid of:
        - the net/tr.c which the drivers depended on
        - the drivers/net component
        - the Kbuild infrastructure around it
        - any tokenring related CONFIG_ settings in any defconfigs
        - the tokenring headers in the include/linux dir
        - the firmware associated with the tokenring drivers.
        - any associated token ring documentation.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      ee446fd5
  15. 14 May, 2012 1 commit
    • S
      batman-adv: README cleanups · a77e8c61
      Committed by Sven Eckelmann
      - Add routing_algo
      
      - Remove date from README:
      The date has to be updated whenever a patch touches the README, so nearly
      every feature modifies it. Quite often more than one feature is in
      development or waiting on the mailing list at the same time, which
      creates merge conflicts when applying a patchset.
      
      The date itself doesn't provide any additional information when this file is
      only available in a release tarball or as part of a SCM repository.
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Antonio Quartulli <ordex@autistici.org>
      a77e8c61
  16. 09 May, 2012 1 commit
  17. 03 May, 2012 2 commits
    • E
      tcp: change tcp_adv_win_scale and tcp_rmem[2] · b49960a0
      Committed by Eric Dumazet
      tcp_adv_win_scale default value is 2, meaning we expect a good citizen
      skb to have skb->len / skb->truesize ratio of 75% (3/4)
      
      In 2.6 kernels we (mis)accounted for typical MSS=1460 frame :
      1536 + 64 + 256 = 1856 'estimated truesize', and 1856 * 3/4 = 1392.
      So these skbs were considered as not bloated.
      
      With recent truesize fixes, the truesize of a typical MSS=1460 frame is
      now the more precise:
      2048 + 256 = 2304. But 2304 * 3/4 = 1728.
      So these skbs are no longer good citizens, because 1460 < 1728.
      
      (GRO can escape this problem because it builds skbs with a too low
      truesize.)
      
      This also means tcp advertises a too optimistic window for a given
      allocated rcvspace : When receiving frames, sk_rmem_alloc can hit
      sk_rcvbuf limit and we call tcp_prune_queue()/tcp_collapse() too often,
      especially when application is slow to drain its receive queue or in
      case of losses (netperf is fast, scp is slow). This is a major latency
      source.
      
      We should adjust the len/truesize ratio to 50% instead of 75%
      
      This patch :
      
      1) changes tcp_adv_win_scale default to 1 instead of 2
      
      2) increases the tcp_rmem[2] limit from 4MB to 6MB to take into account
      better truesize tracking and to allow the autotuned tcp receive window to
      reach the same value as before. Note that the same amount of kernel
      memory is consumed compared to 2.6 kernels.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b49960a0
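The arithmetic in this commit message can be checked with a short sketch. The function below is a Python model of how a tcp_adv_win_scale value maps buffer space to advertised window (a positive scale n keeps a fraction 1 - 1/2^n of the space, which matches the 75% and 50% figures quoted above); the name `win_from_space` and the helper `is_good_citizen` are illustrative and not taken from the commit.

```python
def win_from_space(space, adv_win_scale):
    """Window advertised for `space` bytes of buffer at a given scale."""
    if adv_win_scale <= 0:
        return space >> -adv_win_scale
    return space - (space >> adv_win_scale)


def is_good_citizen(payload_len, truesize, adv_win_scale):
    """True if the payload fills at least the expected share of truesize."""
    return payload_len >= win_from_space(truesize, adv_win_scale)


MB = 1 << 20

# 2.6-era estimate: 1856 * 3/4 = 1392 <= 1460, so the skb looked fine.
old_estimate_ok = is_good_citizen(1460, 1856, 2)
# Precise truesize with the old scale: 2304 * 3/4 = 1728 > 1460.
precise_with_old_scale_ok = is_good_citizen(1460, 2304, 2)
# Precise truesize with the new default scale: 2304 / 2 = 1152 <= 1460.
precise_with_new_scale_ok = is_good_citizen(1460, 2304, 1)

# Maximum autotuned window before and after the change is the same 3 MB.
old_max_window = win_from_space(4 * MB, 2)
new_max_window = win_from_space(6 * MB, 1)
```

The last two values show why tcp_rmem[2] grows from 4MB to 6MB: at a 50% ratio, 6MB of buffer yields the same 3MB maximum window that 4MB gave at 75%.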
    • Y
      tcp: early retransmit · eed530b6
      Committed by Yuchung Cheng
      This patch implements RFC 5827 early retransmit (ER) for TCP.
      It reduces DUPACK threshold (dupthresh) if outstanding packets are
      less than 4 to recover losses by fast recovery instead of timeout.
      
      While the algorithm is simple, small but frequent network reordering
      makes this feature dangerous: the connection repeatedly enters
      false recovery and performance degrades. Therefore we implement
      a mitigation suggested in the appendix of the RFC that delays
      entering fast recovery by a small interval, i.e., RTT/4. Currently
      ER is conservative and is disabled for the rest of the connection
      after the first reordering event. A large scale web server
      experiment on the performance impact of ER is summarized in
      section 6 of the paper "Proportional Rate Reduction for TCP",
      IMC 2011. http://conferences.sigcomm.org/imc/2011/docs/p155.pdf
      
      Note that Linux has a similar feature called THIN_DUPACK. The
      differences are that THIN_DUPACK does not mitigate reordering and is
      only used after slow start. Currently ER is disabled if THIN_DUPACK is
      enabled. I would be happy to merge the THIN_DUPACK feature with ER if
      people think it's a good idea.
      
      ER is enabled by sysctl_tcp_early_retrans:
        0: Disables ER
      
        1: Reduce dupthresh to packets_out - 1 when outstanding packets < 4.
      
        2: (Default) reduce dupthresh like mode 1. In addition, delay
           entering fast recovery by RTT/4.
      
      Note: mode 2 is implemented in the third part of this patch series.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eed530b6
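The three sysctl modes listed above can be summarized in a small sketch. This is an illustrative Python model, not the kernel logic: `er_dupthresh` and `er_delay` are hypothetical names, the standard threshold of 3 duplicate ACKs is assumed as the fallback, and the sketch assumes at least two packets must be outstanding, since a single lost packet produces no duplicate ACKs at all.

```python
STANDARD_DUPTHRESH = 3  # usual duplicate-ACK threshold for fast retransmit


def er_dupthresh(packets_out, mode):
    """Duplicate-ACK threshold for a given early-retransmit mode."""
    if mode in (1, 2) and 2 <= packets_out < 4:
        return packets_out - 1
    return STANDARD_DUPTHRESH


def er_delay(rtt, mode):
    """Extra delay before entering fast recovery (same unit as `rtt`)."""
    return rtt / 4 if mode == 2 else 0.0
```

For instance, with three packets outstanding and mode 2, two duplicate ACKs are enough, but recovery is postponed by a quarter of the RTT so that mild reordering has a chance to resolve itself.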
  18. 27 Apr, 2012 1 commit
  19. 16 Apr, 2012 1 commit
  20. 11 Apr, 2012 3 commits
    • S
    • S
      batman-adv: add basic bridge loop avoidance code · 23721387
      Committed by Simon Wunderlich
      This second version of the bridge loop avoidance for batman-adv
      avoids loops between the mesh and a backbone (usually a LAN).
      
      By connecting multiple batman-adv mesh nodes to the same ethernet
      segment a loop can be created when the soft-interface is bridged
      into that ethernet segment. A simple visualization of the loop
      involving the most common case - a LAN as ethernet segment:
      
      node1  <-- LAN  -->  node2
        |                   |
      wifi   <-- mesh -->  wifi
      
      Packets from the LAN (e.g. ARP broadcasts) will circle forever from
      node1 or node2 over the mesh back into the LAN.
      
      With this patch, batman recognizes backbone gateways, nodes which are
      part of the mesh and backbone/LAN at the same time. Each backbone
      gateway "claims" clients from within the mesh to handle them
      exclusively. By ensuring that only the responsible backbone gateway
      may handle the traffic of its claimed clients, loops are effectively
      avoided.
      Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
      Signed-off-by: Antonio Quartulli <ordex@autistici.org>
      23721387
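The claiming idea can be pictured with a toy claim table. This Python sketch is only a model of the principle stated in the commit message: `ClaimTable`, `claim` and `may_forward` are hypothetical names, the first-claimer-wins rule is an assumption made for the illustration, and the actual claim frames and conflict handling in batman-adv are not modeled.

```python
class ClaimTable:
    """Toy mapping from mesh client to the backbone gateway that claimed it."""

    def __init__(self):
        self._owner = {}

    def claim(self, client, gateway):
        """Record a claim; returns True if `gateway` owns the client afterwards.

        Assumption for this sketch: the first gateway to claim a client keeps it.
        """
        return self._owner.setdefault(client, gateway) == gateway

    def may_forward(self, client, gateway):
        """Only the responsible gateway may pass this client's traffic
        between the mesh and the LAN; every other gateway must drop it."""
        return self._owner.get(client) == gateway
```

In the diagram above, if node1 has claimed a wifi client, node2 refuses to bridge that client's broadcasts onto the LAN, so the frame cannot circle back through the mesh.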
    • J
      mac80211: set HT channel before association · 24398e39
      Committed by Johannes Berg
      Changing the channel type during operation is
      confusing to some drivers and will be hard to
      handle in multi-channel scenarios. Instead of
      changing the channel, set it to the right HT
      channel before authenticating/associating and
      don't change it -- just update the 20/40 MHz
      restrictions in rate control as needed when
      changed by the AP.
      
      This also fixes a problem that Paul missed in
      his fix for the "regulatory makes us deaf"
      issue -- when we couldn't use 40 MHz we still
      associated saying we were using 40 MHz, which
      could in similarly broken APs make us never
      even connect successfully.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      24398e39
  21. 06 Apr, 2012 5 commits
  22. 05 Apr, 2012 3 commits
  23. 04 Apr, 2012 1 commit
  24. 31 Mar, 2012 2 commits
  25. 14 Mar, 2012 1 commit
  26. 13 Mar, 2012 1 commit
  27. 07 Mar, 2012 1 commit
  28. 05 Mar, 2012 1 commit