1. 10 Apr 2008, 1 commit
    • [Syncookies]: Add support for TCP options via timestamps. · 4dfc2817
      Committed by Florian Westphal
      Allow the use of SACK and window scaling when syncookies are used
      and the client supports tcp timestamps. Options are encoded into
      the timestamp sent in the syn-ack and restored from the timestamp
      echo when the ack is received.
      
      Based on earlier work by Glenn Griffin.
      This patch avoids increasing the size of structs by encoding TCP
      options into the least significant bits of the timestamp and
      by not using any 'timestamp offset'.
      
      The downside is that the timestamp sent in the packet after the synack
      will increase by several seconds.
      
      changes since v1:
       don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c
       and have ipv6/syncookies.c use it.
       Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if ()
      Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4dfc2817
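      For illustration, a minimal standalone sketch of the encoding idea
      described above; the field widths, masks and function names here are
      made up for the sketch and are not the kernel's actual syncookies code:

      #include <stdbool.h>
      #include <stdint.h>

      #define TS_OPT_BITS  6u                       /* low bits carry options   */
      #define TS_OPT_MASK  ((1u << TS_OPT_BITS) - 1)
      #define TS_WSCALE    0x0fu                    /* 4 bits of window scale   */
      #define TS_SACK_OK   (1u << 4)                /* peer sent SACK-permitted */

      /* Build the tsval for the SYN-ACK: keep the clock in the high bits
       * (its low bits are dropped) and store the options in the freed bits. */
      static uint32_t cookie_ts_encode(uint32_t now, uint8_t wscale, bool sack_ok)
      {
              uint32_t opts = (wscale & TS_WSCALE) | (sack_ok ? TS_SACK_OK : 0);

              return (now & ~TS_OPT_MASK) | opts;
      }

      /* Recover the options from the timestamp echo in the final ACK. */
      static void cookie_ts_decode(uint32_t tsecr, uint8_t *wscale, bool *sack_ok)
      {
              *wscale  = tsecr & TS_WSCALE;
              *sack_ok = (tsecr & TS_SACK_OK) != 0;
      }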
  2. 24 Mar 2008, 1 commit
  3. 22 Mar 2008, 1 commit
    • [TCP]: TCP_DEFER_ACCEPT updates - process as established · ec3c0982
      Committed by Patrick McManus
      Change the TCP_DEFER_ACCEPT implementation so that it transitions a
      connection to ESTABLISHED after the handshake is complete instead of
      leaving it in SYN-RECV until some data arrives. The connection is
      placed in the accept queue when the first data packet arrives via the
      slow path.
      
      Benefits:
        - an established connection is now reset if it never makes it
          to the accept queue

        - the ESTABLISHED diagnostic state matches the packet traces
          showing a completed handshake

        - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
          enforced with reasonable accuracy instead of rounding up to the
          next exponential back-off of the SYN-ACK retry.
      Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec3c0982
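      The option this commit updates is set from userspace; a minimal
      listener sketch (error handling omitted, timeout value illustrative):

      #include <arpa/inet.h>
      #include <netinet/in.h>
      #include <netinet/tcp.h>
      #include <stdint.h>
      #include <string.h>
      #include <sys/socket.h>

      /* With TCP_DEFER_ACCEPT set, accept() does not hand over a connection
       * until data from the client has arrived; the value bounds, in
       * seconds, how long the kernel waits for that first data. */
      int make_deferred_listener(uint16_t port)
      {
              int fd = socket(AF_INET, SOCK_STREAM, 0);
              int defer_secs = 5;
              struct sockaddr_in addr;

              setsockopt(fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                         &defer_secs, sizeof(defer_secs));

              memset(&addr, 0, sizeof(addr));
              addr.sin_family = AF_INET;
              addr.sin_port = htons(port);
              addr.sin_addr.s_addr = htonl(INADDR_ANY);
              bind(fd, (struct sockaddr *)&addr, sizeof(addr));
              listen(fd, 128);
              return fd;
      }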
  4. 21 Mar 2008, 2 commits
  5. 04 Mar 2008, 1 commit
  6. 01 Mar 2008, 1 commit
  7. 29 Jan 2008, 13 commits
    • [TCP]: Uninline tcp_is_cwnd_limited · cea14e0e
      Committed by Ilpo Järvinen
      net/ipv4/tcp_cong.c:
        tcp_reno_cong_avoid |  -65
       1 function changed, 65 bytes removed, diff: -65
      
      net/ipv4/arp.c:
        arp_ignore |   -5
       1 function changed, 5 bytes removed, diff: -5
      
      net/ipv4/tcp_bic.c:
        bictcp_cong_avoid |  -57
       1 function changed, 57 bytes removed, diff: -57
      
      net/ipv4/tcp_cubic.c:
        bictcp_cong_avoid |  -61
       1 function changed, 61 bytes removed, diff: -61
      
      net/ipv4/tcp_highspeed.c:
        hstcp_cong_avoid |  -63
       1 function changed, 63 bytes removed, diff: -63
      
      net/ipv4/tcp_hybla.c:
        hybla_cong_avoid |  -85
       1 function changed, 85 bytes removed, diff: -85
      
      net/ipv4/tcp_htcp.c:
        htcp_cong_avoid |  -57
       1 function changed, 57 bytes removed, diff: -57
      
      net/ipv4/tcp_veno.c:
        tcp_veno_cong_avoid |  -52
       1 function changed, 52 bytes removed, diff: -52
      
      net/ipv4/tcp_scalable.c:
        tcp_scalable_cong_avoid |  -61
       1 function changed, 61 bytes removed, diff: -61
      
      net/ipv4/tcp_yeah.c:
        tcp_yeah_cong_avoid |  -75
       1 function changed, 75 bytes removed, diff: -75
      
      net/ipv4/tcp_illinois.c:
        tcp_illinois_cong_avoid |  -54
       1 function changed, 54 bytes removed, diff: -54
      
      net/dccp/ccids/ccid3.c:
        ccid3_update_send_interval |   -7
        ccid3_hc_tx_packet_recv    |   +7
       2 functions changed, 7 bytes added, 7 bytes removed, diff: +0
      
      net/ipv4/tcp_cong.c:
        tcp_is_cwnd_limited |  +88
       1 function changed, 88 bytes added, diff: +88
      
      built-in.o:
       14 functions changed, 95 bytes added, 642 bytes removed, diff: -547
      
      ...Again, some gcc artifacts are visible as well.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cea14e0e
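      For readers unfamiliar with the term, uninlining simply moves the
      helper body out of the header so all callers share one copy. A generic
      before/after sketch with made-up names, not the actual kernel change:

      /* cwnd.h -- before, every caller inlined its own copy of the body:
       *
       *   static inline int is_cwnd_limited(unsigned int in_flight,
       *                                     unsigned int cwnd)
       *   { return in_flight >= cwnd; }
       *
       * cwnd.h -- after, callers only see a declaration: */
      int is_cwnd_limited(unsigned int in_flight, unsigned int cwnd);

      /* cwnd.c -- after, one shared out-of-line definition; the per-caller
       * copies disappear, which is what the negative deltas above measure. */
      int is_cwnd_limited(unsigned int in_flight, unsigned int cwnd)
      {
              return in_flight >= cwnd;
      }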
    • [TCP]: Uninline tcp_set_state · 490d5046
      Committed by Ilpo Järvinen
      net/ipv4/tcp.c:
        tcp_close_state | -226
        tcp_done        | -145
        tcp_close       | -564
        tcp_disconnect  | -141
       4 functions changed, 1076 bytes removed, diff: -1076
      
      net/ipv4/tcp_input.c:
        tcp_fin               |  -86
        tcp_rcv_state_process | -164
       2 functions changed, 250 bytes removed, diff: -250
      
      net/ipv4/tcp_ipv4.c:
        tcp_v4_connect | -209
       1 function changed, 209 bytes removed, diff: -209
      
      net/ipv4/arp.c:
        arp_ignore |   +5
       1 function changed, 5 bytes added, diff: +5
      
      net/ipv6/tcp_ipv6.c:
        tcp_v6_connect | -158
       1 function changed, 158 bytes removed, diff: -158
      
      net/sunrpc/xprtsock.c:
        xs_sendpages |   -2
       1 function changed, 2 bytes removed, diff: -2
      
      net/dccp/ccids/ccid3.c:
        ccid3_update_send_interval |   +7
       1 function changed, 7 bytes added, diff: +7
      
      net/ipv4/tcp.c:
        tcp_set_state | +238
       1 function changed, 238 bytes added, diff: +238
      
      built-in.o:
       12 functions changed, 250 bytes added, 1695 bytes removed, diff: -1445
      
      I've no explanation for why some unrelated changes seem to occur
      consistently as well (arp_ignore, ccid3_update_send_interval;
      I checked the arp_ignore asm and it seems to be due to some
      reordering of operations causing some extra opcodes to be
      generated). Still, the benefits are pretty obvious from
      codiff's results.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      490d5046
    • [TCP]: Remove TCPCB_URG & TCPCB_AT_TAIL as unnecessary · 4828e7f4
      Committed by Ilpo Järvinen
      The snd_up check should be enough. I suspect these flags have been
      there to provide a minor optimization in clean_rtx_queue, which
      used to have a small if (!->sacked) block that could skip the
      snd_up check among the other work.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4828e7f4
    • I
    • [NET] CORE: Introducing new memory accounting interface. · 3ab224be
      Committed by Hideo Aoki
      This patch introduces new memory accounting functions for each network
      protocol. Most of them are renamed from the memory accounting functions
      for stream protocols. At the same time, some stream memory accounting
      functions are removed since other functions do the same thing.
      
      Renaming:
      	sk_stream_free_skb()		->	sk_wmem_free_skb()
      	__sk_stream_mem_reclaim()	->	__sk_mem_reclaim()
      	sk_stream_mem_reclaim()		->	sk_mem_reclaim()
      	sk_stream_mem_schedule()	->	__sk_mem_schedule()
      	sk_stream_pages()      		->	sk_mem_pages()
      	sk_stream_rmem_schedule()	->	sk_rmem_schedule()
      	sk_stream_wmem_schedule()	->	sk_wmem_schedule()
      	sk_charge_skb()			->	sk_mem_charge()
      
      Removing:
      	sk_stream_rfree():	consolidates into sock_rfree()
      	sk_stream_set_owner_r(): consolidates into skb_set_owner_r()
      	sk_stream_mem_schedule()
      
      The following functions are added:
      	sk_has_account():	check if the protocol supports accounting
      	sk_mem_uncharge():	do the opposite of sk_mem_charge()
      
      In addition, to achieve consolidation, updating sk_wmem_queued is
      removed from sk_mem_charge().

      Next, to consolidate the memory accounting functions, this patch adds
      memory accounting calls to network core functions, and the existing
      memory accounting calls are renamed to the new ones.

      Finally, the existing memory accounting calls in TCP and SCTP are
      replaced with the new interface.
      Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
      Signed-off-by: Hideo Aoki <haoki@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3ab224be
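      To make the renamed interface concrete, here is a rough pseudo-kernel
      sketch of how a protocol's send path might use it. Only helpers named
      in the commit above appear, but the functions themselves are invented
      for illustration and are not copies of real call sites:

      /* Hypothetical send-path usage; not taken from any real kernel file. */
      static int example_queue_for_send(struct sock *sk, struct sk_buff *skb)
      {
              /* Ask the shared accounting whether this write fits the limits. */
              if (!sk_wmem_schedule(sk, skb->truesize))
                      return -ENOMEM;

              /* Charge the buffer to the socket; sk_mem_charge() no longer
               * updates sk_wmem_queued, per the note above. */
              sk_mem_charge(sk, skb->truesize);
              sk->sk_wmem_queued += skb->truesize;
              __skb_queue_tail(&sk->sk_write_queue, skb);
              return 0;
      }

      static void example_free_acked(struct sock *sk, struct sk_buff *skb)
      {
              /* sk_wmem_free_skb() (formerly sk_stream_free_skb) undoes the
               * charge; sk_mem_reclaim() hands whole unused pages back to
               * the protocol's global pool. */
              sk_wmem_free_skb(sk, skb);
              sk_mem_reclaim(sk);
      }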
    • [TCP]: Convert several length variables to unsigned. · 9cb5734e
      Committed by YOSHIFUJI Hideaki
      Several length variables cannot be negative, so convert int to
      unsigned int.  This also allows us to do sane shift operations
      on those variables.
      Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9cb5734e
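      A tiny userspace illustration of the "sane shift operations" point;
      this is standalone C, not kernel code:

      #include <stdio.h>

      int main(void)
      {
              int          slen = -4;            /* a "length" gone negative   */
              unsigned int ulen = 0xFFFFFFFCu;   /* same bit pattern, unsigned */

              /* Right-shifting a negative signed value is implementation-
               * defined (usually an arithmetic shift); the unsigned shift is
               * a well-defined logical shift. */
              printf("signed   -4 >> 1 = %d\n", slen >> 1);   /* typically -2 */
              printf("unsigned    >> 1 = %u\n", ulen >> 1);   /* 0x7FFFFFFE   */
              return 0;
      }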
    • [TCP]: Abstract tp->highest_sack accessing & point to next skb · 6859d494
      Committed by Ilpo Järvinen
      Pointing to the next skb is necessary to avoid referencing
      already SACKed skbs which will soon be on a separate list.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6859d494
    • I
    • I
    • [TCP]: Move FRTO checks out from write queue abstraction funcs · 8512430e
      Committed by Ilpo Järvinen
      A better place exists in update_send_head (other non-queue-related
      adjustments are done there as well), which is the only caller of
      tcp_advance_send_head (now that the bogus call from mtu_probe is
      gone).
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8512430e
    • [TCP]: Rewrite SACK block processing & sack_recv_cache use · 68f8353b
      Committed by Ilpo Järvinen
      Key points of this patch are:
      
        - In case the new SACK information is of the advance-only type, no
          skb processing below the previously discovered highest point is
          done
        - Cases below the highest point are optimized too, since there is no
          need to always go up to the highest point (which is very likely
          still present in that SACK); this is not entirely true, though,
          because I'm dropping the fastpath_skb_hint, which could
          previously optimize those cases even better. Whether that's
          significant, I'm not too sure.
      
      Currently, skipping is implemented by walking. Combined with an
      RB-tree, all skipping would become fast too, regardless of window
      size (this can be done incrementally later).
      
      Previously, a number of cases in TCP SACK processing failed to
      take advantage of the costly stored information in sack_recv_cache;
      most importantly, expected events such as cumulative ACKs and new
      hole ACKs. Processing such ACKs results in rather long walks,
      building up latencies (which easily get nasty when the window is
      huge). Those latencies are often completely unnecessary
      compared with the amount of _new_ information received; usually
      a cumulative ACK carries no new information at all, yet TCP
      walks the whole queue unnecessarily, potentially taking a number of
      costly cache misses on the way, etc.!
      
      Since the inclusion of highest_sack, there is a lot of information
      that is very likely redundant (the SACK fastpath hint stuff,
      fackets_out, highest_sack), though there is no ultimate guarantee
      that they will remain the same the whole time (in all unearthly
      scenarios). Take advantage of this knowledge here: drop the
      fastpath hint and use direct access to the highest SACKed skb as
      a replacement.
      
      Effectively "special cased" fastpath is dropped. This change
      adds some complexity to introduce better coveraged "fastpath",
      though the added complexity should make TCP behave more cache
      friendly.
      
      The current ACK's SACK blocks are compared against each cached
      block individually, and only ranges that are new are then scanned
      by the high-constant-cost walk. For other parts of the write queue,
      even when inside a previously known part of the SACK blocks, a
      faster skip function is used (if necessary at all). In addition,
      whenever possible, TCP fast-forwards to the highest_sack skb that
      was made available by an earlier patch. In the typical case, nothing
      but this fast-forward and the mandatory markings after it occurs,
      making the access pattern quite similar to the former fastpath
      "special case".
      
      DSACKs are a special case that must always be walked.
      
      The copying of the local blocks into recv_sack_cache could be more
      intelligent w.r.t. DSACKs, which are likely to be there only once,
      but that is left to a separate patch.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68f8353b
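      A standalone sketch of the recv_sack_cache comparison described above.
      All names are illustrative (this is not tcp_sacktag_write_queue()),
      and sequence-number wrap-around is ignored for brevity:

      #include <stdint.h>
      #include <stdio.h>

      struct sack_block { uint32_t start_seq, end_seq; };

      /* Stand-in for the expensive per-skb tagging walk. */
      static void walk_and_tag(uint32_t start, uint32_t end)
      {
              printf("walk new range [%u, %u)\n", start, end);
      }

      /* Walk only the parts of the incoming block that the cached blocks
       * from the previous ACK did not already cover; skip known ranges. */
      static void sacktag_one_block(struct sack_block cur,
                                    const struct sack_block *cache, int ncache)
      {
              uint32_t start = cur.start_seq;
              int i;

              for (i = 0; i < ncache && start < cur.end_seq; i++) {
                      if (cache[i].end_seq <= start ||
                          cache[i].start_seq >= cur.end_seq)
                              continue;                 /* no overlap, ignore */
                      if (start < cache[i].start_seq)
                              walk_and_tag(start, cache[i].start_seq);
                      if (cache[i].end_seq > start)
                              start = cache[i].end_seq; /* skip known part */
              }
              if (start < cur.end_seq)
                      walk_and_tag(start, cur.end_seq); /* trailing new part */
      }

      int main(void)
      {
              struct sack_block cache[] = { { 100, 200 }, { 300, 400 } };
              struct sack_block cur = { 150, 450 };

              /* Only [200,300) and [400,450) are new and get walked. */
              sacktag_one_block(cur, cache, 2);
              return 0;
      }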
    • [TCP]: Convert highest_sack to sk_buff to allow direct access · a47e5a98
      Committed by Ilpo Järvinen
      It is going to replace the sack fastpath hint quite soon... :-)
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a47e5a98
    • [TCP]: Splice receive support. · 9c55e01c
      Committed by Jens Axboe
      Support for network splice receive.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9c55e01c
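      From userspace, the feature is consumed through splice(2); a rough
      sketch that drains a connected TCP socket into another fd via a pipe
      (error handling trimmed):

      #define _GNU_SOURCE
      #include <fcntl.h>
      #include <unistd.h>

      /* Move up to len bytes from a connected TCP socket to out_fd without
       * copying the data through userspace buffers. */
      ssize_t drain_socket_to_fd(int sock, int out_fd, size_t len)
      {
              int pipefd[2];
              ssize_t n, total = 0;

              if (pipe(pipefd) < 0)
                      return -1;
              while ((size_t)total < len) {
                      /* Socket receive queue -> pipe... */
                      n = splice(sock, NULL, pipefd[1], NULL, len - total,
                                 SPLICE_F_MOVE | SPLICE_F_MORE);
                      if (n <= 0)
                              break;
                      /* ...pipe -> destination (file, another socket, ...). */
                      splice(pipefd[0], NULL, out_fd, NULL, n, SPLICE_F_MOVE);
                      total += n;
              }
              close(pipefd[0]);
              close(pipefd[1]);
              return total;
      }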
  8. 20 Nov 2007, 1 commit
  9. 24 Oct 2007, 1 commit
  10. 11 Oct 2007, 13 commits
  11. 29 Sep 2007, 1 commit
    • [TCP]: Fix MD5 signature handling on big-endian. · f8ab18d2
      Committed by David S. Miller
      Based upon a report and initial patch by Peter Lieven.
      
      tcp4_md5sig_key and tcp6_md5sig_key need to start with
      the exact same members as tcp_md5sig_key, because they
      are both cast to that type by tcp_v{4,6}_md5_do_lookup().

      Unfortunately, tcp{4,6}_md5sig_key use a u16 for the key
      length instead of a u8, which is what tcp_md5sig_key
      uses. This just so happens to work by accident on
      little-endian, but on big-endian it doesn't.
      
      Instead of casting, just place tcp_md5sig_key as the first member of
      the address-family specific structures, adjust the access sites, and
      kill off the ugly casts.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8ab18d2
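      A standalone illustration of the bug class described above, with
      simplified stand-in structs rather than the kernel's: reading a u16
      field through a struct that declares it as u8 happens to work on
      little-endian but not on big-endian:

      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      struct generic_key { uint8_t *key; uint8_t  keylen; };  /* what lookup expects */
      struct family_key  { uint8_t *key; uint16_t keylen; };  /* what was stored     */

      int main(void)
      {
              struct family_key fk = { .key = NULL, .keylen = 16 };
              struct generic_key gk;

              /* Mimic the old cast: reinterpret one struct's bytes as the other. */
              memcpy(&gk, &fk, sizeof(gk));
              /* Little-endian prints 16 (by accident); big-endian prints 0. */
              printf("keylen seen through the cast: %u\n", (unsigned)gk.keylen);
              return 0;
      }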
  12. 03 Aug 2007, 1 commit
    • [TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). · 3516ffb0
      Committed by David S. Miller
      As discovered by Evgeniy Polyakov, if we try to sendmsg after
      a connection reset, we can do incredibly stupid things.
      
      The core issue is that inet_sendmsg() tries to autobind the
      socket, but we should never do that for TCP.  Instead we should
      just go straight into TCP's sendmsg() code which will do all
      of the necessary state and pending socket error checks.
      
      TCP's sendpage already directly vectors to tcp_sendpage(), so this
      merely brings sendmsg() in line with that.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3516ffb0
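      Roughly, the change amounts to pointing the AF_INET stream ops'
      sendmsg entry directly at TCP's handler, the way sendpage already
      points at tcp_sendpage. A sketch of that shape, with unrelated fields
      elided (not the exact diff):

      const struct proto_ops inet_stream_ops = {
              .family   = PF_INET,
              /* ... */
              .sendmsg  = tcp_sendmsg,     /* was: inet_sendmsg           */
              .sendpage = tcp_sendpage,    /* already vectored directly   */
              /* ... */
      };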
  13. 31 Jul 2007, 1 commit
  14. 18 Jul 2007, 1 commit
  15. 31 May 2007, 1 commit