1. 25 3月, 2008 1 次提交
  2. 23 3月, 2008 1 次提交
    • P
      [SOCK]: Add udp_hash member to struct proto. · 39d8cda7
      Pavel Emelyanov 提交于
      Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to 
      struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 
      and -v6 protocols.
      
      The result is not that exciting, but it removes some levels of
      indirection in udpxxx_get_port and saves some space in code and text.
      
      The first step is to union existing hashinfo and new udp_hash on the
      struct proto and give a name to this union, since future initialization 
      of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc 
      warning about inability to initialize anonymous member this way.
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39d8cda7
  3. 22 3月, 2008 2 次提交
    • P
      [TCP]: TCP_DEFER_ACCEPT updates - process as established · ec3c0982
      Patrick McManus 提交于
      Change TCP_DEFER_ACCEPT implementation so that it transitions a
      connection to ESTABLISHED after handshake is complete instead of
      leaving it in SYN-RECV until some data arrvies. Place connection in
      accept queue when first data packet arrives from slow path.
      
      Benefits:
        - established connection is now reset if it never makes it
         to the accept queue
      
       - diagnostic state of established matches with the packet traces
         showing completed handshake
      
       - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
         enforced with reasonable accuracy instead of rounding up to next
         exponential back-off of syn-ack retry.
      Signed-off-by: NPatrick McManus <mcmanus@ducksong.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec3c0982
    • P
      [NET]: NULL pointer dereference and other nasty things in /proc/net/(tcp|udp)[6] · 28518fc1
      Pavel Emelyanov 提交于
      Commits f40c81 ([NETNS][IPV4] tcp - make proc handle the network
      namespaces) and a91275 ([NETNS][IPV6] udp - make proc handle the
      network namespace) both introduced bad checks on sockets and tw
      buckets to belong to proper net namespace.
      
      I.e. when checking for socket to belong to given net and family the
      
      	do {
      		sk = sk_next(sk);
      	} while (sk && sk->sk_net != net && sk->sk_family != family);
      
      constructions were used. This is wrong, since as soon as the
      sk->sk_net fits the net the socket is immediately returned, even if it
      belongs to other family.
      
      As the result four /proc/net/(udp|tcp)[6] entries show wrong info.
      The udp6 entry even oopses when dereferencing inet6_sk(sk) pointer:
      
      static void udp6_sock_seq_show(struct seq_file *seq, struct sock *sp, int bucket)
      {
      	...
              struct ipv6_pinfo *np = inet6_sk(sp);
      	...
      
              dest  = &np->daddr; /* will be NULL for AF_INET sockets */
      	...
      	seq_printf(...
      	           dest->s6_addr32[0], dest->s6_addr32[1],
                         dest->s6_addr32[2], dest->s6_addr32[3],
      	...
      
      Fix it by converting && to ||.
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Acked-by: NDaniel Lezcano <dlezcano@fr.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      28518fc1
  4. 21 3月, 2008 2 次提交
  5. 06 3月, 2008 1 次提交
  6. 04 3月, 2008 1 次提交
  7. 01 3月, 2008 2 次提交
  8. 18 2月, 2008 1 次提交
  9. 03 2月, 2008 1 次提交
    • A
      [SOCK] proto: Add hashinfo member to struct proto · ab1e0a13
      Arnaldo Carvalho de Melo 提交于
      This way we can remove TCP and DCCP specific versions of
      
      sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port
      sk->sk_prot->hash:     inet_hash is directly used, only v6 need
                             a specific version to deal with mapped sockets
      sk->sk_prot->unhash:   both v4 and v6 use inet_hash directly
      
      struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so
      that inet_csk_get_port can find the per family routine.
      
      Now only the lookup routines receive as a parameter a struct inet_hashtable.
      
      With this we further reuse code, reducing the difference among INET transport
      protocols.
      
      Eventually work has to be done on UDP and SCTP to make them share this
      infrastructure and get as a bonus inet_diag interfaces so that iproute can be
      used with these protocols.
      
      net-2.6/net/ipv4/inet_hashtables.c:
        struct proto			     |   +8
        struct inet_connection_sock_af_ops |   +8
       2 structs changed
        __inet_hash_nolisten               |  +18
        __inet_hash                        | -210
        inet_put_port                      |   +8
        inet_bind_bucket_create            |   +1
        __inet_hash_connect                |   -8
       5 functions changed, 27 bytes added, 218 bytes removed, diff: -191
      
      net-2.6/net/core/sock.c:
        proto_seq_show                     |   +3
       1 function changed, 3 bytes added, diff: +3
      
      net-2.6/net/ipv4/inet_connection_sock.c:
        inet_csk_get_port                  |  +15
       1 function changed, 15 bytes added, diff: +15
      
      net-2.6/net/ipv4/tcp.c:
        tcp_set_state                      |   -7
       1 function changed, 7 bytes removed, diff: -7
      
      net-2.6/net/ipv4/tcp_ipv4.c:
        tcp_v4_get_port                    |  -31
        tcp_v4_hash                        |  -48
        tcp_v4_destroy_sock                |   -7
        tcp_v4_syn_recv_sock               |   -2
        tcp_unhash                         | -179
       5 functions changed, 267 bytes removed, diff: -267
      
      net-2.6/net/ipv6/inet6_hashtables.c:
        __inet6_hash |   +8
       1 function changed, 8 bytes added, diff: +8
      
      net-2.6/net/ipv4/inet_hashtables.c:
        inet_unhash                        | +190
        inet_hash                          | +242
       2 functions changed, 432 bytes added, diff: +432
      
      vmlinux:
       16 functions changed, 485 bytes added, 492 bytes removed, diff: -7
      
      /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c:
        tcp_v6_get_port                    |  -31
        tcp_v6_hash                        |   -7
        tcp_v6_syn_recv_sock               |   -9
       3 functions changed, 47 bytes removed, diff: -47
      
      /home/acme/git/net-2.6/net/dccp/proto.c:
        dccp_destroy_sock                  |   -7
        dccp_unhash                        | -179
        dccp_hash                          |  -49
        dccp_set_state                     |   -7
        dccp_done                          |   +1
       5 functions changed, 1 bytes added, 242 bytes removed, diff: -241
      
      /home/acme/git/net-2.6/net/dccp/ipv4.c:
        dccp_v4_get_port                   |  -31
        dccp_v4_request_recv_sock          |   -2
       2 functions changed, 33 bytes removed, diff: -33
      
      /home/acme/git/net-2.6/net/dccp/ipv6.c:
        dccp_v6_get_port                   |  -31
        dccp_v6_hash                       |   -7
        dccp_v6_request_recv_sock          |   +5
       3 functions changed, 5 bytes added, 38 bytes removed, diff: -33
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ab1e0a13
  10. 01 2月, 2008 1 次提交
  11. 29 1月, 2008 2 次提交
  12. 21 11月, 2007 2 次提交
  13. 07 11月, 2007 2 次提交
  14. 02 11月, 2007 1 次提交
    • J
      [SG] Get rid of __sg_mark_end() · c46f2334
      Jens Axboe 提交于
      sg_mark_end() overwrites the page_link information, but all users want
      __sg_mark_end() behaviour where we just set the end bit. That is the most
      natural way to use the sg list, since you'll fill it in and then mark the
      end point.
      
      So change sg_mark_end() to only set the termination bit. Add a sg_magic
      debug check as well, and clear a chain pointer if it is set.
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      c46f2334
  15. 31 10月, 2007 1 次提交
    • D
      [NET]: Fix incorrect sg_mark_end() calls. · 51c739d1
      David S. Miller 提交于
      This fixes scatterlist corruptions added by
      
      	commit 68e3f5dd
      	[CRYPTO] users: Fix up scatterlist conversion errors
      
      The issue is that the code calls sg_mark_end() which clobbers the
      sg_page() pointer of the final scatterlist entry.
      
      The first part fo the fix makes skb_to_sgvec() do __sg_mark_end().
      
      After considering all skb_to_sgvec() call sites the most correct
      solution is to call __sg_mark_end() in skb_to_sgvec() since that is
      what all of the callers would end up doing anyways.
      
      I suspect this might have fixed some problems in virtio_net which is
      the sole non-crypto user of skb_to_sgvec().
      
      Other similar sg_mark_end() cases were converted over to
      __sg_mark_end() as well.
      
      Arguably sg_mark_end() is a poorly named function because it doesn't
      just "mark", it clears out the page pointer as a side effect, which is
      what led to these bugs in the first place.
      
      The one remaining plain sg_mark_end() call is in scsi_alloc_sgtable()
      and arguably it could be converted to __sg_mark_end() if only so that
      we can delete this confusing interface from linux/scatterlist.h
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      51c739d1
  16. 30 10月, 2007 1 次提交
  17. 26 10月, 2007 1 次提交
  18. 11 10月, 2007 2 次提交
    • S
      [INET]: local port range robustness · 227b60f5
      Stephen Hemminger 提交于
      Expansion of original idea from Denis V. Lunev <den@openvz.org>
      
      Add robustness and locking to the local_port_range sysctl.
      1. Enforce that low < high when setting.
      2. Use seqlock to ensure atomic update.
      
      The locking might seem like overkill, but there are
      cases where sysadmin might want to change value in the
      middle of a DoS attack.
      Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      227b60f5
    • E
      [NET]: Make /proc/net per network namespace · 457c4cbc
      Eric W. Biederman 提交于
      This patch makes /proc/net per network namespace.  It modifies the global
      variables proc_net and proc_net_stat to be per network namespace.
      The proc_net file helpers are modified to take a network namespace argument,
      and all of their callers are fixed to pass &init_net for that argument.
      This ensures that all of the /proc/net files are only visible and
      usable in the initial network namespace until the code behind them
      has been updated to be handle multiple network namespaces.
      
      Making /proc/net per namespace is necessary as at least some files
      in /proc/net depend upon the set of network devices which is per
      network namespace, and even more files in /proc/net have contents
      that are relevant to a single network namespace.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      457c4cbc
  19. 29 9月, 2007 1 次提交
    • D
      [TCP]: Fix MD5 signature handling on big-endian. · f8ab18d2
      David S. Miller 提交于
      Based upon a report and initial patch by Peter Lieven.
      
      tcp4_md5sig_key and tcp6_md5sig_key need to start with
      the exact same members as tcp_md5sig_key.  Because they
      are both cast to that type by tcp_v{4,6}_md5_do_lookup().
      
      Unfortunately tcp{4,6}_md5sig_key use a u16 for the key
      length instead of a u8, which is what tcp_md5sig_key
      uses.  This just so happens to work by accident on
      little-endian, but on big-endian it doesn't.
      
      Instead of casting, just place tcp_md5sig_key as the first member of
      the address-family specific structures, adjust the access sites, and
      kill off the ugly casts.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8ab18d2
  20. 03 8月, 2007 1 次提交
    • D
      [TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). · 3516ffb0
      David S. Miller 提交于
      As discovered by Evegniy Polyakov, if we try to sendmsg after
      a connection reset, we can do incredibly stupid things.
      
      The core issue is that inet_sendmsg() tries to autobind the
      socket, but we should never do that for TCP.  Instead we should
      just go straight into TCP's sendmsg() code which will do all
      of the necessary state and pending socket error checks.
      
      TCP's sendpage already directly vectors to tcp_sendpage(), so this
      merely brings sendmsg() in line with that.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3516ffb0
  21. 11 7月, 2007 1 次提交
    • H
      [TCPv4]: Improve BH latency in /proc/net/tcp · a7ab4b50
      Herbert Xu 提交于
      Currently the code for /proc/net/tcp disable BH while iterating
      over the entire established hash table.  Even though we call
      cond_resched_softirq for each entry, we still won't process
      softirq's as regularly as we would otherwise do which results
      in poor performance when the system is loaded near capacity.
      
      This anomaly comes from the 2.4 code where this was all in a
      single function and the local_bh_disable might have made sense
      as a small optimisation.
      
      The cost of each local_bh_disable is so small when compared
      against the increased latency in keeping it disabled over a
      large but mostly empty TCP established hash table that we
      should just move it to the individual read_lock/read_unlock
      calls as we do in inet_diag.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7ab4b50
  22. 13 6月, 2007 1 次提交
  23. 08 6月, 2007 1 次提交
  24. 04 6月, 2007 1 次提交
  25. 26 4月, 2007 9 次提交