1. 01 Apr, 2008 1 commit
  2. 29 Mar, 2008 1 commit
  3. 27 Mar, 2008 1 commit
    • [IPSEC]: Fix BEET output · 732c8bd5
      Herbert Xu committed
      The IPv6 BEET output function is incorrectly including the inner
      header in the payload to be protected.  This causes a crash as
      the packet doesn't actually have that many bytes for a second
      header.
      
      The IPv4 BEET output on the other hand is broken when it comes
      to handling an inner IPv6 header since it always assumes an
      inner IPv4 header.
      
      This patch fixes both by making sure that neither BEET output
      function touches the inner header at all.  All access is now
      done through the protocol-independent cb structure.  Two new
      attributes are added to make this work, the IP header length
      and the IPv4 option length.  They're filled in by the inner
      mode's output function.
      
      Thanks to Joakim Koskela for finding this problem.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 25 Mar, 2008 2 commits
    • [NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3). · fa86d322
      Pavel Emelyanov committed
      Proxy neighbors do not have any reference counting, so any caller
      of pneigh_lookup (unless it's a netlink triggered add/del routine)
      should _not_ perform any actions on the found proxy entry. 
      
      There's one exception to this rule: ipv6's ndisc_recv_ns()
      uses the found entry to check the flags for NTF_ROUTER.
      
      This creates a race between the ndisc and pneigh_delete - after 
      the pneigh is returned to the caller, the nd_tbl.lock is dropped 
      and the deleting procedure may proceed.
      
      One possible fix would be to add reference counting, but this
      problem exists for ndisc only. Besides, such a patch would be too
      big for -rc4.
      
      So I propose to introduce a __pneigh_lookup(), which is supposed
      to be called with the lock held, and use it in the ndisc code to check
      the flags on a live pneigh entry.
      
      
      Changes from v2:
      As David noticed, exported __pneigh_lookup() to the ipv6 module.
      checkpatch generates a warning on it, since the EXPORT_SYMBOL
      does not follow the symbol itself, but in this file all the
      exports come at the end, so I decided not to break this harmony.
      
      Changes from v1:
      Addressed comments from YOSHIFUJI (indentation of the prototype in the
      header and the pndisc_check_router() name) and added a compilation fix
      pointed out by Daniel: is_routed was (falsely) considered uninitialized
      by gcc.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
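
The locking rule described in this commit (look the entry up and read its flags while the table lock is still held, because the entry has no reference count) can be illustrated in user space. The following is only a minimal sketch under stated assumptions: a pthread mutex stands in for the table lock, and `struct pentry`, `FLAG_ROUTER`, `lookup_locked()`, `add_entry()` and `check_router_flag()` are hypothetical names, not the kernel's code.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define FLAG_ROUTER 0x80u  /* hypothetical stand-in for NTF_ROUTER */

/* Hypothetical proxy entry: note that it carries no reference count. */
struct pentry {
    struct pentry *next;
    char key[16];
    unsigned flags;
};

pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
struct pentry *table_head;

/* Caller must hold table_lock. The returned pointer is only safe to
 * dereference while the lock is still held. */
struct pentry *lookup_locked(const char *key)
{
    for (struct pentry *p = table_head; p != NULL; p = p->next)
        if (strcmp(p->key, key) == 0)
            return p;
    return NULL;
}

/* Link an entry at the head of the table under the lock. */
void add_entry(struct pentry *e)
{
    pthread_mutex_lock(&table_lock);
    e->next = table_head;
    table_head = e;
    pthread_mutex_unlock(&table_lock);
}

/* Returns 1 if the router flag is set, 0 if it is clear, -1 if there is
 * no such entry. The flag is copied out before the lock is dropped, so
 * no pointer to the entry escapes the critical section. */
int check_router_flag(const char *key)
{
    int ret = -1;

    pthread_mutex_lock(&table_lock);
    struct pentry *p = lookup_locked(key);
    if (p != NULL)
        ret = (p->flags & FLAG_ROUTER) ? 1 : 0;
    pthread_mutex_unlock(&table_lock);
    return ret;
}
```

The design point is that `check_router_flag()` returns a value rather than a pointer, so a concurrent delete cannot free memory the caller is still reading.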
  5. 22 Mar, 2008 1 commit
    • [SCTP]: Fix build warnings with IPV6 disabled. · 1233823b
      David S. Miller committed
      Introduced by 270637ab
      ("[SCTP]: Fix a race between module load and protosw access")
      
      Reported by Gabriel C:
      
      In file included from net/sctp/sm_statetable.c:50:
      include/net/sctp/sctp.h: In function 'sctp_v6_pf_init':
      include/net/sctp/sctp.h:392: warning: 'return' with a value, in function returning void
      In file included from net/sctp/sm_statefuns.c:62:
      include/net/sctp/sctp.h: In function 'sctp_v6_pf_init':
      include/net/sctp/sctp.h:392: warning: 'return' with a value, in function returning void
       ...
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 21 Mar, 2008 1 commit
    • [SCTP]: Fix a race between module load and protosw access · 270637ab
      Vlad Yasevich committed
      There is a race in SCTP between the loading of the module
      and the access by the socket layer to the protocol functions.
      In particular, a list of addresses that SCTP maintains is
      not initialized prior to the registration with the protosw.
      Thus it is possible for a user application to gain access
      to SCTP functions before everything has been initialized.
      The problem shows up as odd crashes during connection
      initialization when we try to access the SCTP address list.
      
      The solution is to refactor how we do registration and
      initialize the lists prior to registering with the protosw.
      Care must be taken since the address list initialization
      depends on some other pieces of SCTP initialization.  Also
      the clean-up in case of failure now also needs to be refactored.
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
      Acked-by: Sridhar Samudrala <sri@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
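
The ordering fix described in this commit (initialize the data first, register with the socket layer second, and unwind on failure) follows a common pattern that might be sketched as below. This is an illustration under assumptions, not the SCTP code: `struct addr_list`, `register_proto()`, `proto_module_init()` and `proto_socket_op()` are hypothetical, and the `fail_register` parameter exists only so the failure path can be exercised.

```c
#include <stddef.h>

/* Hypothetical address list; 'ready' records whether it was initialized. */
struct addr_list {
    struct addr_list *next, *prev;
    int ready;
};

static struct addr_list local_addrs;
static int registered;  /* nonzero once user code can reach the protocol */

static void addr_list_init(struct addr_list *l)
{
    l->next = l->prev = l;  /* empty circular list */
    l->ready = 1;
}

static void addr_list_cleanup(struct addr_list *l)
{
    l->next = l->prev = NULL;
    l->ready = 0;
}

/* Stand-in for registration with the socket layer. */
static int register_proto(int fail)
{
    if (fail)
        return -1;
    registered = 1;
    return 0;
}

/* Initialize everything users may touch BEFORE making the protocol
 * visible, and undo the initialization if registration fails. */
int proto_module_init(int fail_register)
{
    int err;

    addr_list_init(&local_addrs);
    err = register_proto(fail_register);
    if (err)
        goto err_cleanup;
    return 0;

err_cleanup:
    addr_list_cleanup(&local_addrs);
    return err;
}

/* What a socket call would do: -1 if the protocol is not reachable yet,
 * 0 if it is reachable and the list is usable, -2 if it is reachable
 * but the list is uninitialized (the bug the ordering prevents). */
int proto_socket_op(void)
{
    if (!registered)
        return -1;
    return local_addrs.ready ? 0 : -2;
}
```

With this ordering, `proto_socket_op()` can never observe the registered-but-uninitialized state, which corresponds to the crash window the commit message describes.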
  7. 18 Mar, 2008 1 commit
  8. 13 Mar, 2008 1 commit
    • [NET]: Fix tbench regression in 2.6.25-rc1 · f1dd9c37
      Zhang Yanmin committed
      Compared with kernel 2.6.24, the tbench result regresses with
      2.6.25-rc1.
      
      1) On stoakley with 2 quad-core processors: 4%.
      2) On tigerton with 4 quad-core processors: more than 30%.
      
      Bisection located the patch below.
      
      b4ce9277 is first bad commit
      commit b4ce9277
      Author: Herbert Xu <herbert@gondor.apana.org.au>
      Date:   Tue Nov 13 21:33:32 2007 -0800
      
          [IPV6]: Move nfheader_len into rt6_info
      
          The dst member nfheader_len is only used by IPv6.  It's also currently
          creating a rather ugly alignment hole in struct dst.  Therefore this patch
          moves it from there into struct rt6_info.
      
      The above patch changes the cache line alignment, especially of member
      __refcnt. I tested adding 2 unsigned long of padding before
      lastuse, so that the 3 members lastuse/__refcnt/__use are moved to the
      next cache line. The performance is recovered.
      
      I created a patch to rearrange the members in struct dst_entry.
      
      Following Eric's and Valdis Kletnieks's suggestions, I made a finer arrangement.
      
      1) Move tclassid under ops in case CONFIG_NET_CLS_ROUTE=y. So
         sizeof(dst_entry)=200 no matter if CONFIG_NET_CLS_ROUTE=y/n. I
         tested many patches on my 16-core tigerton by moving tclassid to
         different places. It looks like tclassid could also have an impact on
         performance.  If tclassid is moved before metrics, or not moved at
         all, the performance isn't good. So I moved it behind metrics.
      
      2) Add comments before __refcnt.
      
      On 16-core tigerton:
      
      If CONFIG_NET_CLS_ROUTE=y, the result with the patch below is about 18%
      better than the one without the patch;
      
      If CONFIG_NET_CLS_ROUTE=n, the result with the patch below is about 30%
      better than the one without the patch.
      
      With 32bit 2.6.25-rc1 on 8-core stoakley, the new patch doesn't
      introduce a regression.
      
      Thanks to Eric, Valdis, and David!
      Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
      Acked-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
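
The padding experiment described in this commit can be reproduced in miniature with `offsetof`. The sketch below assumes a 64-byte cache line and uses two hypothetical structs, `dst_before` and `dst_after`, which are not the real `struct dst_entry` layout. Note also that the commit itself rearranges members rather than keeping the padding; the padding only mirrors the test the author describes.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumed cache line size in bytes */

/* Hypothetical layout: frequently written members share cache line 0
 * with read-mostly data. */
struct dst_before {
    uint64_t read_mostly[6];  /* offsets 0..47 */
    uint64_t lastuse;         /* offset 48 */
    uint32_t refcnt;          /* offset 56 */
    uint32_t use;             /* offset 60 */
};

/* Same members with two 8-byte pads in front of lastuse, so the three
 * frequently written members start on the next cache line. */
struct dst_after {
    uint64_t read_mostly[6];  /* offsets 0..47 */
    uint64_t pad[2];          /* offsets 48..63 */
    uint64_t lastuse;         /* offset 64 */
    uint32_t refcnt;          /* offset 72 */
    uint32_t use;             /* offset 76 */
};

/* Index of the cache line that contains the given byte offset. */
size_t cache_line_of(size_t offset)
{
    return offset / CACHE_LINE;
}
```

Comparing `cache_line_of(offsetof(struct dst_before, refcnt))` with the same expression for `struct dst_after` shows the written members moving from line 0 to line 1, away from the read-mostly data.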
  9. 11 Mar, 2008 1 commit
  10. 08 Mar, 2008 1 commit
    • [NET]: Make /proc/net a symlink on /proc/self/net (v3) · e9720acd
      Pavel Emelyanov committed
      The current /proc/net is implemented with so-called "shadows", but the
      current implementation is broken and has little chance of getting fixed.
      
      The problem is that the dentry subtree of the /proc/net directory has
      fancy revalidation rules to make processes living in different
      net namespaces see different entries in the /proc/net subtree, but
      currently, tasks see in the /proc/net subdir the contents of any
      other namespace, depending on who opened the file first.
      
      The proposed fix is to turn /proc/net into a symlink, which points
      to /proc/self/net, which in turn shows what previously was in
      /proc/net - the network-related info, from the net namespace the
      appropriate task lives in.
      
      # ls -l /proc/net
      lrwxrwxrwx  1 root root 8 Mar  5 15:17 /proc/net -> self/net
      
      In other words - this behaves like /proc/mounts, but unlike
      "mounts", "net" is not a file, but a directory.
      
      Changes from v2:
      * Fixed discrepancy of /proc/net nlink count and selinux labeling
        screwup pointed out by Stephen.
      
        To get the correct nlink count the ->getattr callback for /proc/net
        is overridden to read one from the net->proc_net entry.
      
        To make selinux still work the net->proc_net entry is initialized
        properly, i.e. with the "net" name and the proc_net parent.
      
      Selinux fixes are
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      
      Changes from v1:
      * Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 06 Mar, 2008 2 commits
  12. 05 Mar, 2008 1 commit
    • [TCP]: Improve ipv4 established hash function. · 7adc3830
      David S. Miller committed
      If all of the entropy is in the local and foreign addresses,
      but xor'ing together would cancel out that entropy, the
      current hash performs poorly.
      
      Suggested by Cosmin Ratiu:
      
      	Basically, the situation is as follows: There is a client
      	machine and a server machine. Both create 15000 virtual
      	interfaces, open up a socket for each pair of interfaces and
      	do SIP traffic. By profiling I noticed that there is a lot of
      	time spent walking the established hash chains with this
      	particular setup.
      
      	The addresses were distributed like this: client interfaces
      	were 198.18.0.1/16 with increments of 1 and server interfaces
      	were 198.18.128.1/16 with increments of 1. As I said, there
      	were 15000 interfaces. Source and destination ports were 5060
      	for each connection.  So in this case, ports don't matter for
      	hashing purposes, and the bits from the address pairs used
      	cancel each other, meaning there are no differences in the
      	whole lot of pairs, so they all end up in the same hash chain.
      Signed-off-by: David S. Miller <davem@davemloft.net>
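
The cancellation described in this report is easy to check numerically. The sketch below is an illustration only: `xor_addrs()` is a hypothetical stand-in for the XOR step of a hash, not the kernel's hash function, and addresses are held in host byte order for readability (XOR is bitwise, so byte order does not change the conclusion).

```c
#include <stdint.h>

#define NPAIRS 15000u  /* number of interface pairs in the report */

/* Build an IPv4 address a.b.c.d as a 32-bit value in host byte order. */
uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Hypothetical stand-in for the XOR-folding step of a hash. */
uint32_t xor_addrs(uint32_t laddr, uint32_t faddr)
{
    return laddr ^ faddr;
}

/* Count how many of the NPAIRS (client, server) address pairs produce
 * the same XOR value as the first pair. Clients start at 198.18.0.1 and
 * servers at 198.18.128.1, both incremented by one per pair. */
unsigned count_same_xor(void)
{
    const uint32_t client0 = ipv4(198, 18, 0, 1);
    const uint32_t server0 = ipv4(198, 18, 128, 1);
    const uint32_t first = xor_addrs(client0, server0);
    unsigned same = 0;

    for (uint32_t i = 0; i < NPAIRS; i++)
        if (xor_addrs(client0 + i, server0 + i) == first)
            same++;
    return same;
}
```

Each server address differs from its client address only in one bit (value 0x8000), so every pair XORs to the same value, and with identical ports on every connection nothing else distinguishes the pairs.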
  13. 29 Feb, 2008 1 commit
  14. 24 Feb, 2008 1 commit
  15. 19 Feb, 2008 1 commit
  16. 13 Feb, 2008 4 commits
  17. 08 Feb, 2008 4 commits
  18. 07 Feb, 2008 4 commits
  19. 06 Feb, 2008 1 commit
  20. 05 Feb, 2008 4 commits
  21. 03 Feb, 2008 4 commits
    • [IPV6]: Reorg struct ifmcaddr6 to save some bytes · 246f19d1
      Arnaldo Carvalho de Melo committed
      /home/acme/git/net-2.6/net/ipv6/mcast.c:
        struct ifmcaddr6 |   -8
       1 struct changed
        igmp6_group_dropped  |   -6
        add_grec             |   -3
        mld_ifc_timer_expire |  -18
        ip6_mc_add_src       |   -3
        ip6_mc_del_src       |   -3
        igmp6_group_added    |   -3
       6 functions changed, 36 bytes removed, diff: -36
      
      ipv6.ko:
       6 functions changed, 36 bytes removed, diff: -36
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [INET_TIMEWAIT_SOCK]: Reorganize struct inet_timewait_sock to save some bytes · ad8bb780
      Arnaldo Carvalho de Melo committed
      /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c:
        struct inet_timewait_sock |   -8
        struct tcp_timewait_sock  |   -8
       2 structs changed
        tcp_v6_rcv                |   -6
       1 function changed, 6 bytes removed, diff: -6
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [INET6]: Reorganize struct inet6_dev to save 8 bytes · 4e7e5cfe
      Arnaldo Carvalho de Melo committed
      And make it a multiple of 64 bytes, reducing cacheline thrashing:
      
      Before:
      
      [acme@doppio net-2.6]$ pahole -C inet6_dev net/dccp/ipv6.o
      struct inet6_dev {
      	<SNIP>
      	long unsigned int          mc_maxdelay;          /*    48     8 */
      	unsigned char              mc_qrv;               /*    56     1 */
      	unsigned char              mc_gq_running;        /*    57     1 */
      	unsigned char              mc_ifc_count;         /*    58     1 */
      
      	/* XXX 5 bytes hole, try to pack */
      
      	/* --- cacheline 1 boundary (64 bytes) --- */
      	struct timer_list          mc_gq_timer;          /*    64    48 */
      	<SNIP>
      	__u32                      if_flags;             /*   180     4 */
      	int                        dead;                 /*   184     4 */
      	u8                         rndid[8];             /*   188     8 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	/* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */
      	struct timer_list          regen_timer;          /*   200    48 */
      
      	<SNIP>
      
      	/* size: 456, cachelines: 8 */
      	/* sum members: 447, holes: 2, sum holes: 9 */
      	/* last cacheline: 8 bytes */
      };
      
      After:
      
      net-2.6/net/ipv6/af_inet6.c:
        struct inet6_dev |   -8
       1 struct changed
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
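
The kind of saving reported by pahole above comes from reordering members so that alignment padding ("holes") disappears. A minimal sketch follows, assuming a target where `uint64_t` is 8-byte aligned (for example x86-64). `struct dev_holes` and `struct dev_packed` are hypothetical and much smaller than the real `struct inet6_dev`; they only reuse the hole sizes from the listing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with two holes, similar in shape to the
 * "Before" listing. */
struct dev_holes {
    uint64_t maxdelay;    /* offset  0, size 8 */
    uint8_t  qrv;         /* offset  8 */
    uint8_t  gq_running;  /* offset  9 */
    uint8_t  ifc_count;   /* offset 10 */
    /* 5-byte hole: the next member needs 8-byte alignment */
    uint64_t gq_timer;    /* offset 16 */
    uint32_t if_flags;    /* offset 24 */
    /* 4-byte hole */
    uint64_t regen_timer; /* offset 32 */
};                        /* size 40 */

/* Same members, largest first, so the small ones fill one slot. */
struct dev_packed {
    uint64_t maxdelay;    /* offset  0 */
    uint64_t gq_timer;    /* offset  8 */
    uint64_t regen_timer; /* offset 16 */
    uint32_t if_flags;    /* offset 24 */
    uint8_t  qrv;         /* offset 28 */
    uint8_t  gq_running;  /* offset 29 */
    uint8_t  ifc_count;   /* offset 30 */
    /* 1 byte of tail padding */
};                        /* size 32 */

/* Bytes saved by the reordering on the assumed target. */
size_t bytes_saved(void)
{
    return sizeof(struct dev_holes) - sizeof(struct dev_packed);
}
```

On the assumed target the reordered struct is 8 bytes smaller, the same size of saving as the `-8` reported for `struct inet6_dev`, though the real change moves different members.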
    • [SOCK] proto: Add hashinfo member to struct proto · ab1e0a13
      Arnaldo Carvalho de Melo committed
      This way we can remove TCP and DCCP specific versions of
      
      sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port
      sk->sk_prot->hash:     inet_hash is directly used, only v6 needs
                             a specific version to deal with mapped sockets
      sk->sk_prot->unhash:   both v4 and v6 use inet_hash directly
      
      struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so
      that inet_csk_get_port can find the per family routine.
      
      Now only the lookup routines receive as a parameter a struct inet_hashtable.
      
      With this we further reuse code, reducing the difference among INET transport
      protocols.
      
      Eventually work has to be done on UDP and SCTP to make them share this
      infrastructure and get as a bonus inet_diag interfaces so that iproute can be
      used with these protocols.
      
      net-2.6/net/ipv4/inet_hashtables.c:
        struct proto			     |   +8
        struct inet_connection_sock_af_ops |   +8
       2 structs changed
        __inet_hash_nolisten               |  +18
        __inet_hash                        | -210
        inet_put_port                      |   +8
        inet_bind_bucket_create            |   +1
        __inet_hash_connect                |   -8
       5 functions changed, 27 bytes added, 218 bytes removed, diff: -191
      
      net-2.6/net/core/sock.c:
        proto_seq_show                     |   +3
       1 function changed, 3 bytes added, diff: +3
      
      net-2.6/net/ipv4/inet_connection_sock.c:
        inet_csk_get_port                  |  +15
       1 function changed, 15 bytes added, diff: +15
      
      net-2.6/net/ipv4/tcp.c:
        tcp_set_state                      |   -7
       1 function changed, 7 bytes removed, diff: -7
      
      net-2.6/net/ipv4/tcp_ipv4.c:
        tcp_v4_get_port                    |  -31
        tcp_v4_hash                        |  -48
        tcp_v4_destroy_sock                |   -7
        tcp_v4_syn_recv_sock               |   -2
        tcp_unhash                         | -179
       5 functions changed, 267 bytes removed, diff: -267
      
      net-2.6/net/ipv6/inet6_hashtables.c:
        __inet6_hash |   +8
       1 function changed, 8 bytes added, diff: +8
      
      net-2.6/net/ipv4/inet_hashtables.c:
        inet_unhash                        | +190
        inet_hash                          | +242
       2 functions changed, 432 bytes added, diff: +432
      
      vmlinux:
       16 functions changed, 485 bytes added, 492 bytes removed, diff: -7
      
      /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c:
        tcp_v6_get_port                    |  -31
        tcp_v6_hash                        |   -7
        tcp_v6_syn_recv_sock               |   -9
       3 functions changed, 47 bytes removed, diff: -47
      
      /home/acme/git/net-2.6/net/dccp/proto.c:
        dccp_destroy_sock                  |   -7
        dccp_unhash                        | -179
        dccp_hash                          |  -49
        dccp_set_state                     |   -7
        dccp_done                          |   +1
       5 functions changed, 1 bytes added, 242 bytes removed, diff: -241
      
      /home/acme/git/net-2.6/net/dccp/ipv4.c:
        dccp_v4_get_port                   |  -31
        dccp_v4_request_recv_sock          |   -2
       2 functions changed, 33 bytes removed, diff: -33
      
      /home/acme/git/net-2.6/net/dccp/ipv6.c:
        dccp_v6_get_port                   |  -31
        dccp_v6_hash                       |   -7
        dccp_v6_request_recv_sock          |   +5
       3 functions changed, 5 bytes added, 38 bytes removed, diff: -33
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
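
The idea behind this commit (letting generic hash and unhash routines find the right table through the protocol descriptor instead of each protocol wrapping them) might be sketched as follows. All types and names here are simplified, hypothetical stand-ins: they are not the kernel's `struct proto`, `struct sock` or `struct inet_hashinfo`, and the bucket function is a placeholder.

```c
#include <stddef.h>

#define NBUCKETS 8

struct sock;

/* Hypothetical per-protocol hash table. */
struct hashinfo {
    struct sock *bucket[NBUCKETS];
};

/* Protocol descriptor carrying a pointer to its own table. */
struct proto {
    const char *name;
    struct hashinfo *hashinfo;
};

struct sock {
    const struct proto *prot;
    unsigned port;
    struct sock *next;
};

/* Generic insert: the table is reached via sk->prot, so no
 * protocol-specific wrapper is needed. */
void generic_hash(struct sock *sk)
{
    struct hashinfo *h = sk->prot->hashinfo;
    unsigned b = sk->port % NBUCKETS;

    sk->next = h->bucket[b];
    h->bucket[b] = sk;
}

/* Generic remove: walk the chain and unlink sk if it is present. */
void generic_unhash(struct sock *sk)
{
    struct hashinfo *h = sk->prot->hashinfo;
    struct sock **pp = &h->bucket[sk->port % NBUCKETS];

    while (*pp != NULL && *pp != sk)
        pp = &(*pp)->next;
    if (*pp != NULL) {
        *pp = sk->next;
        sk->next = NULL;
    }
}

/* Two hypothetical protocols, each with its own table. */
struct hashinfo tcp_table, dccp_table;
const struct proto tcp_like  = { "tcp-like",  &tcp_table };
const struct proto dccp_like = { "dccp-like", &dccp_table };
```

Both protocols share `generic_hash()` and `generic_unhash()` unchanged; only the `hashinfo` pointer in the descriptor differs, which is the code reuse the commit message is after.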
  22. 01 Feb, 2008 2 commits