1. 02 7月, 2014 2 次提交
    • E
      inet: move ipv6only in sock_common · 9fe516ba
      Eric Dumazet 提交于
      When an UDP application switches from AF_INET to AF_INET6 sockets, we
      have a small performance degradation for IPv4 communications because of
      extra cache line misses to access ipv6only information.
      
      This can also be noticed for TCP listeners, as ipv6_only_sock() is also
      used from __inet_lookup_listener()->compute_score()
      
      This is magnified when SO_REUSEPORT is used.
      
      Move ipv6only into struct sock_common so that it is available at
      no extra cost in lookups.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fe516ba
    • B
      ipv6: Allow accepting RA from local IP addresses. · d9333196
      Ben Greear 提交于
      This can be used in virtual networking applications, and
      may have other uses as well.  The option is disabled by
      default.
      
      A specific use case is setting up virtual routers, bridges, and
      hosts on a single OS without the use of network namespaces or
      virtual machines.  With proper use of ip rules, routing tables,
      veth interface pairs and/or other virtual interfaces,
      and applications that can bind to interfaces and/or IP addresses,
      it is possibly to create one or more virtual routers with multiple
      hosts attached.  The host interfaces can act as IPv6 systems,
      with radvd running on the ports in the virtual routers.  With the
      option provided in this patch enabled, those hosts can now properly
      obtain IPv6 addresses from the radvd.
      Signed-off-by: NBen Greear <greearb@candelatech.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9333196
  2. 28 6月, 2014 1 次提交
  3. 20 1月, 2014 2 次提交
  4. 19 12月, 2013 1 次提交
  5. 10 12月, 2013 2 次提交
  6. 06 12月, 2013 1 次提交
  7. 29 10月, 2013 1 次提交
  8. 10 10月, 2013 1 次提交
    • E
      inet: includes a sock_common in request_sock · 634fb979
      Eric Dumazet 提交于
      TCP listener refactoring, part 5 :
      
      We want to be able to insert request sockets (SYN_RECV) into main
      ehash table instead of the per listener hash table to allow RCU
      lookups and remove listener lock contention.
      
      This patch includes the needed struct sock_common in front
      of struct request_sock
      
      This means there is no more inet6_request_sock IPv6 specific
      structure.
      
      Following inet_request_sock fields were renamed as they became
      macros to reference fields from struct sock_common.
      Prefix ir_ was chosen to avoid name collisions.
      
      loc_port   -> ir_loc_port
      loc_addr   -> ir_loc_addr
      rmt_addr   -> ir_rmt_addr
      rmt_port   -> ir_rmt_port
      iif        -> ir_iif
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      634fb979
  9. 09 10月, 2013 1 次提交
    • E
      ipv6: make lookups simpler and faster · efe4208f
      Eric Dumazet 提交于
      TCP listener refactoring, part 4 :
      
      To speed up inet lookups, we moved IPv4 addresses from inet to struct
      sock_common
      
      Now is time to do the same for IPv6, because it permits us to have fast
      lookups for all kind of sockets, including upcoming SYN_RECV.
      
      Getting IPv6 addresses in TCP lookups currently requires two extra cache
      lines, plus a dereference (and memory stall).
      
      inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6
      
      This patch is way bigger than its IPv4 counter part, because for IPv4,
      we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
      it's not doable easily.
      
      inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
      inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr
      
      And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
      at the same offset.
      
      We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
      macro.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efe4208f
  10. 04 10月, 2013 1 次提交
    • E
      inet: consolidate INET_TW_MATCH · 50805466
      Eric Dumazet 提交于
      TCP listener refactoring, part 2 :
      
      We can use a generic lookup, sockets being in whatever state, if
      we are sure all relevant fields are at the same place in all socket
      types (ESTABLISH, TIME_WAIT, SYN_RECV)
      
      This patch removes these macros :
      
       inet_addrpair, inet_addrpair, tw_addrpair, tw_portpair
      
      And adds :
      
       sk_portpair, sk_addrpair, sk_daddr, sk_rcv_saddr
      
      Then, INET_TW_MATCH() is really the same than INET_MATCH()
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      50805466
  11. 30 8月, 2013 1 次提交
  12. 20 8月, 2013 1 次提交
  13. 14 8月, 2013 1 次提交
    • H
      ipv6: make unsolicited report intervals configurable for mld · fc4eba58
      Hannes Frederic Sowa 提交于
      Commit cab70040 ("net: igmp:
      Reduce Unsolicited report interval to 1s when using IGMPv3") and
      2690048c ("net: igmp: Allow user-space
      configuration of igmp unsolicited report interval") by William Manley made
      igmp unsolicited report intervals configurable per interface and corrected
      the interval of unsolicited igmpv3 report messages resendings to 1s.
      
      Same needs to be done for IPv6:
      
      MLDv1 (RFC2710 7.10.): 10 seconds
      MLDv2 (RFC3810 9.11.): 1 second
      
      Both intervals are configurable via new procfs knobs
      mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.
      
      (also added .force_mld_version to ipv6_devconf_dflt to bring structs in
      line without semantic changes)
      
      v2:
      a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
         unsolicited_report_interval procfs knobs.
      b) incorporate stylistic feedback from William Manley
      
      v3:
      a) add new DEVCONF_* values to the end of the enum (thanks to David
         Miller)
      
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: William Manley <william.manley@youview.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc4eba58
  14. 31 1月, 2013 1 次提交
  15. 14 1月, 2013 2 次提交
  16. 09 12月, 2012 1 次提交
  17. 01 12月, 2012 1 次提交
    • E
      net: move inet_dport/inet_num in sock_common · ce43b03e
      Eric Dumazet 提交于
      commit 68835aba (net: optimize INET input path further)
      moved some fields used for tcp/udp sockets lookup in the first cache
      line of struct sock_common.
      
      This patch moves inet_dport/inet_num as well, filling a 32bit hole
      on 64 bit arches and reducing number of cache line misses in lookups.
      
      Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
      before addresses match, as this check is more discriminant.
      
      Remove the hash check from MATCH() macros because we dont need to
      re validate the hash value after taking a refcount on socket, and
      use likely/unlikely compiler hints, as the sk_hash/hash check
      makes the following conditional tests 100% predicted by cpu.
      
      Introduce skc_addrpair/skc_portpair pair values to better
      document the alignment requirements of the port/addr pairs
      used in the various MATCH() macros, and remove some casts.
      
      The namespace check can also be done at last.
      
      This slightly improves TCP/UDP lookup times.
      
      IP/TCP early demux needs inet->rx_dst_ifindex and
      TCP needs inet->min_ttl, lets group them together in same cache line.
      
      With help from Ben Hutchings & Joe Perches.
      
      Idea of this patch came after Ling Ma proposal to move skc_hash
      to the beginning of struct sock_common, and should allow him
      to submit a final version of his patch. My tests show an improvement
      doing so.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ling Ma <ling.ma.program@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce43b03e
  18. 14 11月, 2012 1 次提交
  19. 13 10月, 2012 1 次提交
  20. 30 8月, 2012 1 次提交
    • P
      netfilter: nf_conntrack_ipv6: improve fragmentation handling · 4cdd3408
      Patrick McHardy 提交于
      The IPv6 conntrack fragmentation currently has a couple of shortcomings.
      Fragmentes are collected in PREROUTING/OUTPUT, are defragmented, the
      defragmented packet is then passed to conntrack, the resulting conntrack
      information is attached to each original fragment and the fragments then
      continue their way through the stack.
      
      Helper invocation occurs in the POSTROUTING hook, at which point only
      the original fragments are available. The result of this is that
      fragmented packets are never passed to helpers.
      
      This patch improves the situation in the following way:
      
      - If a reassembled packet belongs to a connection that has a helper
        assigned, the reassembled packet is passed through the stack instead
        of the original fragments.
      
      - During defragmentation, the largest received fragment size is stored.
        On output, the packet is refragmented if required. If the largest
        received fragment size exceeds the outgoing MTU, a "packet too big"
        message is generated, thus behaving as if the original fragments
        were passed through the stack from an outside point of view.
      
      - The ipv6_helper() hook function can't receive fragments anymore for
        connections using a helper, so it is switched to use ipv6_skip_exthdr()
        instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the
        reassembled packets are passed to connection tracking helpers.
      
      The result of this is that we can properly track fragmented packets, but
      still generate ICMPv6 Packet too big messages if we would have before.
      
      This patch is also required as a precondition for IPv6 NAT, where NAT
      helpers might enlarge packets up to a point that they require
      fragmentation. In that case we can't generate Packet too big messages
      since the proper MTU can't be calculated in all cases (f.i. when
      changing textual representation of a variable amount of addresses),
      so the packet is transparently fragmented iff the original packet or
      fragments would have fit the outgoing MTU.
      
      IPVS parts by Jesper Dangaard Brouer <brouer@redhat.com>.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      4cdd3408
  21. 07 8月, 2012 1 次提交
  22. 18 7月, 2012 1 次提交
  23. 11 7月, 2012 1 次提交
  24. 13 2月, 2012 2 次提交
  25. 09 2月, 2012 1 次提交
  26. 12 12月, 2011 1 次提交
  27. 25 11月, 2010 1 次提交
  28. 21 10月, 2010 1 次提交
  29. 23 8月, 2010 1 次提交
  30. 20 7月, 2010 1 次提交
    • D
      ipv6: Make IP6CB(skb)->nhoff 16-bit. · e7c38157
      David S. Miller 提交于
      Even with jumbograms I cannot see any way in which we would need
      to records a larger than 65535 valued next-header offset.
      
      The maximum extension header length is (256 << 3) == 2048.
      There are only a handful of extension headers specified which
      we'd even accept (say 5 or 6), therefore the largest next-header
      offset we'd ever have to contend with is something less than
      say 16k.
      
      Therefore make it a u16 instead of a u32.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e7c38157
  31. 03 6月, 2010 1 次提交
  32. 11 5月, 2010 1 次提交
    • P
      ipv6: ip6mr: support multiple tables · d1db275d
      Patrick McHardy 提交于
      This patch adds support for multiple independant multicast routing instances,
      named "tables".
      
      Userspace multicast routing daemons can bind to a specific table instance by
      issuing a setsockopt call using a new option MRT6_TABLE. The table number is
      stored in the raw socket data and affects all following ip6mr setsockopt(),
      getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
      is created with a default routing rule pointing to it. Newly created pim6reg
      devices have the table number appended ("pim6regX"), with the exception of
      devices created in the default table, which are named just "pim6reg" for
      compatibility reasons.
      
      Packets are directed to a specific table instance using routing rules,
      similar to how regular routing rules work. Currently iif, oif and mark
      are supported as keys, source and destination addresses could be supported
      additionally.
      
      Example usage:
      
      - bind pimd/xorp/... to a specific table:
      
      uint32_t table = 123;
      setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));
      
      - create routing rules directing packets to the new table:
      
      # ip -6 mrule add iif eth0 lookup 123
      # ip -6 mrule add oif eth0 lookup 123
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      d1db275d
  33. 24 4月, 2010 2 次提交
  34. 23 4月, 2010 1 次提交
    • S
      IPv6: Generic TTL Security Mechanism (final version) · e802af9c
      Stephen Hemminger 提交于
      This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism.  
      
      Not to users of mapped address; the IPV6 and IPV4 socket options are seperate.
      The server does have to deal with both IPv4 and IPv6 socket options
      and the client has to handle the different for each family.
      
      On client:
      	int ttl = 255;
      	getaddrinfo(argv[1], argv[2], &hint, &result);
      
      	for (rp = result; rp != NULL; rp = rp->ai_next) {
      		s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
      		if (s < 0) continue;
      
      		if (rp->ai_family == AF_INET) {
      			setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
      		} else if (rp->ai_family == AF_INET6) {
      			setsockopt(s, IPPROTO_IPV6,  IPV6_UNICAST_HOPS, 
      					&ttl, sizeof(ttl)))
      		}
      			
      		if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) {
      		   ...
      
      On server:
      	int minttl = 255 - maxhops;
         
      	getaddrinfo(NULL, port, &hints, &result);
      	for (rp = result; rp != NULL; rp = rp->ai_next) {
      		s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
      		if (s < 0) continue;
      
      		if (rp->ai_family == AF_INET6)
      			setsockopt(s, IPPROTO_IPV6,  IPV6_MINHOPCOUNT,
      					&minttl, sizeof(minttl));
      		setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl));
      			
      		if (bind(s, rp->ai_addr, rp->ai_addrlen) == 0)
      			break
      ...
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e802af9c