1. 02 7月, 2014 1 次提交
    • E
      inet: move ipv6only in sock_common · 9fe516ba
      Eric Dumazet 提交于
      When an UDP application switches from AF_INET to AF_INET6 sockets, we
      have a small performance degradation for IPv4 communications because of
      extra cache line misses to access ipv6only information.
      
      This can also be noticed for TCP listeners, as ipv6_only_sock() is also
      used from __inet_lookup_listener()->compute_score()
      
      This is magnified when SO_REUSEPORT is used.
      
      Move ipv6only into struct sock_common so that it is available at
      no extra cost in lookups.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fe516ba
  2. 24 5月, 2014 1 次提交
  3. 08 5月, 2014 1 次提交
    • W
      net: clean up snmp stats code · 698365fa
      WANG Cong 提交于
      commit 8f0ea0fe (snmp: reduce percpu needs by 50%)
      reduced snmp array size to 1, so technically it doesn't have to be
      an array any more. What's more, after the following commit:
      
      	commit 933393f5
      	Date:   Thu Dec 22 11:58:51 2011 -0600
      
      	    percpu: Remove irqsafe_cpu_xxx variants
      
      	    We simply say that regular this_cpu use must be safe regardless of
      	    preemption and interrupt state.  That has no material change for x86
      	    and s390 implementations of this_cpu operations.  However, arches that
      	    do not provide their own implementation for this_cpu operations will
      	    now get code generated that disables interrupts instead of preemption.
      
      probably no arch wants to have SNMP_ARRAY_SZ == 2. At least after
      almost 3 years, no one complains.
      
      So, just convert the array to a single pointer and remove snmp_mib_init()
      and snmp_mib_free() as well.
      
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      698365fa
  4. 20 1月, 2014 1 次提交
  5. 19 12月, 2013 1 次提交
  6. 10 12月, 2013 1 次提交
  7. 06 12月, 2013 1 次提交
  8. 19 11月, 2013 1 次提交
  9. 06 11月, 2013 1 次提交
    • J
      net: Explicitly initialize u64_stats_sync structures for lockdep · 827da44c
      John Stultz 提交于
      In order to enable lockdep on seqcount/seqlock structures, we
      must explicitly initialize any locks.
      
      The u64_stats_sync structure, uses a seqcount, and thus we need
      to introduce a u64_stats_init() function and use it to initialize
      the structure.
      
      This unfortunately adds a lot of fairly trivial initialization code
      to a number of drivers. But the benefit of ensuring correctness makes
      this worth while.
      
      Because these changes are required for lockdep to be enabled, and the
      changes are quite trivial, I've not yet split this patch out into 30-some
      separate patches, as I figured it would be better to get the various
      maintainers thoughts on how to best merge this change along with
      the seqcount lockdep enablement.
      
      Feedback would be appreciated!
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Acked-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Mirko Lindner <mlindner@marvell.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Roger Luethi <rl@hellgate.ch>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Wensong Zhang <wensong@linux-vs.org>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      827da44c
  10. 22 10月, 2013 1 次提交
  11. 20 10月, 2013 1 次提交
  12. 09 10月, 2013 1 次提交
    • E
      ipv6: make lookups simpler and faster · efe4208f
      Eric Dumazet 提交于
      TCP listener refactoring, part 4 :
      
      To speed up inet lookups, we moved IPv4 addresses from inet to struct
      sock_common
      
      Now is time to do the same for IPv6, because it permits us to have fast
      lookups for all kind of sockets, including upcoming SYN_RECV.
      
      Getting IPv6 addresses in TCP lookups currently requires two extra cache
      lines, plus a dereference (and memory stall).
      
      inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6
      
      This patch is way bigger than its IPv4 counter part, because for IPv4,
      we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
      it's not doable easily.
      
      inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
      inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr
      
      And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
      at the same offset.
      
      We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
      macro.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efe4208f
  13. 24 9月, 2013 1 次提交
    • C
      ipv6: do not allow ipv6 module to be removed · 8ce44061
      Cong Wang 提交于
      There was some bug report on ipv6 module removal path before.
      Also, as Stephen pointed out, after vxlan module gets ipv6 support,
      the ipv6 stub it used is not safe against this module removal either.
      So, let's just remove inet6_exit() so that ipv6 module will not be
      able to be unloaded.
      
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8ce44061
  14. 12 9月, 2013 1 次提交
    • M
      ipv6: don't call fib6_run_gc() until routing is ready · 2c861cc6
      Michal Kubeček 提交于
      When loading the ipv6 module, ndisc_init() is called before
      ip6_route_init(). As the former registers a handler calling
      fib6_run_gc(), this opens a window to run the garbage collector
      before necessary data structures are initialized. If a network
      device is initialized in this window, adding MAC address to it
      triggers a NETDEV_CHANGEADDR event, leading to a crash in
      fib6_clean_all().
      
      Take the event handler registration out of ndisc_init() into a
      separate function ndisc_late_init() and move it after
      ip6_route_init().
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c861cc6
  15. 01 9月, 2013 3 次提交
  16. 01 8月, 2013 1 次提交
  17. 26 5月, 2013 1 次提交
    • L
      net: ipv6: Add IPv6 support to the ping socket. · 6d0bfe22
      Lorenzo Colitti 提交于
      This adds the ability to send ICMPv6 echo requests without a
      raw socket. The equivalent ability for ICMPv4 was added in
      2011.
      
      Instead of having separate code paths for IPv4 and IPv6, make
      most of the code in net/ipv4/ping.c dual-stack and only add a
      few IPv6-specific bits (like the protocol definition) to a new
      net/ipv6/ping.c. Hopefully this will reduce divergence and/or
      duplication of bugs in the future.
      
      Caveats:
      
      - Setting options via ancillary data (e.g., using IPV6_PKTINFO
        to specify the outgoing interface) is not yet supported.
      - There are no separate security settings for IPv4 and IPv6;
        everything is controlled by /proc/net/ipv4/ping_group_range.
      - The proc interface does not yet display IPv6 ping sockets
        properly.
      
      Tested with a patched copy of ping6 and using raw socket calls.
      Compiles and works with all of CONFIG_IPV6={n,m,y}.
      Signed-off-by: NLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d0bfe22
  18. 27 3月, 2013 1 次提交
    • P
      GRE: Refactor GRE tunneling code. · c5441932
      Pravin B Shelar 提交于
      Following patch refactors GRE code into ip tunneling code and GRE
      specific code. Common tunneling code is moved to ip_tunnel module.
      ip_tunnel module is written as generic library which can be used
      by different tunneling implementations.
      
      ip_tunnel module contains following components:
       - packet xmit and rcv generic code. xmit flow looks like
         (gre_xmit/ipip_xmit)->ip_tunnel_xmit->ip_local_out.
       - hash table of all devices.
       - lookup for tunnel devices.
       - control plane operations like device create, destroy, ioctl, netlink
         operations code.
       - registration for tunneling modules, like gre, ipip etc.
       - define single pcpu_tstats dev->tstats.
       - struct tnl_ptk_info added to pass parsed tunnel packet parameters.
      
      ipip.h header is renamed to ip_tunnel.h
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5441932
  19. 09 3月, 2013 1 次提交
  20. 10 1月, 2013 1 次提交
  21. 19 11月, 2012 2 次提交
    • E
      net: Make CAP_NET_BIND_SERVICE per user namespace · 3594698a
      Eric W. Biederman 提交于
      Allow privileged users in any user namespace to bind to
      privileged sockets in network namespaces they control.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3594698a
    • E
      net: Allow userns root to control ipv6 · af31f412
      Eric W. Biederman 提交于
      Allow an unpriviled user who has created a user namespace, and then
      created a network namespace to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
      CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
      
      Settings that merely control a single network device are allowed.
      Either the network device is a logical network device where
      restrictions make no difference or the network device is hardware NIC
      that has been explicity moved from the initial network namespace.
      
      In general policy and network stack state changes are allowed while
      resource control is left unchanged.
      
      Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
      Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
      Allow the SIOCADDRT ioctl to add ipv6 routes.
      Allow the SIOCDELRT ioctl to delete ipv6 routes.
      
      Allow creation of ipv6 raw sockets.
      
      Allow setting the IPV6_JOIN_ANYCAST socket option.
      Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
      socket option.
      
      Allow setting the IPV6_TRANSPARENT socket option.
      Allow setting the IPV6_HOPOPTS socket option.
      Allow setting the IPV6_RTHDRDSTOPTS socket option.
      Allow setting the IPV6_DSTOPTS socket option.
      Allow setting the IPV6_IPSEC_POLICY socket option.
      Allow setting the IPV6_XFRM_POLICY socket option.
      
      Allow sending packets with the IPV6_2292HOPOPTS control message.
      Allow sending packets with the IPV6_2292DSTOPTS control message.
      Allow sending packets with the IPV6_RTHDRDSTOPTS control message.
      
      Allow setting the multicast routing socket options on non multicast
      routing sockets.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
      setting up, changing and deleting tunnels over ipv6.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
      setting up, changing and deleting ipv6 over ipv4 tunnels.
      
      Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
      deleting, and changing the potential router list for ISATAP tunnels.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af31f412
  22. 16 11月, 2012 4 次提交
  23. 09 10月, 2012 1 次提交
    • E
      ipv6: gro: fix PV6_GRO_CB(skb)->proto problem · 86347245
      Eric Dumazet 提交于
      It seems IPV6_GRO_CB(skb)->proto can be destroyed in skb_gro_receive()
      if a new skb is allocated (to serve as an anchor for frag_list)
      
      We copy NAPI_GRO_CB() only (not the IPV6 specific part) in :
      
      *NAPI_GRO_CB(nskb) = *NAPI_GRO_CB(p);
      
      So we leave IPV6_GRO_CB(nskb)->proto to 0 (fresh skb allocation) instead
      of IPPROTO_TCP (6)
      
      ipv6_gro_complete() isnt able to call ops->gro_complete()
      [ tcp6_gro_complete() ]
      
      Fix this by moving proto in NAPI_GRO_CB() and getting rid of
      IPV6_GRO_CB
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86347245
  24. 08 10月, 2012 1 次提交
  25. 18 5月, 2012 1 次提交
  26. 16 5月, 2012 1 次提交
  27. 12 5月, 2012 1 次提交
    • E
      net/ipv6/af_inet6.c: checkpatch cleanup · 647c0c70
      Eldad Zack 提交于
      af_inet6.c:80: ERROR: do not initialise statics to 0 or NULL
      af_inet6.c:259: ERROR: spaces required around that '=' (ctx:VxV)
      af_inet6.c:394: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:412: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:422: ERROR: do not use assignment in if condition
      af_inet6.c:425: ERROR: do not use assignment in if condition
      af_inet6.c:433: ERROR: do not use assignment in if condition
      af_inet6.c:437: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:446: ERROR: spaces required around that '=' (ctx:VxV)
      af_inet6.c:478: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:485: ERROR: that open brace { should be on the previous line
      af_inet6.c:485: ERROR: space required before the open parenthesis '('
      af_inet6.c:513: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:629: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:647: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:687: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:709: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
      af_inet6.c:1073: ERROR: space required before the open parenthesis '('
      Signed-off-by: NEldad Zack <eldad@fogrefinery.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      647c0c70
  28. 22 4月, 2012 1 次提交
  29. 21 4月, 2012 1 次提交
  30. 29 3月, 2012 1 次提交
  31. 13 2月, 2012 1 次提交
    • J
      net: implement IP_RECVTOS for IP_PKTOPTIONS · 4c507d28
      Jiri Benc 提交于
      Currently, it is not easily possible to get TOS/DSCP value of packets from
      an incoming TCP stream. The mechanism is there, IP_PKTOPTIONS getsockopt
      with IP_RECVTOS set, the same way as incoming TTL can be queried. This is
      not actually implemented for TOS, though.
      
      This patch adds this functionality, both for IPv4 (IP_PKTOPTIONS) and IPv6
      (IPV6_2292PKTOPTIONS). For IPv4, like in the IP_RECVTTL case, the value of
      the TOS field is stored from the other party's ACK.
      
      This is needed for proxies which require DSCP transparency. One such example
      is at http://zph.bratcheda.org/.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c507d28
  32. 13 12月, 2011 1 次提交
  33. 23 11月, 2011 1 次提交
  34. 17 11月, 2011 1 次提交