1. 08 1月, 2013 1 次提交
  2. 27 12月, 2012 1 次提交
  3. 20 12月, 2012 1 次提交
  4. 17 12月, 2012 5 次提交
    • H
      netfilter: nf_ct_reasm: fix conntrack reassembly expire code · 97cf00e9
      Haibo Xi 提交于
      Commit b836c99f (ipv6: unify conntrack reassembly expire
      code with standard one) use the standard IPv6 reassembly
      code(ip6_expire_frag_queue) to handle conntrack reassembly expire.
      
      In ip6_expire_frag_queue, it invoke dev_get_by_index_rcu to get
      which device received this expired packet.so we must save ifindex
      when NF_conntrack get this packet.
      
      With this patch applied, I can see ICMP Time Exceeded sent
      from the receiver when the sender sent out 1/2 fragmented
      IPv6 packet.
      Signed-off-by: NHaibo Xi <haibbo@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      97cf00e9
    • F
      netfilter: nf_conntrack_ipv6: fix comment for packets without data · d7a769ff
      Florent Fourcot 提交于
      Remove ambiguity of double negation.
      Signed-off-by: NFlorent Fourcot <florent.fourcot@enst-bretagne.fr>
      Acked-by: NRick Jones <rick.jones2@hp.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      d7a769ff
    • A
      netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE · c65ef8dc
      Andrew Collins 提交于
      Since (a0ecb85a netfilter: nf_nat: Handle routing changes in MASQUERADE
      target), the MASQUERADE target handles routing changes which affect
      the output interface of a connection, but only for ESTABLISHED
      connections.  It is also possible for NEW connections which
      already have a conntrack entry to be affected by routing changes.
      
      This adds a check to drop entries in the NEW+conntrack state
      when the oif has changed.
      Signed-off-by: NAndrew Collins <bsderandrew@gmail.com>
      Acked-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      c65ef8dc
    • M
      netfilter: ip[6]t_REJECT: fix wrong transport header pointer in TCP reset · c6f40899
      Mukund Jampala 提交于
      The problem occurs when iptables constructs the tcp reset packet.
      It doesn't initialize the pointer to the tcp header within the skb.
      When the skb is passed to the ixgbe driver for transmit, the ixgbe
      driver attempts to access the tcp header and crashes.
      Currently, other drivers (such as our 1G e1000e or igb drivers) don't
      access the tcp header on transmit unless the TSO option is turned on.
      
      <1>BUG: unable to handle kernel NULL pointer dereference at 0000000d
      <1>IP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe]
      <4>*pdpt = 0000000085e5d001 *pde = 0000000000000000
      <0>Oops: 0000 [#1] SMP
      [...]
      <4>Pid: 0, comm: swapper Tainted: P            2.6.35.12 #1 Greencity/Thurley
      <4>EIP: 0060:[<d081621c>] EFLAGS: 00010246 CPU: 16
      <4>EIP is at ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe]
      <4>EAX: c7628820 EBX: 00000007 ECX: 00000000 EDX: 00000000
      <4>ESI: 00000008 EDI: c6882180 EBP: dfc6b000 ESP: ced95c48
      <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
      <0>Process swapper (pid: 0, ti=ced94000 task=ced73bd0 task.ti=ced94000)
      <0>Stack:
      <4> cbec7418 c779e0d8 c77cc888 c77cc8a8 0903010a 00000000 c77c0008 00000002
      <4><0> cd4997c0 00000010 dfc6b000 00000000 d0d176c9 c77cc8d8 c6882180 cbec7318
      <4><0> 00000004 00000004 cbec7230 cbec7110 00000000 cbec70c0 c779e000 00000002
      <0>Call Trace:
      <4> [<d0d176c9>] ? 0xd0d176c9
      <4> [<d0d18a4d>] ? 0xd0d18a4d
      <4> [<411e243e>] ? dev_hard_start_xmit+0x218/0x2d7
      <4> [<411f03d7>] ? sch_direct_xmit+0x4b/0x114
      <4> [<411f056a>] ? __qdisc_run+0xca/0xe0
      <4> [<411e28b0>] ? dev_queue_xmit+0x2d1/0x3d0
      <4> [<411e8120>] ? neigh_resolve_output+0x1c5/0x20f
      <4> [<411e94a1>] ? neigh_update+0x29c/0x330
      <4> [<4121cf29>] ? arp_process+0x49c/0x4cd
      <4> [<411f80c9>] ? nf_hook_slow+0x3f/0xac
      <4> [<4121ca8d>] ? arp_process+0x0/0x4cd
      <4> [<4121ca8d>] ? arp_process+0x0/0x4cd
      <4> [<4121c6d5>] ? T.901+0x38/0x3b
      <4> [<4121c918>] ? arp_rcv+0xa3/0xb4
      <4> [<4121ca8d>] ? arp_process+0x0/0x4cd
      <4> [<411e1173>] ? __netif_receive_skb+0x32b/0x346
      <4> [<411e19e1>] ? netif_receive_skb+0x5a/0x5f
      <4> [<411e1ea9>] ? napi_skb_finish+0x1b/0x30
      <4> [<d0816eb4>] ? ixgbe_xmit_frame_ring+0x1564/0x2260 [ixgbe]
      <4> [<41013468>] ? lapic_next_event+0x13/0x16
      <4> [<410429b2>] ? clockevents_program_event+0xd2/0xe4
      <4> [<411e1b03>] ? net_rx_action+0x55/0x127
      <4> [<4102da1a>] ? __do_softirq+0x77/0xeb
      <4> [<4102dab1>] ? do_softirq+0x23/0x27
      <4> [<41003a67>] ? do_IRQ+0x7d/0x8e
      <4> [<41002a69>] ? common_interrupt+0x29/0x30
      <4> [<41007bcf>] ? mwait_idle+0x48/0x4d
      <4> [<4100193b>] ? cpu_idle+0x37/0x4c
      <0>Code: df 09 d7 0f 94 c2 0f b6 d2 e9 e7 fb ff ff 31 db 31 c0 e9 38
      ff ff ff 80 78 06 06 0f 85 3e fb ff ff 8b 7c 24 38 8b 8f b8 00 00 00
      <0f> b6 51 0d f6 c2 01 0f 85 27 fb ff ff 80 e2 02 75 0d 8b 6c 24
      <0>EIP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] SS:ESP
      Signed-off-by: NMukund Jampala <jbmukund@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      c6f40899
    • S
      ipv6: Fix Makefile offload objects · df484191
      Simon Arlott 提交于
      The following commit breaks IPv6 TCP transmission for me:
      	Commit 75fe83c3
      	Author: Vlad Yasevich <vyasevic@redhat.com>
      	Date:   Fri Nov 16 09:41:21 2012 +0000
      	ipv6: Preserve ipv6 functionality needed by NET
      
      This patch fixes the typo "ipv6_offload" which should be
      "ipv6-offload".
      
      I don't know why not including the offload modules should
      break TCP. Disabling all offload options on the NIC didn't
      help. Outgoing pulseaudio traffic kept stalling.
      Signed-off-by: NSimon Arlott <simon@fire.lp0.eu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      df484191
  5. 15 12月, 2012 2 次提交
    • C
      inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock · e337e24d
      Christoph Paasch 提交于
      If in either of the above functions inet_csk_route_child_sock() or
      __inet_inherit_port() fails, the newsk will not be freed:
      
      unreferenced object 0xffff88022e8a92c0 (size 1592):
        comm "softirq", pid 0, jiffies 4294946244 (age 726.160s)
        hex dump (first 32 bytes):
          0a 01 01 01 0a 01 01 02 00 00 00 00 a7 cc 16 00  ................
          02 00 03 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff8153d190>] kmemleak_alloc+0x21/0x3e
          [<ffffffff810ab3e7>] kmem_cache_alloc+0xb5/0xc5
          [<ffffffff8149b65b>] sk_prot_alloc.isra.53+0x2b/0xcd
          [<ffffffff8149b784>] sk_clone_lock+0x16/0x21e
          [<ffffffff814d711a>] inet_csk_clone_lock+0x10/0x7b
          [<ffffffff814ebbc3>] tcp_create_openreq_child+0x21/0x481
          [<ffffffff814e8fa5>] tcp_v4_syn_recv_sock+0x3a/0x23b
          [<ffffffff814ec5ba>] tcp_check_req+0x29f/0x416
          [<ffffffff814e8e10>] tcp_v4_do_rcv+0x161/0x2bc
          [<ffffffff814eb917>] tcp_v4_rcv+0x6c9/0x701
          [<ffffffff814cea9f>] ip_local_deliver_finish+0x70/0xc4
          [<ffffffff814cec20>] ip_local_deliver+0x4e/0x7f
          [<ffffffff814ce9f8>] ip_rcv_finish+0x1fc/0x233
          [<ffffffff814cee68>] ip_rcv+0x217/0x267
          [<ffffffff814a7bbe>] __netif_receive_skb+0x49e/0x553
          [<ffffffff814a7cc3>] netif_receive_skb+0x50/0x82
      
      This happens, because sk_clone_lock initializes sk_refcnt to 2, and thus
      a single sock_put() is not enough to free the memory. Additionally, things
      like xfrm, memcg, cookie_values,... may have been initialized.
      We have to free them properly.
      
      This is fixed by forcing a call to tcp_done(), ending up in
      inet_csk_destroy_sock, doing the final sock_put(). tcp_done() is necessary,
      because it ends up doing all the cleanup on xfrm, memcg, cookie_values,
      xfrm,...
      
      Before calling tcp_done, we have to set the socket to SOCK_DEAD, to
      force it entering inet_csk_destroy_sock. To avoid the warning in
      inet_csk_destroy_sock, inet_num has to be set to 0.
      As inet_csk_destroy_sock does a dec on orphan_count, we first have to
      increase it.
      
      Calling tcp_done() allows us to remove the calls to
      tcp_clear_xmit_timer() and tcp_cleanup_congestion_control().
      
      A similar approach is taken for dccp by calling dccp_done().
      
      This is in the kernel since 093d2823 (tproxy: fix hash locking issue
      when using port redirection in __inet_inherit_port()), thus since
      version >= 2.6.37.
      Signed-off-by: NChristoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e337e24d
    • D
      ipv6: Change skb->data before using icmpv6_notify() to propagate redirect · 093d04d4
      Duan Jiong 提交于
      In function ndisc_redirect_rcv(), the skb->data points to the transport
      header, but function icmpv6_notify() need the skb->data points to the
      inner IP packet. So before using icmpv6_notify() to propagate redirect,
      change skb->data to point the inner IP packet that triggered the sending
      of the Redirect, and introduce struct rd_msg to make it easy.
      Signed-off-by: NDuan Jiong <djduanjiong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      093d04d4
  6. 14 12月, 2012 1 次提交
  7. 13 12月, 2012 1 次提交
  8. 06 12月, 2012 1 次提交
  9. 05 12月, 2012 8 次提交
  10. 04 12月, 2012 2 次提交
  11. 03 12月, 2012 1 次提交
  12. 02 12月, 2012 1 次提交
  13. 01 12月, 2012 1 次提交
    • E
      net: move inet_dport/inet_num in sock_common · ce43b03e
      Eric Dumazet 提交于
      commit 68835aba (net: optimize INET input path further)
      moved some fields used for tcp/udp sockets lookup in the first cache
      line of struct sock_common.
      
      This patch moves inet_dport/inet_num as well, filling a 32bit hole
      on 64 bit arches and reducing number of cache line misses in lookups.
      
      Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
      before addresses match, as this check is more discriminant.
      
      Remove the hash check from MATCH() macros because we dont need to
      re validate the hash value after taking a refcount on socket, and
      use likely/unlikely compiler hints, as the sk_hash/hash check
      makes the following conditional tests 100% predicted by cpu.
      
      Introduce skc_addrpair/skc_portpair pair values to better
      document the alignment requirements of the port/addr pairs
      used in the various MATCH() macros, and remove some casts.
      
      The namespace check can also be done at last.
      
      This slightly improves TCP/UDP lookup times.
      
      IP/TCP early demux needs inet->rx_dst_ifindex and
      TCP needs inet->min_ttl, lets group them together in same cache line.
      
      With help from Ben Hutchings & Joe Perches.
      
      Idea of this patch came after Ling Ma proposal to move skc_hash
      to the beginning of struct sock_common, and should allow him
      to submit a final version of his patch. My tests show an improvement
      doing so.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ling Ma <ling.ma.program@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce43b03e
  14. 29 11月, 2012 1 次提交
    • N
      ip6tnl/sit: drop packet if ECN present with not-ECT · f4e0b4c5
      Nicolas Dichtel 提交于
      This patch reports the change made by Stephen Hemminger in ipip and gre[6] in
      commit eccc1bb8 (tunnel: drop packet if ECN present with not-ECT).
      
      Goal is to handle RFC6040, Section 4.2:
      
      Default Tunnel Egress Behaviour.
       o If the inner ECN field is Not-ECT, the decapsulator MUST NOT
            propagate any other ECN codepoint onwards.  This is because the
            inner Not-ECT marking is set by transports that rely on dropped
            packets as an indication of congestion and would not understand or
            respond to any other ECN codepoint [RFC4774].  Specifically:
      
            *  If the inner ECN field is Not-ECT and the outer ECN field is
               CE, the decapsulator MUST drop the packet.
      
            *  If the inner ECN field is Not-ECT and the outer ECN field is
               Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the
               outgoing packet with the ECN field cleared to Not-ECT.
      
      The patch takes benefits from common function added in net/inet_ecn.h.
      
      Like it was done for Xin4 tunnels, it adds logging to allow detecting broken
      systems that set ECN bits incorrectly when tunneling (or an intermediate
      router might be changing the header). Errors are also tracked via
      rx_frame_error.
      
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4e0b4c5
  15. 27 11月, 2012 1 次提交
  16. 26 11月, 2012 1 次提交
  17. 23 11月, 2012 1 次提交
  18. 21 11月, 2012 2 次提交
  19. 19 11月, 2012 6 次提交
    • E
      net: Make CAP_NET_BIND_SERVICE per user namespace · 3594698a
      Eric W. Biederman 提交于
      Allow privileged users in any user namespace to bind to
      privileged sockets in network namespaces they control.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3594698a
    • E
      net: Enable a userns root rtnl calls that are safe for unprivilged users · b51642f6
      Eric W. Biederman 提交于
      - Only allow moving network devices to network namespaces you have
        CAP_NET_ADMIN privileges over.
      
      - Enable creating/deleting/modifying interfaces
      - Enable adding/deleting addresses
      - Enable adding/setting/deleting neighbour entries
      - Enable adding/removing routes
      - Enable adding/removing fib rules
      - Enable setting the forwarding state
      - Enable adding/removing ipv6 address labels
      - Enable setting bridge parameter
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b51642f6
    • E
      net: Enable some sysctls that are safe for the userns root · c027aab4
      Eric W. Biederman 提交于
      - Enable the per device ipv4 sysctls:
         net/ipv4/conf/<if>/forwarding
         net/ipv4/conf/<if>/mc_forwarding
         net/ipv4/conf/<if>/accept_redirects
         net/ipv4/conf/<if>/secure_redirects
         net/ipv4/conf/<if>/shared_media
         net/ipv4/conf/<if>/rp_filter
         net/ipv4/conf/<if>/send_redirects
         net/ipv4/conf/<if>/accept_source_route
         net/ipv4/conf/<if>/accept_local
         net/ipv4/conf/<if>/src_valid_mark
         net/ipv4/conf/<if>/proxy_arp
         net/ipv4/conf/<if>/medium_id
         net/ipv4/conf/<if>/bootp_relay
         net/ipv4/conf/<if>/log_martians
         net/ipv4/conf/<if>/tag
         net/ipv4/conf/<if>/arp_filter
         net/ipv4/conf/<if>/arp_announce
         net/ipv4/conf/<if>/arp_ignore
         net/ipv4/conf/<if>/arp_accept
         net/ipv4/conf/<if>/arp_notify
         net/ipv4/conf/<if>/proxy_arp_pvlan
         net/ipv4/conf/<if>/disable_xfrm
         net/ipv4/conf/<if>/disable_policy
         net/ipv4/conf/<if>/force_igmp_version
         net/ipv4/conf/<if>/promote_secondaries
         net/ipv4/conf/<if>/route_localnet
      
      - Enable the global ipv4 sysctl:
         net/ipv4/ip_forward
      
      - Enable the per device ipv6 sysctls:
         net/ipv6/conf/<if>/forwarding
         net/ipv6/conf/<if>/hop_limit
         net/ipv6/conf/<if>/mtu
         net/ipv6/conf/<if>/accept_ra
         net/ipv6/conf/<if>/accept_redirects
         net/ipv6/conf/<if>/autoconf
         net/ipv6/conf/<if>/dad_transmits
         net/ipv6/conf/<if>/router_solicitations
         net/ipv6/conf/<if>/router_solicitation_interval
         net/ipv6/conf/<if>/router_solicitation_delay
         net/ipv6/conf/<if>/force_mld_version
         net/ipv6/conf/<if>/use_tempaddr
         net/ipv6/conf/<if>/temp_valid_lft
         net/ipv6/conf/<if>/temp_prefered_lft
         net/ipv6/conf/<if>/regen_max_retry
         net/ipv6/conf/<if>/max_desync_factor
         net/ipv6/conf/<if>/max_addresses
         net/ipv6/conf/<if>/accept_ra_defrtr
         net/ipv6/conf/<if>/accept_ra_pinfo
         net/ipv6/conf/<if>/accept_ra_rtr_pref
         net/ipv6/conf/<if>/router_probe_interval
         net/ipv6/conf/<if>/accept_ra_rt_info_max_plen
         net/ipv6/conf/<if>/proxy_ndp
         net/ipv6/conf/<if>/accept_source_route
         net/ipv6/conf/<if>/optimistic_dad
         net/ipv6/conf/<if>/mc_forwarding
         net/ipv6/conf/<if>/disable_ipv6
         net/ipv6/conf/<if>/accept_dad
         net/ipv6/conf/<if>/force_tllao
      
      - Enable the global ipv6 sysctls:
         net/ipv6/bindv6only
         net/ipv6/icmp/ratelimit
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c027aab4
    • E
      net: Allow userns root to control ipv6 · af31f412
      Eric W. Biederman 提交于
      Allow an unpriviled user who has created a user namespace, and then
      created a network namespace to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
      CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
      
      Settings that merely control a single network device are allowed.
      Either the network device is a logical network device where
      restrictions make no difference or the network device is hardware NIC
      that has been explicity moved from the initial network namespace.
      
      In general policy and network stack state changes are allowed while
      resource control is left unchanged.
      
      Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
      Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
      Allow the SIOCADDRT ioctl to add ipv6 routes.
      Allow the SIOCDELRT ioctl to delete ipv6 routes.
      
      Allow creation of ipv6 raw sockets.
      
      Allow setting the IPV6_JOIN_ANYCAST socket option.
      Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
      socket option.
      
      Allow setting the IPV6_TRANSPARENT socket option.
      Allow setting the IPV6_HOPOPTS socket option.
      Allow setting the IPV6_RTHDRDSTOPTS socket option.
      Allow setting the IPV6_DSTOPTS socket option.
      Allow setting the IPV6_IPSEC_POLICY socket option.
      Allow setting the IPV6_XFRM_POLICY socket option.
      
      Allow sending packets with the IPV6_2292HOPOPTS control message.
      Allow sending packets with the IPV6_2292DSTOPTS control message.
      Allow sending packets with the IPV6_RTHDRDSTOPTS control message.
      
      Allow setting the multicast routing socket options on non multicast
      routing sockets.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
      setting up, changing and deleting tunnels over ipv6.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
      setting up, changing and deleting ipv6 over ipv4 tunnels.
      
      Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
      deleting, and changing the potential router list for ISATAP tunnels.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af31f412
    • E
      net: Push capable(CAP_NET_ADMIN) into the rtnl methods · dfc47ef8
      Eric W. Biederman 提交于
      - In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check
        to ns_capable(net->user-ns, CAP_NET_ADMIN).  Allowing unprivileged
        users to make netlink calls to modify their local network
        namespace.
      
      - In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so
        that calls that are not safe for unprivileged users are still
        protected.
      
      Later patches will remove the extra capable calls from methods
      that are safe for unprivilged users.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfc47ef8
    • E
      net: Don't export sysctls to unprivileged users · 464dc801
      Eric W. Biederman 提交于
      In preparation for supporting the creation of network namespaces
      by unprivileged users, modify all of the per net sysctl exports
      and refuse to allow them to unprivileged users.
      
      This makes it safe for unprivileged users in general to access
      per net sysctls, and allows sysctls to be exported to unprivileged
      users on an individual basis as they are deemed safe.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      464dc801
  20. 18 11月, 2012 1 次提交
  21. 16 11月, 2012 1 次提交