1. 02 3月, 2016 1 次提交
  2. 17 2月, 2016 2 次提交
  3. 13 10月, 2015 1 次提交
  4. 08 10月, 2015 6 次提交
  5. 05 10月, 2015 1 次提交
  6. 30 9月, 2015 1 次提交
  7. 26 9月, 2015 2 次提交
  8. 31 8月, 2015 1 次提交
  9. 29 8月, 2015 1 次提交
  10. 30 7月, 2015 1 次提交
  11. 16 7月, 2015 1 次提交
  12. 05 6月, 2015 2 次提交
    • T
      net: Add full IPv6 addresses to flow_keys · c3f83241
      Tom Herbert 提交于
      This patch adds full IPv6 addresses into flow_keys and uses them as
      input to the flow hash function. The implementation supports either
      IPv4 or IPv6 addresses in a union, and selector is used to determine
      how may words to input to jhash2.
      
      We also add flow_get_u32_dst and flow_get_u32_src functions which are
      used to get a u32 representation of the source and destination
      addresses. For IPv6, ipv6_addr_hash is called. These functions retain
      getting the legacy values of src and dst in flow_keys.
      
      With this patch, Ethertype and IP protocol are now included in the
      flow hash input.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3f83241
    • T
      net: Get skb hash over flow_keys structure · 42aecaa9
      Tom Herbert 提交于
      This patch changes flow hashing to use jhash2 over the flow_keys
      structure instead just doing jhash_3words over src, dst, and ports.
      This method will allow us take more input into the hashing function
      so that we can include full IPv6 addresses, VLAN, flow labels etc.
      without needing to resort to xor'ing which makes for a poor hash.
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      42aecaa9
  13. 28 5月, 2015 1 次提交
    • F
      ip_fragment: don't forward defragmented DF packet · d6b915e2
      Florian Westphal 提交于
      We currently always send fragments without DF bit set.
      
      Thus, given following setup:
      
      mtu1500 - mtu1500:1400 - mtu1400:1280 - mtu1280
         A           R1              R2         B
      
      Where R1 and R2 run linux with netfilter defragmentation/conntrack
      enabled, then if Host A sent a fragmented packet _with_ DF set to B, R1
      will respond with icmp too big error if one of these fragments exceeded
      1400 bytes.
      
      However, if R1 receives fragment sizes 1200 and 100, it would
      forward the reassembled packet without refragmenting, i.e.
      R2 will send an icmp error in response to a packet that was never sent,
      citing mtu that the original sender never exceeded.
      
      The other minor issue is that a refragmentation on R1 will conceal the
      MTU of R2-B since refragmentation does not set DF bit on the fragments.
      
      This modifies ip_fragment so that we track largest fragment size seen
      both for DF and non-DF packets, and set frag_max_size to the largest
      value.
      
      If the DF fragment size is larger or equal to the non-df one, we will
      consider the packet a path mtu probe:
      We set DF bit on the reassembled skb and also tag it with a new IPCB flag
      to force refragmentation even if skb fits outdev mtu.
      
      We will also set DF bit on each fragment in this case.
      
      Joint work with Hannes Frederic Sowa.
      Reported-by: NJesse Gross <jesse@nicira.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6b915e2
  14. 20 5月, 2015 1 次提交
  15. 19 5月, 2015 2 次提交
  16. 14 5月, 2015 3 次提交
  17. 08 4月, 2015 1 次提交
    • D
      netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller 提交于
      On the output paths in particular, we have to sometimes deal with two
      socket contexts.  First, and usually skb->sk, is the local socket that
      generated the frame.
      
      And second, is potentially the socket used to control a tunneling
      socket, such as one the encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7026b1dd
  18. 07 4月, 2015 1 次提交
    • H
      ipv6: protect skb->sk accesses from recursive dereference inside the stack · f60e5990
      hannes@stressinduktion.org 提交于
      We should not consult skb->sk for output decisions in xmit recursion
      levels > 0 in the stack. Otherwise local socket settings could influence
      the result of e.g. tunnel encapsulation process.
      
      ipv6 does not conform with this in three places:
      
      1) ip6_fragment: we do consult ipv6_npinfo for frag_size
      
      2) sk_mc_loop in ipv6 uses skb->sk and checks if we should
         loop the packet back to the local socket
      
      3) ip6_skb_dst_mtu could query the settings from the user socket and
         force a wrong MTU
      
      Furthermore:
      In sk_mc_loop we could potentially land in WARN_ON(1) if we use a
      PF_PACKET socket ontop of an IPv6-backed vxlan device.
      
      Reuse xmit_recursion as we are currently only interested in protecting
      tunnel devices.
      
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f60e5990
  19. 26 3月, 2015 1 次提交
  20. 02 2月, 2015 1 次提交
    • E
      ipv4: tcp: get rid of ugly unicast_sock · bdbbb852
      Eric Dumazet 提交于
      In commit be9f4a44 ("ipv4: tcp: remove per net tcp_sock")
      I tried to address contention on a socket lock, but the solution
      I chose was horrible :
      
      commit 3a7c384f ("ipv4: tcp: unicast_sock should not land outside
      of TCP stack") addressed a selinux regression.
      
      commit 0980e56e ("ipv4: tcp: set unicast_sock uc_ttl to -1")
      took care of another regression.
      
      commit b5ec8eea ("ipv4: fix ip_send_skb()") fixed another regression.
      
      commit 811230cd ("tcp: ipv4: initialize unicast_sock sk_pacing_rate")
      was another shot in the dark.
      
      Really, just use a proper socket per cpu, and remove the skb_orphan()
      call, to re-enable flow control.
      
      This solves a serious problem with FQ packet scheduler when used in
      hostile environments, as we do not want to allocate a flow structure
      for every RST packet sent in response to a spoofed packet.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bdbbb852
  21. 27 1月, 2015 1 次提交
  22. 06 1月, 2015 1 次提交
  23. 29 9月, 2014 1 次提交
  24. 24 9月, 2014 1 次提交
    • E
      icmp: add a global rate limitation · 4cdf507d
      Eric Dumazet 提交于
      Current ICMP rate limiting uses inetpeer cache, which is an RBL tree
      protected by a lock, meaning that hosts can be stuck hard if all cpus
      want to check ICMP limits.
      
      When say a DNS or NTP server process is restarted, inetpeer tree grows
      quick and machine comes to its knees.
      
      iptables can not help because the bottleneck happens before ICMP
      messages are even cooked and sent.
      
      This patch adds a new global limitation, using a token bucket filter,
      controlled by two new sysctl :
      
      icmp_msgs_per_sec - INTEGER
          Limit maximal number of ICMP packets sent per second from this host.
          Only messages whose type matches icmp_ratemask are
          controlled by this limit.
          Default: 1000
      
      icmp_msgs_burst - INTEGER
          icmp_msgs_per_sec controls number of ICMP packets sent per second,
          while icmp_msgs_burst controls the burst size of these packets.
          Default: 50
      
      Note that if we really want to send millions of ICMP messages per
      second, we might extend idea and infra added in commit 04ca6973
      ("ip: make IP identifiers less predictable") :
      add a token bucket in the ip_idents hash and no longer rely on inetpeer.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4cdf507d
  25. 10 9月, 2014 1 次提交
  26. 25 8月, 2014 1 次提交
  27. 30 7月, 2014 1 次提交
  28. 29 7月, 2014 1 次提交
    • E
      ip: make IP identifiers less predictable · 04ca6973
      Eric Dumazet 提交于
      In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
      Jedidiah describe ways exploiting linux IP identifier generation to
      infer whether two machines are exchanging packets.
      
      With commit 73f156a6 ("inetpeer: get rid of ip_id_count"), we
      changed IP id generation, but this does not really prevent this
      side-channel technique.
      
      This patch adds a random amount of perturbation so that IP identifiers
      for a given destination [1] are no longer monotonically increasing after
      an idle period.
      
      Note that prandom_u32_max(1) returns 0, so if generator is used at most
      once per jiffy, this patch inserts no hole in the ID suite and do not
      increase collision probability.
      
      This is jiffies based, so in the worst case (HZ=1000), the id can
      rollover after ~65 seconds of idle time, which should be fine.
      
      We also change the hash used in __ip_select_ident() to not only hash
      on daddr, but also saddr and protocol, so that ICMP probes can not be
      used to infer information for other protocols.
      
      For IPv6, adds saddr into the hash as well, but not nexthdr.
      
      If I ping the patched target, we can see ID are now hard to predict.
      
      21:57:11.008086 IP (...)
          A > target: ICMP echo request, seq 1, length 64
      21:57:11.010752 IP (... id 2081 ...)
          target > A: ICMP echo reply, seq 1, length 64
      
      21:57:12.013133 IP (...)
          A > target: ICMP echo request, seq 2, length 64
      21:57:12.015737 IP (... id 3039 ...)
          target > A: ICMP echo reply, seq 2, length 64
      
      21:57:13.016580 IP (...)
          A > target: ICMP echo request, seq 3, length 64
      21:57:13.019251 IP (... id 3437 ...)
          target > A: ICMP echo reply, seq 3, length 64
      
      [1] TCP sessions uses a per flow ID generator not changed by this patch.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NJeffrey Knockel <jeffk@cs.unm.edu>
      Reported-by: NJedidiah R. Crandall <crandall@cs.unm.edu>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Hannes Frederic Sowa <hannes@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04ca6973
  29. 28 7月, 2014 1 次提交