1. 01 4月, 2018 9 次提交
    • E
      ipv6: frags: get rid of ip6frag_skb_cb/FRAG6_CB · 219badfa
      Eric Dumazet 提交于
      ip6_frag_queue uses skb->cb[] to store the fragment offset, meaning that
      we could use two cache lines per skb when finding the insertion point,
      if for some reason inet6_skb_parm size is increased in the future.
      
      By using skb->ip_defrag_offset instead of skb->cb[], we pack all
      the fields in a single cache line, matching what we did for IPv4.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      219badfa
    • E
      ipv6: frags: rewrite ip6_expire_frag_queue() · 05c0b86b
      Eric Dumazet 提交于
      Make it similar to IPv4 ip_expire(), and release the lock
      before calling icmp functions.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05c0b86b
    • E
      inet: frags: break the 2GB limit for frags storage · 3e67f106
      Eric Dumazet 提交于
      Some users are willing to provision huge amounts of memory to be able
      to perform reassembly reasonnably well under pressure.
      
      Current memory tracking is using one atomic_t and integers.
      
      Switch to atomic_long_t so that 64bit arches can use more than 2GB,
      without any cost for 32bit arches.
      
      Note that this patch avoids an overflow error, if high_thresh was set
      to ~2GB, since this test in inet_frag_alloc() was never true :
      
      if (... || frag_mem_limit(nf) > nf->high_thresh)
      
      Tested:
      
      $ echo 16000000000 >/proc/sys/net/ipv4/ipfrag_high_thresh
      
      <frag DDOS>
      
      $ grep FRAG /proc/net/sockstat
      FRAG: inuse 14705885 memory 16000002880
      
      $ nstat -n ; sleep 1 ; nstat | grep Reas
      IpReasmReqds                    3317150            0.0
      IpReasmFails                    3317112            0.0
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e67f106
    • E
      inet: frags: remove inet_frag_maybe_warn_overflow() · 2d44ed22
      Eric Dumazet 提交于
      This function is obsolete, after rhashtable addition to inet defrag.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d44ed22
    • E
      inet: frags: get rif of inet_frag_evicting() · 399d1404
      Eric Dumazet 提交于
      This refactors ip_expire() since one indentation level is removed.
      
      Note: in the future, we should try hard to avoid the skb_clone()
      since this is a serious performance cost.
      Under DDOS, the ICMP message wont be sent because of rate limits.
      
      Fact that ip6_expire_frag_queue() does not use skb_clone() is
      disturbing too. Presumably IPv6 should have the same
      issue than the one we fixed in commit ec4fbd64
      ("inet: frag: release spinlock before calling icmp_send()")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      399d1404
    • E
      inet: frags: use rhashtables for reassembly units · 648700f7
      Eric Dumazet 提交于
      Some applications still rely on IP fragmentation, and to be fair linux
      reassembly unit is not working under any serious load.
      
      It uses static hash tables of 1024 buckets, and up to 128 items per bucket (!!!)
      
      A work queue is supposed to garbage collect items when host is under memory
      pressure, and doing a hash rebuild, changing seed used in hash computations.
      
      This work queue blocks softirqs for up to 25 ms when doing a hash rebuild,
      occurring every 5 seconds if host is under fire.
      
      Then there is the problem of sharing this hash table for all netns.
      
      It is time to switch to rhashtables, and allocate one of them per netns
      to speedup netns dismantle, since this is a critical metric these days.
      
      Lookup is now using RCU. A followup patch will even remove
      the refcount hold/release left from prior implementation and save
      a couple of atomic operations.
      
      Before this patch, 16 cpus (16 RX queue NIC) could not handle more
      than 1 Mpps frags DDOS.
      
      After the patch, I reach 9 Mpps without any tuning, and can use up to 2GB
      of storage for the fragments (exact number depends on frags being evicted
      after timeout)
      
      $ grep FRAG /proc/net/sockstat
      FRAG: inuse 1966916 memory 2140004608
      
      A followup patch will change the limits for 64bit arches.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Alexander Aring <alex.aring@gmail.com>
      Cc: Stefan Schmidt <stefan@osg.samsung.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      648700f7
    • E
      inet: frags: refactor ipv6_frag_init() · 5b975bab
      Eric Dumazet 提交于
      We want to call inet_frags_init() earlier.
      
      This is a prereq to "inet: frags: use rhashtables for reassembly units"
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b975bab
    • E
      inet: frags: add a pointer to struct netns_frags · 093ba729
      Eric Dumazet 提交于
      In order to simplify the API, add a pointer to struct inet_frags.
      This will allow us to make things less complex.
      
      These functions no longer have a struct inet_frags parameter :
      
      inet_frag_destroy(struct inet_frag_queue *q  /*, struct inet_frags *f */)
      inet_frag_put(struct inet_frag_queue *q /*, struct inet_frags *f */)
      inet_frag_kill(struct inet_frag_queue *q /*, struct inet_frags *f */)
      inet_frags_exit_net(struct netns_frags *nf /*, struct inet_frags *f */)
      ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      093ba729
    • E
      inet: frags: change inet_frags_init_net() return value · 787bea77
      Eric Dumazet 提交于
      We will soon initialize one rhashtable per struct netns_frags
      in inet_frags_init_net().
      
      This patch changes the return value to eventually propagate an
      error.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      787bea77
  2. 30 3月, 2018 1 次提交
  3. 28 3月, 2018 1 次提交
  4. 20 2月, 2018 1 次提交
    • K
      net: Convert ip6_frags_ops · 5fc094f5
      Kirill Tkhai 提交于
      Exit methods calls inet_frags_exit_net() with global ip6_frags
      as argument. So, after we make the pernet_operations async,
      a pair of exit methods may be called to iterate this hash table.
      Since there is inet_frag_worker(), which already may work
      in parallel with inet_frags_exit_net(), and it can make the same
      cleanup, that inet_frags_exit_net() does, it's safe. So we may
      mark these pernet_operations as async.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5fc094f5
  5. 18 10月, 2017 1 次提交
    • K
      inet: frags: Convert timers to use timer_setup() · 78802011
      Kees Cook 提交于
      In preparation for unconditionally passing the struct timer_list pointer to
      all timer callbacks, switch to using the new timer_setup() and from_timer()
      to pass the timer pointer explicitly.
      
      Cc: Alexander Aring <alex.aring@gmail.com>
      Cc: Stefan Schmidt <stefan@osg.samsung.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: linux-wpan@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: netfilter-devel@vger.kernel.org
      Cc: coreteam@netfilter.org
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Acked-by: Stefan Schmidt <stefan@osg.samsung.com> # for ieee802154
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      78802011
  6. 04 9月, 2017 1 次提交
  7. 04 11月, 2016 1 次提交
  8. 24 10月, 2016 1 次提交
  9. 28 4月, 2016 1 次提交
  10. 20 2月, 2016 1 次提交
  11. 06 1月, 2016 1 次提交
  12. 25 11月, 2015 1 次提交
  13. 03 11月, 2015 1 次提交
  14. 27 7月, 2015 2 次提交
  15. 01 4月, 2015 2 次提交
  16. 24 11月, 2014 1 次提交
  17. 31 10月, 2014 1 次提交
  18. 25 8月, 2014 2 次提交
  19. 03 8月, 2014 4 次提交
  20. 28 7月, 2014 7 次提交