1. 07 9月, 2017 4 次提交
  2. 06 9月, 2017 12 次提交
  3. 05 9月, 2017 8 次提交
  4. 04 9月, 2017 15 次提交
    • P
      netfilter: nf_tables: support for recursive chain deletion · 9dee1474
      Pablo Neira Ayuso 提交于
      This patch sorts out an asymmetry in deletions. Currently, table and set
      deletion commands come with an implicit content flush on deletion.
      However, chain deletion results in -EBUSY if there is content in this
      chain, so no implicit flush happens. So you have to send a flush command
      in first place to delete chains, this is inconsistent and it can be
      annoying in terms of user experience.
      
      This patch uses the new NLM_F_NONREC flag to request non-recursive chain
      deletion, ie. if the chain to be removed contains rules, then this
      returns EBUSY. This problem was discussed during the NFWS'17 in Faro,
      Portugal. In iptables, you hit -EBUSY if you try to delete a chain that
      contains rules, so you have to flush first before you can remove
      anything. Since iptables-compat uses the nf_tables netlink interface, it
      has to use the NLM_F_NONREC flag from userspace to retain the original
      iptables semantics, ie.  bail out on removing chains that contain rules.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      9dee1474
    • P
      netfilter: nf_tables: use NLM_F_NONREC for deletion requests · a8278400
      Pablo Neira Ayuso 提交于
      Bail out if user requests non-recursive deletion for tables and sets.
      This new flags tells nf_tables netlink interface to reject deletions if
      tables and sets have content.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      a8278400
    • P
      netfilter: nf_tables: add nf_tables_addchain() · 4035285f
      Pablo Neira Ayuso 提交于
      Wrap the chain addition path in a function to make it more maintainable.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      4035285f
    • P
      netfilter: nf_tables: add nf_tables_updchain() · 2c4a488a
      Pablo Neira Ayuso 提交于
      nf_tables_newchain() is too large, wrap the chain update path in a
      function to make it more maintainable.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      2c4a488a
    • V
      net: Remove CONFIG_NETFILTER_DEBUG and _ASSERT() macros. · 9efdb14f
      Varsha Rao 提交于
      This patch removes CONFIG_NETFILTER_DEBUG and _ASSERT() macros as they
      are no longer required. Replace _ASSERT() macros with WARN_ON().
      Signed-off-by: NVarsha Rao <rvarsha016@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      9efdb14f
    • V
      net: Replace NF_CT_ASSERT() with WARN_ON(). · 44d6e2f2
      Varsha Rao 提交于
      This patch removes NF_CT_ASSERT() and instead uses WARN_ON().
      Signed-off-by: NVarsha Rao <rvarsha016@gmail.com>
      44d6e2f2
    • F
      netfilter: remove unused hooknum arg from packet functions · d1c1e39d
      Florian Westphal 提交于
      tested with allmodconfig build.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      d1c1e39d
    • P
      netfilter: nft_limit: add stateful object type · a6912055
      Pablo M. Bermudo Garay 提交于
      Register a new limit stateful object type into the stateful object
      infrastructure.
      Signed-off-by: NPablo M. Bermudo Garay <pablombg@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      a6912055
    • P
      netfilter: nft_limit: replace pkt_bytes with bytes · 6e323887
      Pablo M. Bermudo Garay 提交于
      Just a small refactor patch in order to improve the code readability.
      Signed-off-by: NPablo M. Bermudo Garay <pablombg@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      6e323887
    • P
      netfilter: nf_tables: add select_ops for stateful objects · dfc46034
      Pablo M. Bermudo Garay 提交于
      This patch adds support for overloading stateful objects operations
      through the select_ops() callback, just as it is implemented for
      expressions.
      
      This change is needed for upcoming additions to the stateful objects
      infrastructure.
      Signed-off-by: NPablo M. Bermudo Garay <pablombg@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      dfc46034
    • V
      netfilter: xt_hashlimit: add rate match mode · bea74641
      Vishwanath Pai 提交于
      This patch adds a new feature to hashlimit that allows matching on the
      current packet/byte rate without rate limiting. This can be enabled
      with a new flag --hashlimit-rate-match. The match returns true if the
      current rate of packets is above/below the user specified value.
      
      The main difference between the existing algorithm and the new one is
      that the existing algorithm rate-limits the flow whereas the new
      algorithm does not. Instead it *classifies* the flow based on whether
      it is above or below a certain rate. I will demonstrate this with an
      example below. Let us assume this rule:
      
      iptables -A INPUT -m hashlimit --hashlimit-above 10/s -j new_chain
      
      If the packet rate is 15/s, the existing algorithm would ACCEPT 10
      packets every second and send 5 packets to "new_chain".
      
      But with the new algorithm, as long as the rate of 15/s is sustained,
      all packets will continue to match and every packet is sent to new_chain.
      
      This new functionality will let us classify different flows based on
      their current rate, so that further decisions can be made on them based on
      what the current rate is.
      
      This is how the new algorithm works:
      We divide time into intervals of 1 (sec/min/hour) as specified by
      the user. We keep track of the number of packets/bytes processed in the
      current interval. After each interval we reset the counter to 0.
      
      When we receive a packet for match, we look at the packet rate
      during the current interval and the previous interval to make a
      decision:
      
      if [ prev_rate < user and cur_rate < user ]
              return Below
      else
              return Above
      
      Where cur_rate is the number of packets/bytes seen in the current
      interval, prev is the number of packets/bytes seen in the previous
      interval and 'user' is the rate specified by the user.
      
      We also provide flexibility to the user for choosing the time
      interval using the option --hashilmit-interval. For example the user can
      keep a low rate like x/hour but still keep the interval as small as 1
      second.
      
      To preserve backwards compatibility we have to add this feature in a new
      revision, so I've created revision 3 for hashlimit. The two new options
      we add are:
      
      --hashlimit-rate-match
      --hashlimit-rate-interval
      
      I have updated the help text to add these new options. Also added a few
      tests for the new options.
      Suggested-by: NIgor Lubashev <ilubashe@akamai.com>
      Reviewed-by: NJosh Hunt <johunt@akamai.com>
      Signed-off-by: NVishwanath Pai <vpai@akamai.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      bea74641
    • G
      l2tp: pass tunnel pointer to ->session_create() · f026bc29
      Guillaume Nault 提交于
      Using l2tp_tunnel_find() in pppol2tp_session_create() and
      l2tp_eth_create() is racy, because no reference is held on the
      returned session. These functions are only used to implement the
      ->session_create callback which is run by l2tp_nl_cmd_session_create().
      Therefore searching for the parent tunnel isn't necessary because
      l2tp_nl_cmd_session_create() already has a pointer to it and holds a
      reference.
      
      This patch modifies ->session_create()'s prototype to directly pass the
      the parent tunnel as parameter, thus avoiding searching for it in
      pppol2tp_session_create() and l2tp_eth_create().
      
      Since we have to touch the ->session_create() call in
      l2tp_nl_cmd_session_create(), let's also remove the useless conditional:
      we know that ->session_create isn't NULL at this point because it's
      already been checked earlier in this same function.
      
      Finally, one might be tempted to think that the removed
      l2tp_tunnel_find() calls were harmless because they would return the
      same tunnel as the one held by l2tp_nl_cmd_session_create() anyway.
      But that tunnel might be removed and a new one created with same tunnel
      Id before the l2tp_tunnel_find() call. In this case l2tp_tunnel_find()
      would return the new tunnel which wouldn't be protected by the
      reference held by l2tp_nl_cmd_session_create().
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Fixes: d9e31d17 ("l2tp: Add L2TP ethernet pseudowire support")
      Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f026bc29
    • G
      l2tp: prevent creation of sessions on terminated tunnels · f3c66d4e
      Guillaume Nault 提交于
      l2tp_tunnel_destruct() sets tunnel->sock to NULL, then removes the
      tunnel from the pernet list and finally closes all its sessions.
      Therefore, it's possible to add a session to a tunnel that is still
      reachable, but for which tunnel->sock has already been reset. This can
      make l2tp_session_create() dereference a NULL pointer when calling
      sock_hold(tunnel->sock).
      
      This patch adds the .acpt_newsess field to struct l2tp_tunnel, which is
      used by l2tp_tunnel_closeall() to prevent addition of new sessions to
      tunnels. Resetting tunnel->sock is done after l2tp_tunnel_closeall()
      returned, so that l2tp_session_add_to_tunnel() can safely take a
      reference on it when .acpt_newsess is true.
      
      The .acpt_newsess field is modified in l2tp_tunnel_closeall(), rather
      than in l2tp_tunnel_destruct(), so that it benefits all tunnel removal
      mechanisms. E.g. on UDP tunnels, a session could be added to a tunnel
      after l2tp_udp_encap_destroy() proceeded. This would prevent the tunnel
      from being removed because of the references held by this new session
      on the tunnel and its socket. Even though the session could be removed
      manually later on, this defeats the purpose of
      commit 9980d001 ("l2tp: add udp encap socket destroy handler").
      
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f3c66d4e
    • J
      Revert "net: fix percpu memory leaks" · 5a63643e
      Jesper Dangaard Brouer 提交于
      This reverts commit 1d6119ba.
      
      After reverting commit 6d7b857d ("net: use lib/percpu_counter API
      for fragmentation mem accounting") then here is no need for this
      fix-up patch.  As percpu_counter is no longer used, it cannot
      memory leak it any-longer.
      
      Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting")
      Fixes: 1d6119ba ("net: fix percpu memory leaks")
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a63643e
    • J
      Revert "net: use lib/percpu_counter API for fragmentation mem accounting" · fb452a1a
      Jesper Dangaard Brouer 提交于
      This reverts commit 6d7b857d.
      
      There is a bug in fragmentation codes use of the percpu_counter API,
      that can cause issues on systems with many CPUs.
      
      The frag_mem_limit() just reads the global counter (fbc->count),
      without considering other CPUs can have upto batch size (130K) that
      haven't been subtracted yet.  Due to the 3MBytes lower thresh limit,
      this become dangerous at >=24 CPUs (3*1024*1024/130000=24).
      
      The correct API usage would be to use __percpu_counter_compare() which
      does the right thing, and takes into account the number of (online)
      CPUs and batch size, to account for this and call __percpu_counter_sum()
      when needed.
      
      We choose to revert the use of the lib/percpu_counter API for frag
      memory accounting for several reasons:
      
      1) On systems with CPUs > 24, the heavier fully locked
         __percpu_counter_sum() is always invoked, which will be more
         expensive than the atomic_t that is reverted to.
      
      Given systems with more than 24 CPUs are becoming common this doesn't
      seem like a good option.  To mitigate this, the batch size could be
      decreased and thresh be increased.
      
      2) The add_frag_mem_limit+sub_frag_mem_limit pairs happen on the RX
         CPU, before SKBs are pushed into sockets on remote CPUs.  Given
         NICs can only hash on L2 part of the IP-header, the NIC-RXq's will
         likely be limited.  Thus, a fair chance that atomic add+dec happen
         on the same CPU.
      
      Revert note that commit 1d6119ba ("net: fix percpu memory leaks")
      removed init_frag_mem_limit() and instead use inet_frags_init_net().
      After this revert, inet_frags_uninit_net() becomes empty.
      
      Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting")
      Fixes: 1d6119ba ("net: fix percpu memory leaks")
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fb452a1a
  5. 02 9月, 2017 1 次提交