1. 27 Jun, 2014 1 commit
    • netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c files · 83e96d44
      Committed by Pablo Neira Ayuso
      The plain text logging is currently embedded into the xt_LOG target.
      In order to be able to use the plain text logging from nft_log, as a
      first step, this patch moves the family specific code to the following
      files and Kconfig symbols:
      
      1) net/ipv4/netfilter/nf_log_ip.c: CONFIG_NF_LOG_IPV4
      2) net/ipv6/netfilter/nf_log_ip6.c: CONFIG_NF_LOG_IPV6
      3) net/netfilter/nf_log_common.c: CONFIG_NF_LOG_COMMON
      
      These new modules will be required by xt_LOG and nft_log. This patch
      is based on the original patch from Arturo Borrero Gonzalez.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  2. 26 Jun, 2014 3 commits
    • netfilter: nf_log: move log buffering to core logging · 27fd8d90
      Committed by Pablo Neira Ayuso
      This patch moves Eric Dumazet's log buffer implementation from the
      xt_log.h header file to the core net/netfilter/nf_log.c. This also
      includes the renaming of the structure and functions to avoid possible
      undesired namespace clashes.
      
      This change allows us to use it from the arp and bridge packet logging
      implementation in follow up patches.
    • netfilter: nf_log: use an array of loggers instead of list · 5962815a
      Committed by Pablo Neira Ayuso
      Now that legacy ulog targets are not available anymore in the tree, we
      can have up to two possible loggers:
      
      1) The plain text logging via kernel logging ring.
      2) The nfnetlink_log infrastructure which delivers log messages
         to userspace.
      
      This patch replaces the list of loggers by an array of two pointers
      per family for each possible logger and it also introduces a new field
      to the nf_logger structure which indicates the position in the logger
      array (based on the logger type).
      
      This prepares a follow-up patch that consolidates the nf_log_packet()
      interface by allowing the logger to be specified as a parameter.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
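The "array of two pointers per family" described above can be pictured with a small userspace sketch. Every name here (`logger_register`, `logger_find`, `LOG_TYPE_*`, `FAMILY_MAX`) is hypothetical and chosen for illustration; it is not the kernel's nf_log code, only the indexing idea, where the logger's type field doubles as its slot in the array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the kernel's actual code. */
enum log_type { LOG_TYPE_PLAIN = 0, LOG_TYPE_NETLINK, LOG_TYPE_MAX };
enum { FAMILY_MAX = 13 };   /* illustrative bound on protocol families */

struct logger {
    const char *name;
    enum log_type type;     /* position of this logger in the per-family array */
};

/* One slot per (family, logger type) instead of a linked list of loggers. */
static const struct logger *loggers[FAMILY_MAX][LOG_TYPE_MAX];

/* Register a logger; returns 0 on success, -1 if the slot is already taken. */
int logger_register(int family, const struct logger *l)
{
    assert(family >= 0 && family < FAMILY_MAX);
    if (loggers[family][l->type] != NULL)
        return -1;
    loggers[family][l->type] = l;
    return 0;
}

/* Constant-time lookup by family and type; NULL if nothing is registered. */
const struct logger *logger_find(int family, enum log_type type)
{
    assert(family >= 0 && family < FAMILY_MAX);
    return loggers[family][type];
}
```

With only two possible logger types, a fixed array makes lookup a direct index and lets a caller name the logger it wants explicitly, which is what the follow-up patch mentioned above relies on.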
    • netfilter: conntrack: remove timer from ecache extension · 9500507c
      Committed by Florian Westphal
      This brings the (per-conntrack) ecache extension back to 24 bytes in size
      (was 152 bytes on x86_64 with lockdep on).
      
      When event delivery fails, re-delivery is attempted via work queue.
      
      Redelivery is attempted at least every 0.1 seconds, but can happen
      more frequently if userspace is not congested.
      
      The nf_ct_release_dying_list() function is removed.
      With this patch, ownership of the to-be-redelivered conntracks
      (on-dying-list-with-DYING-bit not yet set) is with the work queue,
      which will release the references once event is out.
      
      Joint work with Pablo Neira Ayuso.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 25 Jun, 2014 2 commits
  4. 08 Jun, 2014 1 commit
    • net: netfilter: LLVMLinux: vlais-netfilter · 066c6807
      Committed by Mark Charlebois
      Replaced non-standard C use of Variable Length Arrays In Structs (VLAIS) in
      xt_repldata.h with a C99 compliant flexible array member and then calculated
      offsets to the other struct members. These other members aren't referenced by
      name in this code, however this patch maintains the same memory layout and
      padding as was previously accomplished using VLAIS.
      
      Had the original structure been ordered differently, with the entries VLA at
      the end, then it could have been a flexible member, and this patch would have
      been a lot simpler. However since the data stored in this structure is
      ultimately exported to userspace, the order of this structure can't be changed.
      
      This patch makes no attempt to change the existing behavior, merely the way in
      which the current layout is accomplished using standard C99 constructs. As such
      the code can now be compiled with either gcc or clang.
      
      This version of the patch removes the trailing alignment that the VLAIS
      structure would allocate in order to simplify the patch.
      
      Author: Mark Charlebois <charlebm@gmail.com>
      Signed-off-by: Mark Charlebois <charlebm@gmail.com>
      Signed-off-by: Behan Webster <behanw@converseincode.com>
      Signed-off-by: Vinícius Tinti <viniciustinti@gmail.com>
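As a rough illustration of the technique described above, the sketch below replaces a variable-length array in the middle of a struct with a flexible array member and reaches the trailing member through a calculated offset. The types `hdr`, `entry`, `term` and the helpers are hypothetical stand-ins, not the contents of xt_repldata.h, and `_Alignof` is a C11 addition used here for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real header, entry and terminator types. */
struct hdr   { unsigned int num_entries; };
struct entry { unsigned long long counters; int verdict; };
struct term  { int target_size; char errorname[32]; };

/* The variable-length part is a flexible array member, so the terminator
 * that follows it in memory can no longer be a named member. */
struct repl {
    struct hdr   hdr;
    struct entry entries[];
};

/* Round x up to a multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Byte offset of the terminator placed after n entries. */
size_t term_offset(size_t n)
{
    size_t end = offsetof(struct repl, entries) + n * sizeof(struct entry);
    return align_up(end, _Alignof(struct term));
}

/* Allocate header + n entries + terminator as one zeroed block. */
struct repl *repl_alloc(size_t n)
{
    struct repl *r = calloc(1, term_offset(n) + sizeof(struct term));
    if (r)
        r->hdr.num_entries = (unsigned int)n;
    return r;
}

/* Locate the terminator by offset rather than by member name. */
struct term *repl_term(struct repl *r)
{
    return (struct term *)((char *)r + term_offset(r->hdr.num_entries));
}
```

The layout stays header, entries, terminator, as the userspace-visible format requires; only the way the compiler is told about it changes.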
  5. 05 Jun, 2014 1 commit
  6. 03 Jun, 2014 1 commit
    • inetpeer: get rid of ip_id_count · 73f156a6
      Committed by Eric Dumazet
      Ideally, we would need to generate IP ID using a per destination IP
      generator.
      
      Linux kernels used the inet_peer cache for this purpose, but this had a huge
      cost on servers disabling MTU discovery.
      
      1) each inet_peer struct consumes 192 bytes
      
      2) inetpeer cache uses a binary tree of inet_peer structs,
         with a nominal size of ~66000 elements under load.
      
      3) lookups in this tree are hitting a lot of cache lines, as tree depth
         is about 20.
      
      4) If server deals with many tcp flows, we have a high probability of
         not finding the inet_peer, allocating a fresh one, inserting it in
         the tree with same initial ip_id_count, (cf secure_ip_id())
      
      5) We garbage collect inet_peer aggressively.
      
      IP ID generation does not have to be 'perfect'.
      
      Goal is trying to avoid duplicates in a short period of time,
      so that reassembly units have a chance to complete reassembly of
      fragments belonging to one message before receiving other fragments
      with a recycled ID.
      
      We simply use an array of generators, and a Jenkins hash using the dst IP
      as a key.
      
      ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
      belongs (it is only used from this file)
      
      secure_ip_id() and secure_ipv6_id() are no longer needed.
      
      Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
      unnecessary decrement/increment of the number of segments.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
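The "array of generators indexed by a Jenkins hash of the destination IP" idea might look roughly like the following userspace sketch. The table size, the seed parameter, the function names and the choice of the one-at-a-time Jenkins variant are all assumptions made for illustration; the kernel's own hash helper and its atomic updates are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define IP_IDENTS_SZ 2048u   /* assumed table size, must be a power of two */

static uint16_t ip_idents[IP_IDENTS_SZ];   /* one ID generator per bucket */

/* Jenkins one-at-a-time hash over the 4 bytes of the address, used here as
 * a stand-in for whichever Jenkins variant the kernel applies. */
static uint32_t jenkins_hash32(uint32_t key, uint32_t seed)
{
    uint32_t h = seed;
    for (int i = 0; i < 4; i++) {
        h += (key >> (8 * i)) & 0xffu;
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Reserve `segs` consecutive IDs for destination `daddr` and return the
 * first one. Not thread-safe: a real implementation would update the
 * generator atomically. */
uint16_t ip_ident_reserve(uint32_t daddr, uint32_t seed, unsigned int segs)
{
    assert(segs > 0);
    uint16_t *gen = &ip_idents[jenkins_hash32(daddr, seed) & (IP_IDENTS_SZ - 1)];
    uint16_t first = *gen;
    *gen = (uint16_t)(first + segs);
    return first;
}
```

Destinations that hash to the same bucket share a generator, which is acceptable under the stated goal: IDs only need to avoid repeating within a short window, not be unique per peer.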
  7. 02 Jun, 2014 6 commits
    • netfilter: nf_tables: atomic allocation in set notifications from rcu callback · 31f8441c
      Committed by Pablo Neira Ayuso
      Use GFP_ATOMIC allocations when sending removal notifications of
      anonymous sets from rcu callback context. Sleeping in that context
      is illegal.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: allow to delete several objects from a batch · 4fefee57
      Committed by Pablo Neira Ayuso
      Three changes to allow the deletion of several objects with dependencies
      in one transaction:
      
      1) Introduce speculative counter increment/decrement that is undone in
         the abort path if required, thus we avoid hitting -EBUSY when deleting
         the chain. The counter updates are reverted in the abort path.
      
      2) Increment/decrement table/chain use counter for each set/rule. We need
         this to fully rely on the use counters instead of the list content,
         e.g. !list_empty(&chain->rules), which evaluates to true in the middle
         of the transaction.
      
      3) Decrement table use counter when an anonymous set is bound to the
         rule in the commit path. This avoids hitting -EBUSY when deleting
         the table that contains anonymous sets. The anonymous sets are released
         in the nf_tables_rule_destroy path. This should not be a problem since
         the rule already bumped the use counter of the chain, so the bound
         anonymous set reflects dependencies through the rule object, which
         already increases the chain use counter.
      
      So the general assumption after this patch is that the use counters are
      bumped by direct object dependencies.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
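Point 1 above, a speculative counter update that the abort path reverts, can be sketched in a few lines of userspace C. The `chain` and `batch` types and all function names are hypothetical; they only model the bookkeeping, not the nf_tables transaction API.

```c
#include <assert.h>
#include <stddef.h>

struct chain { int use; };   /* hypothetical: number of rules depending on it */

#define MAX_TRANS 16

/* Chains whose use counter was speculatively decremented by this batch. */
struct batch {
    struct chain *touched[MAX_TRANS];
    size_t n;
};

/* Deleting a rule drops the chain's use counter right away, before commit. */
int batch_del_rule(struct batch *b, struct chain *c)
{
    assert(b != NULL && c != NULL);
    if (b->n == MAX_TRANS || c->use == 0)
        return -1;
    c->use--;
    b->touched[b->n++] = c;
    return 0;
}

/* A chain can be deleted later in the same batch once nothing uses it. */
int batch_del_chain(const struct chain *c)
{
    return c->use > 0 ? -1 /* busy */ : 0;
}

/* Abort: revert every speculative decrement, newest first. */
void batch_abort(struct batch *b)
{
    while (b->n > 0)
        b->touched[--b->n]->use++;
}
```

Because the counter already reflects the pending rule deletions, a chain deletion queued in the same batch is not rejected as busy, and an abort restores the original counts.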
    • netfilter: nft_rbtree: introduce locking · 7632667d
      Committed by Pablo Neira Ayuso
      There's no rbtree rcu version yet, so let's fall back on the spinlock
      to protect the concurrent access of this structure both from user
      (to update the set content) and kernel-space (in the packet path).
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: release objects in reverse order in the abort path · a1cee076
      Committed by Pablo Neira Ayuso
      The patch c7c32e72 ("netfilter: nf_tables: defer all object release via
      rcu") indicates that we always release deleted objects in the reverse
      order, but that is only needed in the abort path. These are the two
      possible scenarios when releasing objects:
      
      1) Deletion scenario in the commit path: no need to release objects in
      the reverse order since userspace already ensures that dependencies are
      fulfilled, i.e. userspace tells us to delete rule -> ... -> rule ->
      chain -> table. In this case, we have to release the objects in the
      *same order* as userspace provided.
      
      2) Deletion scenario in the abort path: we have to iterate in the reverse
      order to undo what could not be added, i.e. userspace sent us a batch
      that includes: table -> chain -> rule -> ... -> rule, and that needs to
      be partially undone. In this case, we have to release objects in the
      reverse order to ensure that the set and chain objects point to valid
      rule and table objects.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
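The two release orders described above can be modelled with a toy example: the commit path walks the batch in the order userspace supplied, while the abort path walks it backwards. The `obj_kind` enum and both functions are hypothetical and only record the order in which objects would be released.

```c
#include <assert.h>
#include <stddef.h>

enum obj_kind { OBJ_TABLE, OBJ_CHAIN, OBJ_RULE };

/* Commit path: release in the order userspace supplied, for example
 * rule -> chain -> table. Writes the release order to out[]. */
size_t release_commit(const enum obj_kind *batch, size_t n, enum obj_kind *out)
{
    assert(batch != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = batch[i];
    return n;
}

/* Abort path: undo additions newest first, so a batch that added
 * table -> chain -> rule is released as rule -> chain -> table. */
size_t release_abort(const enum obj_kind *batch, size_t n, enum obj_kind *out)
{
    assert(batch != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = batch[n - 1 - i];
    return n;
}
```

In both cases dependents end up released before the objects they point to; the difference is only in which direction the recorded batch has to be traversed to get there.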
    • netfilter: nf_tables: fix wrong transaction ordering in set elements · 46bbafce
      Committed by Pablo Neira Ayuso
      The transaction needs to be placed at the end of the commit list,
      otherwise event notifications are reordered and we may crash when
      releasing object via call_rcu.
      
      This problem was introduced in 60319eb1 ("netfilter: nf_tables: use new
      transaction infrastructure to handle elements").
      Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nfnetlink_acct: Fix memory leak · 4c552a64
      Committed by Mathieu Poirier
      Memory needs to be allocated only once, after the proper
      checks on the NFACCT_FLAGS have been done. Otherwise the
      code can return without freeing already allocated memory.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
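The general pattern behind this fix, validating input before allocating so that an early return has nothing to free, can be sketched as follows. The flag names, the `acct` struct and `acct_new()` are hypothetical and do not mirror the real nfnetlink_acct definitions.

```c
#include <assert.h>
#include <stdlib.h>

#define ACCT_F_QUOTA_PKTS  0x1u   /* hypothetical flag bits */
#define ACCT_F_QUOTA_BYTES 0x2u
#define ACCT_F_QUOTA (ACCT_F_QUOTA_PKTS | ACCT_F_QUOTA_BYTES)

struct acct {
    unsigned int flags;
    unsigned long long quota;
};

/* Returns a new object, or NULL if the flags are invalid or allocation fails. */
struct acct *acct_new(unsigned int flags, unsigned long long quota)
{
    /* Validate first: rejecting bad input before allocating means the
     * early returns below cannot leak anything. */
    if ((flags & ACCT_F_QUOTA) == ACCT_F_QUOTA)
        return NULL;               /* both quota modes at once is invalid */
    if (flags & ~ACCT_F_QUOTA)
        return NULL;               /* unknown bits */

    struct acct *a = calloc(1, sizeof(*a));
    if (a == NULL)
        return NULL;
    a->flags = flags;
    a->quota = quota;
    return a;
}
```

Had the allocation come first, each of the two validation returns would have needed its own `free()`, which is the kind of omission the commit describes.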
  8. 26 May, 2014 1 commit
  9. 24 May, 2014 1 commit
    • net: filter: let unattached filters use sock_fprog_kern · b1fcd35c
      Committed by Daniel Borkmann
      The sk_unattached_filter_create() API is used by BPF filters that
      are not directly attached or related to sockets, and are used in
      team, ptp, xt_bpf, cls_bpf, etc. As such all users do their own
      internal management of obtaining filter blocks and thus already
      have them in kernel memory and set up before calling into
      sk_unattached_filter_create(). As a result, due to __user annotation
      in sock_fprog, sparse triggers false positives (incorrect type in
      assignment [different address space]) when filters are set up before
      passing them to sk_unattached_filter_create(). Therefore, let
      sk_unattached_filter_create() API use sock_fprog_kern to overcome
      this issue.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
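To make the address-space point concrete, here is a simplified userspace rendering of the two program descriptors. The struct shapes follow the kernel's `sock_fprog` and `sock_fprog_kern`, but `__user` is defined away (as it is in a normal, non-sparse build) and `unattached_filter_check()` is a hypothetical stand-in, not `sk_unattached_filter_create()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sparse address-space annotation; expands to nothing in a normal build */
#define __user

struct sock_filter {
    uint16_t code;
    uint8_t  jt, jf;
    uint32_t k;
};

/* Program as supplied from userspace: the pointer is tagged __user, so
 * sparse complains when a kernel pointer is assigned to it. */
struct sock_fprog {
    unsigned short len;
    struct sock_filter __user *filter;
};

/* Kernel-internal variant: same shape, plain kernel pointer. */
struct sock_fprog_kern {
    uint16_t len;
    struct sock_filter *filter;
};

/* Hypothetical consumer taking the kernel-side descriptor; returns the
 * instruction count, or -1 if the descriptor is empty. */
int unattached_filter_check(const struct sock_fprog_kern *fprog)
{
    if (fprog == NULL || fprog->filter == NULL || fprog->len == 0)
        return -1;
    return fprog->len;
}
```

A caller that builds its instructions in kernel memory fills in the `_kern` variant directly, so no cast across address spaces is needed and the false positive goes away.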
  10. 19 May, 2014 14 commits
  11. 16 May, 2014 1 commit
    • netfilter: nf_tables: fix trace of matching non-terminal rule · 3b084e99
      Committed by Pablo Neira Ayuso
      Add the corresponding trace if we have a full match in a non-terminal
      rule. Note that the traces will look slightly different than in
      x_tables since the log message is emitted after all expressions have
      been evaluated (contrary to x_tables, which emits it before the target
      action). This manifests in two differences in nf_tables wrt. x_tables:
      
      1) The rule that enables the tracing is included in the trace.
      
      2) If the rule emits some log message, that is shown before the
         trace log message.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  12. 13 May, 2014 1 commit
  13. 12 May, 2014 4 commits
  14. 11 May, 2014 1 commit
  15. 04 May, 2014 1 commit
    • netfilter: nfnetlink: Fix use after free when it fails to process batch · ecd15dd7
      Committed by Denys Fedoryshchenko
      This bug manifests when calling the nft command line tool without
      nf_tables kernel support.
      
      kernel message:
      [   44.071555] Netfilter messages via NETLINK v0.30.
      [   44.072253] BUG: unable to handle kernel NULL pointer dereference at 0000000000000119
      [   44.072264] IP: [<ffffffff8171db1f>] netlink_getsockbyportid+0xf/0x70
      [   44.072272] PGD 7f2b74067 PUD 7f2b73067 PMD 0
      [   44.072277] Oops: 0000 [#1] SMP
      [...]
      [   44.072369] Call Trace:
      [   44.072373]  [<ffffffff8171fd81>] netlink_unicast+0x91/0x200
      [   44.072377]  [<ffffffff817206c9>] netlink_ack+0x99/0x110
      [   44.072381]  [<ffffffffa004b951>] nfnetlink_rcv+0x3c1/0x408 [nfnetlink]
      [   44.072385]  [<ffffffff8171fde3>] netlink_unicast+0xf3/0x200
      [   44.072389]  [<ffffffff817201ef>] netlink_sendmsg+0x2ff/0x740
      [   44.072394]  [<ffffffff81044752>] ? __mmdrop+0x62/0x90
      [   44.072398]  [<ffffffff816dafdb>] sock_sendmsg+0x8b/0xc0
      [   44.072403]  [<ffffffff812f1af5>] ? copy_user_enhanced_fast_string+0x5/0x10
      [   44.072406]  [<ffffffff816dbb6c>] ? move_addr_to_kernel+0x2c/0x50
      [   44.072410]  [<ffffffff816db423>] ___sys_sendmsg+0x3c3/0x3d0
      [   44.072415]  [<ffffffff811301ba>] ? handle_mm_fault+0xa9a/0xc60
      [   44.072420]  [<ffffffff811362d6>] ? mmap_region+0x166/0x5a0
      [   44.072424]  [<ffffffff817da84c>] ? __do_page_fault+0x1dc/0x510
      [   44.072428]  [<ffffffff812b8b2c>] ? apparmor_capable+0x1c/0x60
      [   44.072435]  [<ffffffff817d6e9a>] ? _raw_spin_unlock_bh+0x1a/0x20
      [   44.072439]  [<ffffffff816dfc86>] ? release_sock+0x106/0x150
      [   44.072443]  [<ffffffff816dc212>] __sys_sendmsg+0x42/0x80
      [   44.072446]  [<ffffffff816dc262>] SyS_sendmsg+0x12/0x20
      [   44.072450]  [<ffffffff817df616>] system_call_fastpath+0x1a/0x1f
      Signed-off-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  16. 30 Apr, 2014 1 commit