1. 09 Jan, 2018 10 commits
    • netfilter: move reroute indirection to struct nf_ipv6_ops · ce388f45
      Committed by Pablo Neira Ayuso
      We cannot call nf_ip6_reroute() directly, because the symbol dependency
      would cause the 'ipv6' module to be autoloaded. Therefore, define the
      reroute indirection in nf_ipv6_ops, where it really belongs.

      For IPv4, we can make a direct function call, which is faster, given
      that IPv4 is built into the networking code by default. Still,
      CONFIG_INET=n with CONFIG_NETFILTER=y is possible, so define an empty
      inline stub for IPv4 in that case.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
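The trade-off described above (an ops table for code that may live in a module, a direct call with an inline stub for code that is normally built in) can be sketched in plain userspace C. This is only an illustration: `struct pkt`, `struct v6_ops_sketch`, `SKETCH_CONFIG_INET` and every function name below are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet; not the kernel's struct sk_buff. */
struct pkt {
    int family;    /* 4 or 6 */
    int rerouted;  /* records which implementation handled the packet */
};

/* Hypothetical ops table, loosely modelled on the idea behind nf_ipv6_ops. */
struct v6_ops_sketch {
    int (*reroute)(struct pkt *p);
};

/* Stays NULL until the (possibly modular) IPv6 code registers itself, so the
 * core never references an IPv6 symbol directly. */
static const struct v6_ops_sketch *v6_ops;

static int v6_reroute_impl(struct pkt *p) { p->rerouted = 6; return 0; }
static const struct v6_ops_sketch v6_ops_impl = { .reroute = v6_reroute_impl };

static void v6_register(void)   { v6_ops = &v6_ops_impl; }
static void v6_unregister(void) { v6_ops = NULL; }

/* Hypothetical switch standing in for CONFIG_INET. */
#define SKETCH_CONFIG_INET 1

#if SKETCH_CONFIG_INET
/* Built in: a direct call is possible and avoids the pointer indirection. */
static int v4_reroute(struct pkt *p) { p->rerouted = 4; return 0; }
#else
/* Not built: an empty inline stub keeps callers compiling. */
static inline int v4_reroute(struct pkt *p) { (void)p; return 0; }
#endif

static int reroute(struct pkt *p)
{
    assert(p != NULL);
    if (p->family == 4)
        return v4_reroute(p);          /* direct call */
    if (p->family == 6 && v6_ops && v6_ops->reroute)
        return v6_ops->reroute(p);     /* indirect call through the ops table */
    return 0;                          /* IPv6 not registered: nothing to do */
}
```

In this sketch, an IPv6 packet is left untouched until `v6_register()` has run, which mirrors the idea that the core must work whether or not the IPv6 code is present.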
    • netfilter: move route indirection to struct nf_ipv6_ops · 3f87c08c
      Committed by Pablo Neira Ayuso
      We cannot call nf_ip6_route() directly, because the symbol dependency
      would cause the 'ipv6' module to be autoloaded. Therefore, define the
      route indirection in nf_ipv6_ops, where it really belongs.

      For IPv4, we can make a direct function call, which is faster, given
      that IPv4 is built into the networking code by default. Still,
      CONFIG_INET=n with CONFIG_NETFILTER=y is possible, so define an empty
      inline stub for IPv4 in that case.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: remove saveroute indirection in struct nf_afinfo · 7db9a51e
      Committed by Pablo Neira Ayuso
      This is only used by nf_queue.c, and the function has no symbol
      dependencies on IPv6; it only refers to structure layouts. Therefore,
      we can replace it with a direct function call from where it belongs.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: move checksum_partial indirection to struct nf_ipv6_ops · f7dcbe2f
      Committed by Pablo Neira Ayuso
      We cannot call nf_ip6_checksum_partial() directly, because the symbol
      dependency would cause the 'ipv6' module to be autoloaded. Therefore,
      define the checksum_partial indirection in nf_ipv6_ops, where it really
      belongs.

      For IPv4, we can make a direct function call, which is faster, given
      that IPv4 is built into the networking code by default. Still,
      CONFIG_INET=n with CONFIG_NETFILTER=y is possible, so define an empty
      inline stub for IPv4 in that case.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: move checksum indirection to struct nf_ipv6_ops · ef71fe27
      Committed by Pablo Neira Ayuso
      We cannot call nf_ip6_checksum() directly, because the symbol dependency
      would cause the 'ipv6' module to be autoloaded. Therefore, define the
      checksum indirection in nf_ipv6_ops, where it really belongs.

      For IPv4, we can make a direct function call, which is faster, given
      that IPv4 is built into the networking code by default. Still,
      CONFIG_INET=n with CONFIG_NETFILTER=y is possible, so define an empty
      inline stub for IPv4 in that case.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: core: only allow one nat hook per hook point · f92b40a8
      Committed by Florian Westphal
      The netfilter NAT core cannot deal with more than one NAT hook per hook
      location (prerouting, input ...), because the NAT hooks install a NAT null
      binding in case the iptables nat table (iptable_nat hooks) or the
      corresponding nftables chain (nft nat hooks) doesn't specify a nat
      transformation.
      
      Null bindings are needed to detect port collisions between NAT-ed and
      non-NAT-ed connections.
      
      This causes nftables NAT rules to not work when the iptable_nat module is
      loaded, and vice versa, because a NAT binding has already been attached
      when the second NAT hook is consulted.
      
      The netfilter core is not really the correct location to handle this
      (hooks are just hooks, the core has no notion of what kinds of side
       effects a hook implements), but it's the only place where we can check
      for conflicts between both iptables hooks and nftables hooks without
      adding dependencies.
      
      So add a nat annotation to hook_ops to describe those hooks that will
      add NAT bindings, and then make the core reject registration if such a
      hook already exists. The annotation fills a padding hole; in case further
      restrictions appear, we might change this to a 'u8 type' instead of bool.
      
      iptables error if nft nat hook active:
      iptables -t nat -A POSTROUTING -j MASQUERADE
      iptables v1.4.21: can't initialize iptables table `nat': File exists
      Perhaps iptables or your kernel needs to be upgraded.
      
      nftables error if iptables nat table present:
      nft -f /etc/nftables/ipv4-nat
      /usr/etc/nftables/ipv4-nat:3:1-2: Error: Could not process rule: File exists
      table nat {
      ^^
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
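The rule described in this commit (a hook carries a "this is a NAT hook" annotation, and registration fails if the hook point already holds one) can be sketched as follows. This is a userspace illustration under stated assumptions: `struct hook_ops_sketch`, `struct hook_point_sketch`, `hook_register()` and the fixed-size table are hypothetical. The `EEXIST` return value is chosen to match the "File exists" errors quoted in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_OPS 8   /* arbitrary capacity for the illustration */

/* Hypothetical hook description with the NAT annotation. */
struct hook_ops_sketch {
    int  priority;
    bool nat_hook;   /* true if this hook installs NAT bindings */
};

/* Hypothetical hook point (for example "postrouting"). */
struct hook_point_sketch {
    const struct hook_ops_sketch *ops[SKETCH_MAX_OPS];
    size_t count;
};

/* Returns 0 on success, -EEXIST if a NAT hook is already registered at this
 * hook point, or -ENOSPC if the illustration's table is full. */
static int hook_register(struct hook_point_sketch *hp,
                         const struct hook_ops_sketch *reg)
{
    assert(hp != NULL && reg != NULL);

    if (hp->count == SKETCH_MAX_OPS)
        return -ENOSPC;

    if (reg->nat_hook) {
        for (size_t i = 0; i < hp->count; i++) {
            if (hp->ops[i]->nat_hook)
                return -EEXIST;   /* a second NAT hook is refused */
        }
    }

    hp->ops[hp->count++] = reg;
    return 0;
}
```

Non-NAT hooks are unaffected in this sketch. Only the second hook carrying the annotation is rejected, regardless of which subsystem registered the first one.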
    • netfilter: don't allocate space for arp/bridge hooks unless needed · 2a95183a
      Committed by Florian Westphal
      There is no need to define hook points if the family isn't supported.
      Because we need these hooks for either nftables, arp/ebtables
      or the 'call-iptables' hack we have in the bridge layer, add two
      new dependencies, NETFILTER_FAMILY_{ARP,BRIDGE}, and have the
      users select them.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: don't allocate space for decnet hooks unless needed · bb4badf3
      Committed by Florian Westphal
      There is no need to define hook points if the family isn't supported.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: reduce size of hook entry point locations · b0f38338
      Committed by Florian Westphal
      struct net contains:
      
      struct nf_hook_entries __rcu *hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
      
      which stores the hook entry point locations for the various protocol
      families and the hooks.

      Using an array results in compact C code when doing accesses, i.e.
        x = rcu_dereference(net->nf.hooks[pf][hook]);

      but it's also wasting a lot of memory, as most families are
      not used.

      So split the array into those families that are used, which
      are only 5 (instead of 13).  In most cases, the 'pf' argument is
      constant, so gcc removes the switch statement.
      
      struct net before:
       /* size: 5184, cachelines: 81, members: 46 */
      after:
       /* size: 4672, cachelines: 73, members: 46 */
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
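The split described in this commit replaces one two-dimensional array with one array per family in use, plus a switch that picks the right one. A minimal sketch follows, assuming 8 hook slots per family and using hypothetical names (`struct flat_sketch`, `struct split_sketch`, `hook_slot()`, the `FAM_*` constants). It is not the kernel's actual layout.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_HOOKS 8          /* assumed number of hook slots per family */

/* Hypothetical family numbers; only five of the thirteen are ever used. */
enum fam_sketch {
    FAM_UNSPEC = 0,
    FAM_IPV4   = 2,
    FAM_ARP    = 3,
    FAM_BRIDGE = 7,
    FAM_IPV6   = 10,
    FAM_DECNET = 12,
    FAM_NUMPROTO = 13
};

struct entries_sketch;              /* opaque; only pointers are stored */

/* Before: one slot row for every family number, used or not. */
struct flat_sketch {
    struct entries_sketch *hooks[FAM_NUMPROTO][SKETCH_MAX_HOOKS];
};

/* After: one array per family that is actually used. */
struct split_sketch {
    struct entries_sketch *hooks_ipv4[SKETCH_MAX_HOOKS];
    struct entries_sketch *hooks_ipv6[SKETCH_MAX_HOOKS];
    struct entries_sketch *hooks_arp[SKETCH_MAX_HOOKS];
    struct entries_sketch *hooks_bridge[SKETCH_MAX_HOOKS];
    struct entries_sketch *hooks_decnet[SKETCH_MAX_HOOKS];
};

/* Returns the address of the slot, or NULL for an unused family or an
 * out-of-range hook. When 'pf' is a compile-time constant at the call site,
 * a compiler can fold the switch away after inlining. */
static inline struct entries_sketch **
hook_slot(struct split_sketch *nf, int pf, unsigned int hook)
{
    assert(nf != NULL);
    if (hook >= SKETCH_MAX_HOOKS)
        return NULL;

    switch (pf) {
    case FAM_IPV4:   return &nf->hooks_ipv4[hook];
    case FAM_IPV6:   return &nf->hooks_ipv6[hook];
    case FAM_ARP:    return &nf->hooks_arp[hook];
    case FAM_BRIDGE: return &nf->hooks_bridge[hook];
    case FAM_DECNET: return &nf->hooks_decnet[hook];
    default:         return NULL;
    }
}
```

Under the sketch's assumption of 8 slots per family, dropping eight unused families saves 64 pointers. With 8-byte pointers that is 512 bytes, which equals the difference between the two struct net sizes quoted in the commit message (5184 - 4672).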
    • netfilter: core: free hooks with call_rcu · 8c873e21
      Committed by Florian Westphal
      Giuseppe Scrivano says:
        "SELinux, if enabled, registers for each new network namespace 6
          netfilter hooks."
      
      The cost of this is high.  With synchronize_net() removed:
         "The net benefit on an SMP machine with two cores is that creating a
         new network namespace takes -40% of the original time."
      
      This patch replaces synchronize_net+kvfree with call_rcu().
      We store rcu_head at the tail of a structure that has no fixed layout,
      i.e. we cannot use offsetof() to compute the start of the original
      allocation.  Thus store this information right after the rcu head.
      
      We could simplify this by just placing the rcu_head at the start
      of struct nf_hook_entries.  However, this structure is used in
      packet processing hotpath, so only place what is needed for that
      at the beginning of the struct.
      Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
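The layout problem this commit describes can be illustrated in userspace: the callback head sits after a variable-length array, so its offset from the start of the allocation is not a compile-time constant, and the start pointer is stored next to it instead. All names below (`struct cb_head_sketch`, `struct tail_sketch`, `entries_free_deferred()`) are hypothetical, and the deferred callback is simply invoked immediately as a stand-in for call_rcu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head. */
struct cb_head_sketch {
    void (*func)(struct cb_head_sketch *head);
};

struct entry_sketch {
    void *hook;
    void *priv;
};

/* Hot-path data first; the number of entries varies per allocation. */
struct entries_sketch {
    uint16_t num;
    struct entry_sketch hooks[];   /* flexible array member */
};

/* Lives right after hooks[num]. 'allocation' records where the block starts,
 * because offsetof() cannot express a position that depends on 'num'. */
struct tail_sketch {
    struct cb_head_sketch head;
    void *allocation;
};

_Static_assert(_Alignof(struct tail_sketch) <= _Alignof(struct entry_sketch),
               "tail must be suitably aligned directly after the entries");

static int freed_count;   /* lets a caller observe that the callback ran */

static struct tail_sketch *entries_tail(struct entries_sketch *e)
{
    return (struct tail_sketch *)&e->hooks[e->num];
}

static struct entries_sketch *entries_alloc(uint16_t num)
{
    size_t sz = sizeof(struct entries_sketch)
              + (size_t)num * sizeof(struct entry_sketch)
              + sizeof(struct tail_sketch);
    struct entries_sketch *e = calloc(1, sz);

    if (!e)
        return NULL;
    e->num = num;
    entries_tail(e)->allocation = e;
    return e;
}

/* Callback: recover the tail from the head, then free the whole block. */
static void entries_free_cb(struct cb_head_sketch *h)
{
    struct tail_sketch *t =
        (struct tail_sketch *)((char *)h - offsetof(struct tail_sketch, head));

    freed_count++;
    free(t->allocation);
}

static void entries_free_deferred(struct entries_sketch *e)
{
    struct cb_head_sketch *h;

    assert(e != NULL);
    h = &entries_tail(e)->head;
    h->func = entries_free_cb;
    h->func(h);   /* stand-in: real code would queue this after a grace period */
}
```

Placing the tail last keeps the fields needed for packet processing at the start of the block, which is the design choice the commit message argues for.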
  2. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Committed by Greg Kroah-Hartman
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few thousand files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note"; otherwise it was "GPL-2.0".  The results were:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 28 Aug, 2017 1 commit
    • netfilter: convert hook list to an array · 960632ec
      Committed by Aaron Conole
      This converts the storage and layout of netfilter hook entries from a
      linked list to an array.  After this commit, hook entries will be
      stored adjacent in memory.  The next pointer is no longer required.
      
      The ops pointers are stored at the end of the array as they are only
      used in the register/unregister path and in the legacy br_netfilter code.
      
      nf_unregister_net_hooks() is slower than needed, as it just calls
      nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
      calls); this will be addressed in a followup patch.
      
      Test setup:
       - ixgbe 10gbit
       - netperf UDP_STREAM, 64 byte packets
       - 5 hooks (raw + mangle prerouting, mangle + filter input, inet filter)

      Results:
       - empty mangle and raw prerouting, mangle and filter input hooks: 353.9
       - this patch: 364.2
      Signed-off-by: Aaron Conole <aconole@bytheb.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
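With hook entries stored adjacent in memory, traversal becomes an index loop over a contiguous array instead of a pointer chase. The sketch below assumes a simplified verdict model (accept or drop) and hypothetical names (`struct hook_entry_sketch`, `run_hooks()`). It shows the traversal idea only, not the kernel's hook-running code.

```c
#include <assert.h>
#include <stddef.h>

enum { VERDICT_DROP = 0, VERDICT_ACCEPT = 1 };   /* simplified verdicts */

typedef int (*hookfn_sketch)(void *priv, int *pkt);

/* One array element per hook; no 'next' pointer is needed. */
struct hook_entry_sketch {
    hookfn_sketch hook;
    void *priv;
};

/* Calls each hook in array order and stops at the first non-accept verdict. */
static int run_hooks(const struct hook_entry_sketch *hooks, size_t num, int *pkt)
{
    assert(hooks != NULL || num == 0);

    for (size_t i = 0; i < num; i++) {
        int verdict = hooks[i].hook(hooks[i].priv, pkt);

        if (verdict != VERDICT_ACCEPT)
            return verdict;
    }
    return VERDICT_ACCEPT;
}

/* Example hook: counts how many packets it has seen. */
static int count_hook(void *priv, int *pkt)
{
    (void)pkt;
    ++*(int *)priv;
    return VERDICT_ACCEPT;
}

/* Example hook: drops packets whose value is odd. */
static int drop_odd_hook(void *priv, int *pkt)
{
    (void)priv;
    return (*pkt & 1) ? VERDICT_DROP : VERDICT_ACCEPT;
}
```

Because the entries are contiguous, consecutive hooks are likely to share cache lines, which is one plausible reason for the small throughput gain reported in the commit message.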
  4. 17 Jul, 2017 1 commit
  5. 07 Dec, 2016 2 commits
  6. 03 Nov, 2016 2 commits
  7. 25 Sep, 2016 2 commits
  8. 19 Sep, 2016 1 commit
  9. 03 Mar, 2016 1 commit
  10. 17 Oct, 2015 2 commits
    • netfilter: turn NF_HOOK into an inline function · 008027c3
      Committed by Arnd Bergmann
      A recent change to the dst_output handling caused a new warning
      when the call to NF_HOOK() is the only use of a local variable
      passed as 'dev', and CONFIG_NETFILTER is disabled:
      
      net/ipv6/ip6_output.c: In function 'ip6_output':
      net/ipv6/ip6_output.c:135:21: warning: unused variable 'dev' [-Wunused-variable]
      
      The reason for this is that the NF_HOOK macro in this case does
      not reference the variable at all, and the call to dev_net(dev)
      got removed from the ip6_output function. To avoid that warning now
      and in the future, this changes the macro into an equivalent
      inline function, which tells the compiler that the variable is
      passed correctly but still unused.
      
      The dn_forward function apparently had the same problem in
      the past and added a local workaround that no longer works
      with the inline function. In order to avoid a regression, we
      have to also remove the #ifdef from decnet in the same patch.
      
      Fixes: ede2059d ("dst: Pass net into dst->output")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
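The difference between the macro and the inline function can be reproduced in a few lines. In the macro form the `dev` argument disappears during preprocessing, so a local variable used only there looks unused to the compiler. In the inline form it is a real argument. The names below (`SKETCH_HOOK_MACRO`, `sketch_hook()`, `struct dev_sketch`) are hypothetical and far simpler than the real NF_HOOK signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device. */
struct dev_sketch {
    const char *name;
};

/* Macro form with the feature compiled out: 'dev' is never expanded, so the
 * compiler does not see the caller's variable being used at all. */
#define SKETCH_HOOK_MACRO(dev, okfn, arg) ((okfn)(arg))

/* Inline form: 'dev' is passed as an argument, so the caller's variable counts
 * as used even though the body ignores it. */
static inline int sketch_hook(const struct dev_sketch *dev,
                              int (*okfn)(int), int arg)
{
    (void)dev;                 /* deliberately unused in the disabled case */
    assert(okfn != NULL);
    return okfn(arg);
}

/* Example continuation function, standing in for the 'okfn' callback. */
static int finish_output(int arg)
{
    return arg + 1;
}
```

Both forms produce the same result. Only the inline form keeps a caller's `dev` variable referenced, and it also type-checks the argument.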
    • netfilter: remove hook owner refcounting · 2ffbceb2
      Committed by Florian Westphal
      Since commit 8405a8ff ("netfilter: nf_qeueue: Drop queue entries on
      nf_unregister_hook"), all pending queued entries are discarded.

      So we can simply remove all of the owner handling -- when a module is
      removed, it also needs to unregister all its hooks.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  11. 05 Oct, 2015 3 commits
  12. 30 Sep, 2015 1 commit
  13. 19 Sep, 2015 1 commit
  14. 18 Sep, 2015 5 commits
  15. 03 Sep, 2015 1 commit
    • netfilter: nf_conntrack: make nf_ct_zone_dflt built-in · 62da9865
      Committed by Daniel Borkmann
      Fengguang reported that a randconfig build generated the following linker
      issue involving the nf_ct_zone_dflt object:
      
        [...]
        CC      init/version.o
        LD      init/built-in.o
        net/built-in.o: In function `ipv4_conntrack_defrag':
        nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
        net/built-in.o: In function `ipv6_defrag':
        nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
        make: *** [vmlinux] Error 1
      
      Given that configurations exist where we have a built-in part, which is
      accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
      and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
      module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
      area when netfilter is configured in general.
      
      Therefore, split the more generic parts into a common header under
      include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
      section that already holds parts related to CONFIG_NF_CONNTRACK in the
      netfilter core. This fixes the issue on my side.
      
      Fixes: 308ac914 ("netfilter: nf_conntrack: push zone object into functions")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 23 Jul, 2015 1 commit
  17. 16 Jul, 2015 2 commits
    • netfilter: move tee_active to core · e7c8899f
      Committed by Florian Westphal
      This prepares for a TEE-like expression in nftables.
      We want to ensure only one duplicate is sent, so both will
      use the same percpu variable to detect duplication.

      The other use case is detection of recursive calls to xtables, but since
      we don't want a dependency from nft on the xtables core, it's put into
      core.c instead of the x_tables core.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
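The "only one duplicate" guard can be approximated in userspace with a per-thread flag standing in for the percpu variable. This is a loose analogy under stated assumptions: `dup_active_sketch`, `send_duplicate()` and `process_packet()` are hypothetical names, and a thread-local variable is not the same mechanism as a kernel percpu variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-thread flag standing in for the shared percpu variable. */
static _Thread_local bool dup_active_sketch;

static int dups_sent;   /* counts duplicates actually emitted */

static void process_packet(bool rule_wants_dup);

/* Emits one duplicate unless a duplication is already in progress on this
 * thread. Returns true if a duplicate was sent. */
static bool send_duplicate(void)
{
    if (dup_active_sketch)
        return false;            /* already duplicating: refuse to recurse */

    dup_active_sketch = true;
    dups_sent++;
    /* The clone traverses the ruleset again and hits the same rule. */
    process_packet(true);
    dup_active_sketch = false;
    return true;
}

/* Stand-in for ruleset evaluation with a single "duplicate" rule. */
static void process_packet(bool rule_wants_dup)
{
    if (rule_wants_dup)
        (void)send_duplicate();
}
```

Because both callers would consult the same flag, a duplicate produced by one of them is not duplicated again by the other, which is the property the commit message asks for.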
    • netfilter: Per network namespace netfilter hooks. · 085db2c0
      Committed by Eric W. Biederman
      - Add a new set of functions for registering and unregistering per
        network namespace hooks.
      
      - Modify the old global namespace hook functions to use the per
        network namespace hooks in their implementation, so there remains a
        single list that needs to be walked for any hook (this is important
        for keeping the hook priority working and for keeping the code
        walking the hooks simple).
      
      - Only allow registering the per netdevice hooks in the network
        namespace where the network device lives.
      
      - Dynamically allocate the structures in the per network namespace
        hook list in nf_register_net_hook, and unregister them in
        nf_unregister_net_hook.
      
        Dynamic allocation is required somewhere, as the number of network
        namespaces is not fixed, so we might as well allocate them in the
        registration function.
      
        The chain of registered hooks on any list is expected to be small so
        the cost of walking that list to find the entry we are unregistering
        should also be small.
      
        Performing the management of the dynamically allocated list entries
        in the registration and unregistration functions keeps the complexity
        from spreading.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  18. 15 Jul, 2015 1 commit
  19. 19 Jun, 2015 1 commit
    • netfilter: don't pull include/linux/netfilter.h from netns headers · a263653e
      Committed by Pablo Neira Ayuso
      Currently, every file that includes net_namespace.h pulls in the full
      netfilter hook definitions.
      
      Instead let's just include the bare minimum required in the new
      linux/netfilter_defs.h file, and use it from the netfilter netns header files.
      
      I also needed to include in.h and in6.h from linux/netfilter.h, otherwise we
      hit this compilation error:
      
      In file included from include/linux/netfilter_defs.h:4:0,
                       from include/net/netns/netfilter.h:4,
                       from include/net/net_namespace.h:22,
                       from include/linux/netdevice.h:43,
                       from net/netfilter/nfnetlink_queue_core.c:23:
      include/uapi/linux/netfilter.h:76:17: error: field ‘in’ has incomplete type struct in_addr in;
      
      Also explicitly include linux/netfilter.h in several spots.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  20. 14 May, 2015 1 commit
    • netfilter: add netfilter ingress hook after handle_ing() under unique static key · e687ad60
      Committed by Pablo Neira
      This patch adds the Netfilter ingress hook just after the existing tc ingress
      hook, which seems to be the consensus solution for this.
      
      Note that the Netfilter hook resides under the global static key that enables
      ingress filtering. Nonetheless, Netfilter still also has its own static key for
      minimal impact on the existing handle_ing().
      
      * Without this patch:
      
      Result: OK: 6216490(c6216338+d152) usec, 100000000 (60byte,0frags)
        16086246pps 7721Mb/sec (7721398080bps) errors: 100000000
      
          42.46%  kpktgend_0   [kernel.kallsyms]   [k] __netif_receive_skb_core
          25.92%  kpktgend_0   [kernel.kallsyms]   [k] kfree_skb
           7.81%  kpktgend_0   [pktgen]            [k] pktgen_thread_worker
           5.62%  kpktgend_0   [kernel.kallsyms]   [k] ip_rcv
           2.70%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_internal
           2.34%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_sk
           1.44%  kpktgend_0   [kernel.kallsyms]   [k] __build_skb
      
      * With this patch:
      
      Result: OK: 6214833(c6214731+d101) usec, 100000000 (60byte,0frags)
        16090536pps 7723Mb/sec (7723457280bps) errors: 100000000
      
          41.23%  kpktgend_0      [kernel.kallsyms]  [k] __netif_receive_skb_core
          26.57%  kpktgend_0      [kernel.kallsyms]  [k] kfree_skb
           7.72%  kpktgend_0      [pktgen]           [k] pktgen_thread_worker
           5.55%  kpktgend_0      [kernel.kallsyms]  [k] ip_rcv
           2.78%  kpktgend_0      [kernel.kallsyms]  [k] netif_receive_skb_internal
           2.06%  kpktgend_0      [kernel.kallsyms]  [k] netif_receive_skb_sk
           1.43%  kpktgend_0      [kernel.kallsyms]  [k] __build_skb
      
      * Without this patch + tc ingress:
      
              tc filter add dev eth4 parent ffff: protocol ip prio 1 \
                      u32 match ip dst 4.3.2.1/32
      
      Result: OK: 9269001(c9268821+d179) usec, 100000000 (60byte,0frags)
        10788648pps 5178Mb/sec (5178551040bps) errors: 100000000
      
          40.99%  kpktgend_0   [kernel.kallsyms]  [k] __netif_receive_skb_core
          17.50%  kpktgend_0   [kernel.kallsyms]  [k] kfree_skb
          11.77%  kpktgend_0   [cls_u32]          [k] u32_classify
           5.62%  kpktgend_0   [kernel.kallsyms]  [k] tc_classify_compat
           5.18%  kpktgend_0   [pktgen]           [k] pktgen_thread_worker
           3.23%  kpktgend_0   [kernel.kallsyms]  [k] tc_classify
           2.97%  kpktgend_0   [kernel.kallsyms]  [k] ip_rcv
           1.83%  kpktgend_0   [kernel.kallsyms]  [k] netif_receive_skb_internal
           1.50%  kpktgend_0   [kernel.kallsyms]  [k] netif_receive_skb_sk
           0.99%  kpktgend_0   [kernel.kallsyms]  [k] __build_skb
      
      * With this patch + tc ingress:
      
              tc filter add dev eth4 parent ffff: protocol ip prio 1 \
                      u32 match ip dst 4.3.2.1/32
      
      Result: OK: 9308218(c9308091+d126) usec, 100000000 (60byte,0frags)
        10743194pps 5156Mb/sec (5156733120bps) errors: 100000000
      
          42.01%  kpktgend_0   [kernel.kallsyms]   [k] __netif_receive_skb_core
          17.78%  kpktgend_0   [kernel.kallsyms]   [k] kfree_skb
          11.70%  kpktgend_0   [cls_u32]           [k] u32_classify
           5.46%  kpktgend_0   [kernel.kallsyms]   [k] tc_classify_compat
           5.16%  kpktgend_0   [pktgen]            [k] pktgen_thread_worker
           2.98%  kpktgend_0   [kernel.kallsyms]   [k] ip_rcv
           2.84%  kpktgend_0   [kernel.kallsyms]   [k] tc_classify
           1.96%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_internal
           1.57%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_sk
      
      Note that the results are very similar before and after.
      
      I can see gcc gets the code under the ingress static key out of the hot path.
      Then, on that cold branch, it generates the code to accommodate the netfilter
      ingress static key. My explanation for this is that this reduces the pressure
      on the instruction cache for non-users as the new code is out of the hot path,
      and it comes with minimal impact for tc ingress users.
      
      Using gcc version 4.8.4 on:
      
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                8
      [...]
      L1d cache:             16K
      L1i cache:             64K
      L2 cache:              2048K
      L3 cache:              8192K
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>