1. 11 Oct, 2018 23 commits
  2. 10 Oct, 2018 2 commits
  3. 09 Oct, 2018 15 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 071a234a
      David S. Miller committed
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf-next 2018-10-08
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      The main changes are:
      
      1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
      BPF programs to perform lookups for sockets in a network namespace. This would
      allow programs to determine early on in processing whether the stack is
      expecting to receive the packet, and perform some action (e.g. drop,
      forward somewhere) based on this information.
      
      2) per-cpu cgroup local storage from Roman Gushchin.
      Per-cpu cgroup local storage is very similar to simple cgroup storage
      except all the data is per-cpu. The main goal of the per-cpu variant is to
      implement super fast counters (e.g. packet counters), which require
      neither lookups nor atomic operations in the fast path.
      An example of these hybrid counters is in selftests/bpf/netcnt_prog.c
      
      3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet
      
      4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski
      
      5) rename of libbpf interfaces from Andrey Ignatov.
      libbpf is maturing as a library and should follow good practices in
      library design and implementation to play well with other libraries.
      This patch set brings consistent naming convention to global symbols.
      
      6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
      to let Apache2 projects use libbpf
      
      7) various AF_XDP fixes from Björn and Magnus
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
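The per-cpu counter idea from item 2 can be illustrated with a minimal userspace sketch. It is not the BPF cgroup storage API: the struct, the function names and the fixed CPU count are all hypothetical, and the caller passes the CPU number explicitly. Each CPU writes only its own slot, so the fast path needs no lookup and no atomic operation, and a reader sums the slots.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4   /* hypothetical CPU count for the illustration */

/* One slot per CPU; each CPU only ever writes its own slot. */
struct percpu_counter_sketch {
    uint64_t slot[SKETCH_NR_CPUS];
};

/* Fast path: plain increment of the local slot, no lookup and no atomics. */
void percpu_counter_inc(struct percpu_counter_sketch *c, unsigned int cpu)
{
    c->slot[cpu % SKETCH_NR_CPUS]++;
}

/* Slow path: a reader sums all slots to obtain the global value. */
uint64_t percpu_counter_sum(const struct percpu_counter_sketch *c)
{
    uint64_t total = 0;

    for (unsigned int i = 0; i < SKETCH_NR_CPUS; i++)
        total += c->slot[i];
    return total;
}
```

The trade-off is that reads become more expensive than writes, which suits counters that are updated per packet and read rarely.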
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 9000a457
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      The following patchset contains Netfilter updates for your net-next tree:
      
      1) Support for matching on ipsec policy already set in the route, from
         Florian Westphal.
      
      2) Split set destruction into deactivate and destroy phase to make it
         fit better into the transaction infrastructure, also from Florian.
         This includes a patch to warn on imbalance when setting the new
         activate and deactivate interfaces.
      
      3) Release transaction list from the workqueue to remove expensive
         synchronize_rcu() from configuration plane path. This speeds up
         configuration plane quite a bit. From Florian Westphal.
      
      4) Add new xfrm/ipsec extension, this new extension allows you to match
         for ipsec tunnel keys such as source and destination address, spi and
         reqid. From Máté Eckl and Florian Westphal.
      
      5) Add secmark support, this includes connsecmark too, patches
         from Christian Gottsche.
      
      6) Allow specifying the remaining bytes in xt_quota, from Chenbo Feng.
         One follow up patch to calm a clang warning for this one, from
         Nathan Chancellor.
      
      7) Flush conntrack entries based on layer 3 family, from Kristian Evensen.
      
      8) New revision for cgroups2 to shrink the path field.
      
      9) Get rid of obsolete need_conntrack(), as a result of recent
         demodularization work.
      
      10) Use WARN_ON instead of BUG_ON, from Florian Westphal.
      
      11) Unused exported symbol in nf_nat_ipv4_fn(), from Florian.
      
      12) Remove superfluous check for timeout netlink parser and dump
          functions in layer 4 conntrack helpers.
      
      13) Unnecessary redundant rcu read side locks in NAT redirect,
          from Taehee Yoo.
      
      14) Pass nf_hook_state structure to error handlers, patch from
          Florian Westphal.
      
      15) Remove ->new() interface from layer 4 protocol trackers. Place
          them in the ->packet() interface. From Florian.
      
      16) Place conntrack ->error() handling in the ->packet() interface.
          Patches from Florian Westphal.
      
      17) Remove unused parameter in the pernet initialization path,
          also from Florian.
      
      18) Remove additional parameter to specify layer 3 protocol when
          looking up the protocol tracker. From Florian.
      
      19) Shrink array of layer 4 protocol trackers, from Florian.
      
      20) Check for linear skb only once from the ALG NAT mangling
          codebase, from Taehee Yoo.
      
      21) Use rhashtable_walk_enter() instead of deprecated
          rhashtable_walk_init(), also from Taehee.
      
      22) No need to flush all conntracks when only one single address
          is gone, from Tan Hu.
      
      23) Remove redundant check for NAT flags in flowtable code, from
          Taehee Yoo.
      
      24) Use rhashtable_lookup() instead of rhashtable_lookup_fast()
          from netfilter codebase, since rcu read lock side is already
          assumed in this path.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: fix building without CONFIG_INET · df3f94a0
      Arnd Bergmann committed
      The newly added TCP and UDP handling fails to link when CONFIG_INET
      is disabled:
      
      net/core/filter.o: In function `sk_lookup':
      filter.c:(.text+0x7ff8): undefined reference to `tcp_hashinfo'
      filter.c:(.text+0x7ffc): undefined reference to `tcp_hashinfo'
      filter.c:(.text+0x8020): undefined reference to `__inet_lookup_established'
      filter.c:(.text+0x8058): undefined reference to `__inet_lookup_listener'
      filter.c:(.text+0x8068): undefined reference to `udp_table'
      filter.c:(.text+0x8070): undefined reference to `udp_table'
      filter.c:(.text+0x808c): undefined reference to `__udp4_lib_lookup'
      net/core/filter.o: In function `bpf_sk_release':
      filter.c:(.text+0x82e8): undefined reference to `sock_gen_put'
      
      Wrap the related sections of code in #ifdefs for the config option.
      
      Furthermore, sk_lookup() should always have been marked 'static'; doing so
      also avoids a warning about a missing prototype when building with
      'make W=1'.
      
      Fixes: 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Joe Stringer <joe@wand.net.nz>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
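The #ifdef approach described in this commit can be sketched in userspace as follows. This is not the code from net/core/filter.c: SKETCH_CONFIG_INET stands in for CONFIG_INET, both function names are hypothetical, and the fallback branch returning -EOPNOTSUPP is an assumption added so the sketch compiles in both configurations.

```c
#include <assert.h>
#include <errno.h>

/* SKETCH_CONFIG_INET is a hypothetical stand-in for the kernel's CONFIG_INET. */
#ifdef SKETCH_CONFIG_INET
/* Only compiled when the option is set, so the symbols it references are
 * only needed at link time in that configuration. Marked static because
 * nothing outside this file calls it. */
static int sketch_sk_lookup(int port)
{
    return port > 0 ? port : -EINVAL;
}
#endif

/* The entry point exists in both configurations. */
int sketch_helper(int port)
{
#ifdef SKETCH_CONFIG_INET
    return sketch_sk_lookup(port);
#else
    (void)port;
    return -EOPNOTSUPP;   /* feature compiled out */
#endif
}
```

Building with -DSKETCH_CONFIG_INET selects the first branch; without it, the lookup code and its dependencies never reach the linker.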
    • netfilter: xt_quota: Don't use aligned attribute in sizeof · ffa0a9a5
      Nathan Chancellor committed
      Clang warns:
      
      net/netfilter/xt_quota.c:47:44: warning: 'aligned' attribute ignored
      when parsing type [-Wignored-attributes]
              BUILD_BUG_ON(sizeof(atomic64_t) != sizeof(__aligned_u64));
                                                        ^~~~~~~~~~~~~
      
      Use 'sizeof(__u64)' instead, as the alignment doesn't affect the size
      of the type.
      
      Fixes: e9837e55 ("netfilter: xt_quota: fix the behavior of xt_quota module")
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
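The claim that alignment does not affect the size of the type can be checked with a small sketch. The typedef below is a hypothetical stand-in for the kernel's __aligned_u64 and relies on the GCC/Clang `aligned` attribute; using a typedef rather than the attribute directly inside sizeof() keeps the example free of the warning quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __aligned_u64: a 64-bit integer
 * with an explicit 8-byte alignment requirement (GCC/Clang extension). */
typedef uint64_t __attribute__((aligned(8))) sketch_aligned_u64;

/* The alignment attribute constrains placement, not size. */
_Static_assert(sizeof(sketch_aligned_u64) == sizeof(uint64_t),
               "alignment must not change the size");

size_t sketch_plain_size(void)   { return sizeof(uint64_t); }
size_t sketch_aligned_size(void) { return sizeof(sketch_aligned_u64); }
```

Because both sizes agree, a size comparison against the plain 64-bit type checks the same property.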
    • dpaa2-eth: Don't account Tx confirmation frames on NAPI poll · 68049a5f
      Ioana Ciocoi Radulescu committed
      Until now, both Rx and Tx confirmation frames handled during
      NAPI poll were counted toward the NAPI budget. However, Tx
      confirmations are lighter to process than Rx frames, which can
      skew the amount of work actually done inside one NAPI cycle.
      
      Update the code to only count Rx frames toward the NAPI budget
      and set a separate threshold on how many Tx conf frames can be
      processed in one poll cycle.
      
      The NAPI poll routine stops when either the budget is consumed
      by Rx frames or when Tx confirmation frames reach this threshold.
      Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
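The stopping rule described in this commit can be sketched as a simplified userspace loop. It is not the dpaa2-eth driver code: the frame enum, the result struct and the function are hypothetical, and frames are handled one at a time here, which is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_frame { SKETCH_RX, SKETCH_TX_CONF };

struct sketch_poll_result {
    int rx_done;       /* counted toward the NAPI budget */
    int txconf_done;   /* counted toward a separate threshold */
};

/* Process queued frames until the Rx budget is consumed or the Tx
 * confirmation threshold is reached, whichever comes first. */
struct sketch_poll_result sketch_napi_poll(const enum sketch_frame *q, size_t n,
                                           int budget, int txconf_threshold)
{
    struct sketch_poll_result res = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        if (q[i] == SKETCH_RX)
            res.rx_done++;
        else
            res.txconf_done++;

        /* Stop when either limit is reached. */
        if (res.rx_done >= budget || res.txconf_done >= txconf_threshold)
            break;
    }
    return res;
}
```

Only `rx_done` would be reported back as the amount of work done, so cheap Tx confirmations no longer eat into the budget.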
    • net: mscc: ocelot: remove set but not used variable 'phy_mode' · 9e19dabc
      YueHaibing committed
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      drivers/net/ethernet/mscc/ocelot_board.c: In function 'mscc_ocelot_probe':
      drivers/net/ethernet/mscc/ocelot_board.c:262:17: warning:
       variable 'phy_mode' set but not used [-Wunused-but-set-variable]
         enum phy_mode phy_mode;
      
      It has never been used since its introduction in
      commit 71e32a20 ("net: mscc: ocelot: make use of SerDes PHYs for handling their configuration")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'more-pmtu-selftests' · ee9615be
      David S. Miller committed
      Sabrina Dubroca says:
      
      ====================
      selftests: add more PMTU tests
      
      The current selftests for PMTU cover VTI tunnels, but there's nothing
      about the generation and handling of PMTU exceptions by intermediate
      routers. This series adds and improves existing helpers, then adds
      IPv4 and IPv6 selftests with a setup involving an intermediate router.
      
      Joint work with Stefano Brivio.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: pmtu: add basic IPv4 and IPv6 PMTU tests · e44e428f
      Sabrina Dubroca committed
      Commit d1f1b9cb ("selftests: net: Introduce first PMTU test") and
      follow-ups introduced some PMTU tests, but they all rely on tunneling,
      and, particularly, on VTI.
      
      These new tests use simple routing to exercise the generation and
      update of PMTU exceptions in IPv4 and IPv6.
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: pmtu: extend MTU parsing helper to locked MTU · 72ebddd7
      Sabrina Dubroca committed
      The mtu_parse helper introduced in commit f2c929fe ("selftests:
      pmtu: Factor out MTU parsing helper") can only handle "mtu 1234", but
      not "mtu lock 1234". Extend it, so that we can do IPv4 tests with PMTU
      smaller than net.ipv4.route.min_pmtu
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
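The parsing rule described in this commit, accepting both "mtu 1234" and "mtu lock 1234", can be sketched in C. The real helper is a shell function in the selftest; the function name, the buffer size and the sample route strings here are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the MTU found after the word "mtu" (optionally followed by
 * "lock"), or -1 if none is present or the value is malformed. */
long sketch_mtu_parse(const char *route)
{
    char buf[256];
    char *tok;
    int after_mtu = 0;

    if (strlen(route) >= sizeof(buf))
        return -1;
    strcpy(buf, route);            /* strtok() modifies its input */

    for (tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (after_mtu) {
            char *end;
            long mtu;

            if (strcmp(tok, "lock") == 0)
                continue;          /* "mtu lock 1234": skip the keyword */
            mtu = strtol(tok, &end, 10);
            return (end != tok && *end == '\0') ? mtu : -1;
        }
        if (strcmp(tok, "mtu") == 0)
            after_mtu = 1;
    }
    return -1;
}
```

Matching whole words rather than substrings keeps a token such as "pmtu" from being mistaken for the "mtu" keyword.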
    • selftests: pmtu: Introduce check_pmtu_value() · 1e0a7207
      Stefano Brivio committed
      Introduce and use a function that checks PMTU values against
      expected values and logs error messages, to remove some clutter.
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • isdn/gigaset/isocdata: mark expected switch fall-through · 062f97a3
      Gustavo A. R. Silva committed
      Notice that in this particular case, I replaced the
      "--v-- fall through --v--" comment with a proper
      "fall through", which is what GCC is expecting to
      find.
      
      This fix is part of the ongoing effort to enable
      -Wimplicit-fallthrough
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
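The comment form that GCC recognizes can be shown on a small hypothetical switch. The function and its states are invented for illustration and have nothing to do with the gigaset driver; only the placement of the "fall through" comment directly before the next case label is the point.

```c
#include <assert.h>

/* Hypothetical three-state parser step, used only to show the comment. */
int sketch_bytes_needed(int state)
{
    int needed = 0;

    switch (state) {
    case 0:
        needed += 2;    /* header bytes */
        /* fall through */
    case 1:
        needed += 1;    /* trailer byte, also required when starting at 0 */
        break;
    default:
        break;
    }
    return needed;
}
```

With -Wimplicit-fallthrough enabled, removing the comment would make GCC warn at `case 1`, which is how unintended fall-throughs get caught.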
    • Merge branch 'rtnetlink-Add-support-for-rigid-checking-of-data-in-dump-request' · cd7f7df6
      David S. Miller committed
      David Ahern says:
      
      ====================
      rtnetlink: Add support for rigid checking of data in dump request
      
      There are many use cases where a user wants to influence what is
      returned in a dump for some rtnetlink command: one is wanting data
      for a different namespace than the one in which the request is received and
      another is limiting the amount of data returned in the dump to a
      specific set of interest to userspace, reducing the cpu overhead of
      both kernel and userspace. Unfortunately, the kernel has historically
      not been strict with checking for the proper header or checking the
      values passed in the header. This lenient implementation has allowed
      iproute2 and other packages to pass any struct or data in the dump
      request as long as the family is the first byte. For example, ifinfomsg
      struct is used by iproute2 for all generic dump requests - links,
      addresses, routes and rules when it is really only valid for link
      requests.
      
      There is one example where the kernel deals with the wrong struct: link
      dumps after VF support was added. Older iproute2 was sending rtgenmsg as
      the header instead of ifinfomsg so a patch was added to try and detect
      old userspace vs new:
      e5eca6d4 ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")
      
      The latest example is Christian's patch set wanting to return addresses for
      a target namespace. It guesses the header struct is an ifaddrmsg and if it
      guesses wrong a netlink warning is generated in the kernel log on every
      address dump which is unacceptable.
      
      Another example where the kernel is a bit lenient is route dumps: iproute2
      can send a request with either an ifinfomsg or an rtmsg as the header
      struct, yet the kernel always treats the header as an rtmsg (see
      inet_dump_fib and rtm_flags check). The header inconsistency impacts the
      ability to add kernel side filters for route dumps - a necessary feature
      for scale setups with 100k+ routes.
      
      How can we avoid breaking old userspace and still move forward with
      new features such as kernel side filtering, which are crucial for
      efficient operation at high scale?
      
      This patch set addresses the problem by adding a new socket flag,
      NETLINK_DUMP_STRICT_CHK, that userspace can use with setsockopt to
      request strict checking of headers and attributes on dump requests and
      hence unlock the ability to use kernel side filters as they are added.
      
      Kernel side, the dump handlers are updated to verify the message contains
      at least the expected header struct:
          RTM_GETLINK:       ifinfomsg
          RTM_GETADDR:       ifaddrmsg
          RTM_GETMULTICAST:  ifaddrmsg
          RTM_GETANYCAST:    ifaddrmsg
          RTM_GETADDRLABEL:  ifaddrlblmsg
          RTM_GETROUTE:      rtmsg
          RTM_GETSTATS:      if_stats_msg
          RTM_GETNEIGH:      ndmsg
          RTM_GETNEIGHTBL:   ndtmsg
          RTM_GETNSID:       rtgenmsg
          RTM_GETRULE:       fib_rule_hdr
          RTM_GETNETCONF:    netconfmsg
          RTM_GETMDB:        br_port_msg
      
      And then every field in the header struct should be 0 with the exception
      of the family. There are a few exceptions to this rule where the kernel
      already influences the data returned by values in the struct. Next the
      message should not contain attributes unless the kernel implements
      filtering for it. Any unexpected data causes the dump to fail with EINVAL.
      If the new flag is honored by the kernel and the dump contents adjusted
      by any data passed in the request, the dump handler can set the
      NLM_F_DUMP_FILTERED flag in the netlink message header.
      
      For old userspace on new kernel there is no impact as all checks are
      wrapped in a check on the new strict flag. For new userspace on old
      kernel, the data in the headers and any appended attributes are
      silently ignored though the setsockopt failing is the clue to userspace
      the feature is not supported. New userspace on new kernel gets the
      requested data dump.
      
      iproute2 patches can be found here:
          https://github.com/dsahern/iproute2 dump-enhancements
      
      Major changes since v1
      - inner header is supposed to be 4-bytes aligned. So for dumps that
        should not have attributes appended changed the check to use:
              if (nlmsg_attrlen(nlh, sizeof(hdr)))
        Only impacts patches with headers that are not multiples of 4-bytes
        (rtgenmsg, netconfmsg), but applied the change to all patches not
        calling nlmsg_parse for consistency.
      
      - Added nlmsg_parse_strict and nla_parse_strict for tighter control on
        attribute parsing. There should be no unknown attribute types or extra
        bytes.
      
      - Moved validation to a helper in most cases
      
      Changes since rfc-v2
      - dropped the NLM_F_DUMP_FILTERED flag from target nsid dumps per
        Jiri's objections
      - changed the opt-in uapi from a netlink message flag to a socket
        flag. setsockopt provides an api for userspace to definitively
        know if the kernel supports strict checking on dumps.
      - re-ordered patches to peel off the extack on dumps if needed to
        keep this set size within limits
      - misc cleanups in patches based on testing
      ====================
      Acked-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
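The opt-in described in the cover letter, a socket flag set with setsockopt, might look roughly like this from userspace. This is a sketch under assumptions: the function name and return codes are hypothetical, the option name in this series is NETLINK_DUMP_STRICT_CHK while later kernel headers appear to expose it as NETLINK_GET_STRICT_CHK, and the fallback value 12 should be checked against your own headers.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif

/* Pick whichever option name the installed headers provide. */
#if defined(NETLINK_GET_STRICT_CHK)
#define SKETCH_STRICT_CHK NETLINK_GET_STRICT_CHK
#elif defined(NETLINK_DUMP_STRICT_CHK)
#define SKETCH_STRICT_CHK NETLINK_DUMP_STRICT_CHK
#else
#define SKETCH_STRICT_CHK 12   /* assumed value; verify against your headers */
#endif

/* Returns a socket fd with strict checking enabled, -1 if the socket could
 * not be created, or -2 if the kernel rejected the option (e.g. too old). */
int sketch_open_strict_rtnl(void)
{
    int one = 1;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_NETLINK, SKETCH_STRICT_CHK,
                   &one, sizeof(one)) < 0) {
        close(fd);
        return -2;
    }
    return fd;
}
```

A failing setsockopt is the signal mentioned above that the kernel does not support strict checking, so a caller could fall back to an ordinary socket and do its filtering in userspace.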
    • rtnetlink: Update rtnl_fdb_dump for strict data checking · 8c6e137f
      David Ahern committed
      Update rtnl_fdb_dump for strict data checking. If the flag is set,
      the dump request is expected to have an ndmsg struct as the header
      potentially followed by one or more attributes. Any data passed in the
      header or as an attribute is taken as a request to influence the data
      returned. Only values supported by the dump handler are allowed to be
      non-0 or set in the request. At the moment only the NDA_IFINDEX and
      NDA_MASTER attributes are supported.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnetlink: Move input checking for rtnl_fdb_dump to helper · 8dfbda19
      David Ahern committed
      Move the existing input checking for rtnl_fdb_dump into a helper,
      valid_fdb_dump_legacy. This function will retain the current
      logic that works around the 2 headers that userspace has been
      allowed to send up to this point.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/bridge: Update br_mdb_dump for strict data checking · c77b9364
      David Ahern committed
      Update br_mdb_dump for strict data checking. If the flag is set,
      the dump request is expected to have a br_port_msg struct as the
      header. All elements of the struct are expected to be 0 and no
      attributes can be appended.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
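The strict check described here (header struct present, header fields zero, no attributes appended) can be sketched in userspace. This is not the br_mdb_dump code: the struct is a hypothetical stand-in for br_port_msg, the function name is invented, and the sketch follows the cover letter in letting the family field be non-zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a dump request header such as br_port_msg. */
struct sketch_port_msg {
    uint8_t  family;
    uint32_t ifindex;
};

/* payload/len describe the bytes that follow the netlink message header. */
int sketch_valid_dump_req(const void *payload, size_t len)
{
    struct sketch_port_msg hdr;

    if (len < sizeof(hdr))
        return -EINVAL;          /* header struct missing or truncated */

    memcpy(&hdr, payload, sizeof(hdr));
    if (hdr.ifindex != 0)
        return -EINVAL;          /* only the family may be non-zero */

    if (len > sizeof(hdr))
        return -EINVAL;          /* no attributes may be appended */

    return 0;
}
```

In the kernel such a check would run only when the socket has opted in to strict checking, which is why existing userspace keeps working unchanged.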