1. 15 1月, 2020 4 次提交
  2. 15 12月, 2019 1 次提交
  3. 31 5月, 2019 1 次提交
  4. 28 4月, 2019 3 次提交
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg 提交于
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb08174
    • M
      net: fix two coding style issues · f78c6032
      Michal Kubecek 提交于
      This is a simple cleanup addressing two coding style issues found by
      checkpatch.pl in an earlier patch. It's submitted as a separate patch to
      keep the original patch as it was generated by spatch.
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f78c6032
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Michal Kubecek 提交于
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae0be8de
  5. 17 4月, 2019 1 次提交
  6. 30 3月, 2019 1 次提交
  7. 19 12月, 2018 1 次提交
  8. 13 12月, 2018 2 次提交
  9. 06 12月, 2018 2 次提交
    • N
      net: bridge: mark hash_elasticity as obsolete · cf332bca
      Nikolay Aleksandrov 提交于
      Now that the bridge multicast uses the generic rhashtable interface we
      can drop the hash_elasticity option as that is already done for us and
      it's hardcoded to a maximum of RHT_ELASTICITY (16 currently). Add a
      warning about the obsolete option when the hash_elasticity is set.
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf332bca
    • N
      net: bridge: convert multicast to generic rhashtable · 19e3a9c9
      Nikolay Aleksandrov 提交于
      The bridge multicast code currently uses a custom resizable hashtable
      which predates the generic rhashtable interface. It has many
      shortcomings compared and duplicates functionality that is presently
      available via the generic rhashtable, so this patch removes the custom
      rhashtable implementation in favor of the kernel's generic rhashtable.
      The hash maximum is kept and the rhashtable's size is used to do a loose
      check if it's reached in which case we revert to the old behaviour and
      disable further bridge multicast processing. Also now we can support any
      hash maximum, doesn't need to be a power of 2.
      
      v3: add non-rcu br_mdb_get variant and use it where multicast_lock is
          held to avoid RCU splat, drop hash_max function and just set it
          directly
      
      v2: handle when IGMP snooping is undefined, add br_mdb_init/uninit
          placeholders
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      19e3a9c9
  10. 28 11月, 2018 1 次提交
    • N
      net: bridge: add support for user-controlled bool options · a428afe8
      Nikolay Aleksandrov 提交于
      We have been adding many new bridge options, a big number of which are
      boolean but still take up netlink attribute ids and waste space in the skb.
      Recently we discussed learning from link-local packets[1] and decided
      yet another new boolean option will be needed, thus introducing this API
      to save some bridge nl space.
      The API supports changing the value of multiple boolean options at once
      via the br_boolopt_multi struct which has an optmask (which options to
      set, bit per opt) and optval (options' new values). Future boolean
      options will only be added to the br_boolopt_id enum and then will have
      to be handled in br_boolopt_toggle/get. The API will automatically
      add the ability to change and export them via netlink, sysfs can use the
      single boolopt function versions to do the same. The behaviour with
      failing/succeeding is the same as with normal netlink option changing.
      
      If an option requires mapping to internal kernel flag or needs special
      configuration to be enabled then it should be handled in
      br_boolopt_toggle. It should also be able to retrieve an option's current
      state via br_boolopt_get.
      
      v2: WARN_ON() on unsupported option as that shouldn't be possible and
          also will help catch people who add new options without handling
          them for both set and get. Pass down extack so if an option desires
          it could set it on error and be more user-friendly.
      
      [1] https://www.spinics.net/lists/netdev/msg532698.htmlSigned-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a428afe8
  11. 13 10月, 2018 1 次提交
    • N
      net: bridge: add support for per-port vlan stats · 9163a0fc
      Nikolay Aleksandrov 提交于
      This patch adds an option to have per-port vlan stats instead of the
      default global stats. The option can be set only when there are no port
      vlans in the bridge since we need to allocate the stats if it is set
      when vlans are being added to ports (and respectively free them
      when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the
      largest user of options. The current stats design allows us to add
      these without any changes to the fast-path, it all comes down to
      the per-vlan stats pointer which, if this option is enabled, will
      be allocated for each port vlan instead of using the global bridge-wide
      one.
      
      CC: bridge@lists.linux-foundation.org
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9163a0fc
  12. 27 9月, 2018 5 次提交
  13. 24 7月, 2018 1 次提交
    • N
      net: bridge: add support for backup port · 2756f68c
      Nikolay Aleksandrov 提交于
      This patch adds a new port attribute - IFLA_BRPORT_BACKUP_PORT, which
      allows to set a backup port to be used for known unicast traffic if the
      port has gone carrier down. The backup pointer is rcu protected and set
      only under RTNL, a counter is maintained so when deleting a port we know
      how many other ports reference it as a backup and we remove it from all.
      Also the pointer is in the first cache line which is hot at the time of
      the check and thus in the common case we only add one more test.
      The backup port will be used only for the non-flooding case since
      it's a part of the bridge and the flooded packets will be forwarded to it
      anyway. To remove the forwarding just send a 0/non-existing backup port.
      This is used to avoid numerous scalability problems when using MLAG most
      notably if we have thousands of fdbs one would need to change all of them
      on port carrier going down which takes too long and causes a storm of fdb
      notifications (and again when the port comes back up). In a Multi-chassis
      Link Aggregation setup usually hosts are connected to two different
      switches which act as a single logical switch. Those switches usually have
      a control and backup link between them called peerlink which might be used
      for communication in case a host loses connectivity to one of them.
      We need a fast way to failover in case a host port goes down and currently
      none of the solutions (like bond) cannot fulfill the requirements because
      the participating ports are actually the "master" devices and must have the
      same peerlink as their backup interface and at the same time all of them
      must participate in the bridge device. As Roopa noted it's normal practice
      in routing called fast re-route where a precalculated backup path is used
      when the main one is down.
      Another use case of this is with EVPN, having a single vxlan device which
      is backup of every port. Due to the nature of master devices it's not
      currently possible to use one device as a backup for many and still have
      all of them participate in the bridge (which is master itself).
      More detailed information about MLAG is available at the link below.
      https://docs.cumulusnetworks.com/display/DOCS/Multi-Chassis+Link+Aggregation+-+MLAG
      
      Further explanation and a diagram by Roopa:
      Two switches acting in a MLAG pair are connected by the peerlink
      interface which is a bridge port.
      
      the config on one of the switches looks like the below. The other
      switch also has a similar config.
      eth0 is connected to one port on the server. And the server is
      connected to both switches.
      
      br0 -- team0---eth0
            |
            -- switch-peerlink
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2756f68c
  14. 26 5月, 2018 1 次提交
  15. 19 12月, 2017 1 次提交
    • N
      net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks · 84aeb437
      Nikolay Aleksandrov 提交于
      The early call to br_stp_change_bridge_id in bridge's newlink can cause
      a memory leak if an error occurs during the newlink because the fdb
      entries are not cleaned up if a different lladdr was specified, also
      another minor issue is that it generates fdb notifications with
      ifindex = 0. Another unrelated memory leak is the bridge sysfs entries
      which get added on NETDEV_REGISTER event, but are not cleaned up in the
      newlink error path. To remove this special case the call to
      br_stp_change_bridge_id is done after netdev register and we cleanup the
      bridge on changelink error via br_dev_delete to plug all leaks.
      
      This patch makes netlink bridge destruction on newlink error the same as
      dellink and ioctl del which is necessary since at that point we have a
      fully initialized bridge device.
      
      To reproduce the issue:
      $ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
      RTNETLINK answers: Invalid argument
      
      $ rmmod bridge
      [ 1822.142525] =============================================================================
      [ 1822.143640] BUG bridge_fdb_cache (Tainted: G           O    ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
      [ 1822.144821] -----------------------------------------------------------------------------
      
      [ 1822.145990] Disabling lock debugging due to kernel taint
      [ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
      [ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G    B      O     4.15.0-rc2+ #87
      [ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [ 1822.150008] Call Trace:
      [ 1822.150510]  dump_stack+0x78/0xa9
      [ 1822.151156]  slab_err+0xb1/0xd3
      [ 1822.151834]  ? __kmalloc+0x1bb/0x1ce
      [ 1822.152546]  __kmem_cache_shutdown+0x151/0x28b
      [ 1822.153395]  shutdown_cache+0x13/0x144
      [ 1822.154126]  kmem_cache_destroy+0x1c0/0x1fb
      [ 1822.154669]  SyS_delete_module+0x194/0x244
      [ 1822.155199]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      [ 1822.155773]  entry_SYSCALL_64_fastpath+0x23/0x9a
      [ 1822.156343] RIP: 0033:0x7f929bd38b17
      [ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
      [ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17
      [ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0
      [ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11
      [ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80
      [ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090
      [ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
      [ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128
      
      Fixes: 30313a3d ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
      Fixes: 5b8d5429 ("bridge: netlink: register netdevice before executing changelink")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84aeb437
  16. 14 11月, 2017 1 次提交
  17. 02 11月, 2017 1 次提交
    • N
      net: bridge: add notifications for the bridge dev on vlan change · 92899063
      Nikolay Aleksandrov 提交于
      Currently the bridge device doesn't generate any notifications upon vlan
      modifications on itself because it doesn't use the generic bridge
      notifications.
      With the recent changes we know if anything was modified in the vlan config
      thus we can generate a notification when necessary for the bridge device
      so add support to br_ifinfo_notify() similar to how other combined
      functions are done - if port is present it takes precedence, otherwise
      notify about the bridge. I've explicitly marked the locations where the
      notification should be always for the port by setting bridge to NULL.
      I've also taken the liberty to rearrange each modified function's local
      variables in reverse xmas tree as well.
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      92899063
  18. 01 11月, 2017 1 次提交
  19. 29 10月, 2017 2 次提交
  20. 22 10月, 2017 1 次提交
  21. 09 10月, 2017 1 次提交
  22. 29 9月, 2017 1 次提交
    • N
      net: bridge: add per-port group_fwd_mask with less restrictions · 5af48b59
      Nikolay Aleksandrov 提交于
      We need to be able to transparently forward most link-local frames via
      tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a
      mask which restricts the forwarding of STP and LACP, but we need to be able
      to forward these over tunnels and control that forwarding on a per-port
      basis thus add a new per-port group_fwd_mask option which only disallows
      mac pause frames to be forwarded (they're always dropped anyway).
      The patch does not change the current default situation - all of the others
      are still restricted unless configured for forwarding.
      We have successfully tested this patch with LACP and STP forwarding over
      VxLAN and qinq tunnels.
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5af48b59
  23. 27 6月, 2017 4 次提交
  24. 09 6月, 2017 1 次提交
  25. 07 6月, 2017 1 次提交