1. 09 2月, 2019 1 次提交
    • J
      devlink: publish params only after driver init is done · 7c62cfb8
      Jiri Pirko 提交于
      Currently, user can do dump or get of param values right after the
      devlink params are registered. However the driver may not be initialized
      which is an issue. The same problem happens during notification
      upon param registration. Allow driver to publish devlink params
      whenever it is ready to handle get() ops. Note that this cannot
      be resolved by init reordering, as the "driverinit" params have
      to be available before the driver is initialized (it needs the param
      values there).
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c62cfb8
  2. 08 2月, 2019 3 次提交
    • E
      devlink: Add health report functionality · c8e1da0b
      Eran Ben Elisha 提交于
      Upon error discover, every driver can report it to the devlink health
      mechanism via devlink_health_report function, using the appropriate
      reporter registered to it. Driver can pass error specific context which
      will be delivered to it as part of the dump / recovery callbacks.
      
      Once an error is reported, devlink health will do the following actions:
      * A log is being send to the kernel trace events buffer
      * Health status and statistics are being updated for the reporter instance
      * Object dump is being taken and stored at the reporter instance (as long
        as there is no other dump which is already stored)
      * Auto recovery attempt is being done. Depends on:
        - Auto Recovery configuration
        - Grace period vs. Time since last recover
      Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: NMoshe Shemesh <moshe@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8e1da0b
    • E
      devlink: Add health reporter create/destroy functionality · a0bdcc59
      Eran Ben Elisha 提交于
      Devlink health reporter is an instance for reporting, diagnosing and
      recovering from run time errors discovered by the reporters.
      Define it's data structure and supported operations.
      In addition, expose devlink API to create and destroy a reporter.
      Each devlink instance will hold it's own reporters list.
      
      As part of the allocation, driver shall provide a set of callbacks which
      will be used by devlink in order to handle health reports and user
      commands related to this reporter. In addition, driver is entitled to
      provide some priv pointer, which can be fetched from the reporter by
      devlink_health_reporter_priv function.
      
      For each reporter, devlink will hold a metadata of statistics,
      dump msg and status.
      
      For passing dumps and diagnose data to the user-space, it will use devlink
      fmsg API.
      Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: NMoshe Shemesh <moshe@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a0bdcc59
    • E
      devlink: Add devlink formatted message (fmsg) API · 1db64e87
      Eran Ben Elisha 提交于
      Devlink fmsg is a mechanism to pass descriptors between drivers and
      devlink, in json-like format. The API allows the driver to add nested
      attributes such as object, object pair and value array, in addition to
      attributes such as name and value.
      
      Driver can use this API to fill the fmsg context in a format which will be
      translated by the devlink to the netlink message later.
      There is no memory allocation in advance (other than the initial list
      head), and it dynamically allocates messages descriptors and add them to
      the list on the fly.
      
      When it needs to send the data using SKBs to the netlink layer, it
      fragments the data between different SKBs. In order to do this
      fragmentation, it uses virtual nests attributes, to avoid actual
      nesting use which cannot be divided between different SKBs.
      Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: NMoshe Shemesh <moshe@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1db64e87
  3. 07 2月, 2019 7 次提交
  4. 05 2月, 2019 1 次提交
    • P
      netfilter: nf_tables: unbind set in rule from commit path · f6ac8585
      Pablo Neira Ayuso 提交于
      Anonymous sets that are bound to rules from the same transaction trigger
      a kernel splat from the abort path due to double set list removal and
      double free.
      
      This patch updates the logic to search for the transaction that is
      responsible for creating the set and disable the set list removal and
      release, given the rule is now responsible for this. Lookup is reverse
      since the transaction that adds the set is likely to be at the tail of
      the list.
      
      Moreover, this patch adds the unbind step to deliver the event from the
      commit path.  This should not be done from the worker thread, since we
      have no guarantees of in-order delivery to the listener.
      
      This patch removes the assumption that both activate and deactivate
      callbacks need to be provided.
      
      Fixes: cd5125d8 ("netfilter: nf_tables: split set destruction in deactivate and destroy phase")
      Reported-by: NMikhail Morfikov <mmorfikov@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      f6ac8585
  5. 04 2月, 2019 2 次提交
  6. 02 2月, 2019 8 次提交
  7. 01 2月, 2019 5 次提交
  8. 31 1月, 2019 1 次提交
    • D
      ipvlan, l3mdev: fix broken l3s mode wrt local routes · d5256083
      Daniel Borkmann 提交于
      While implementing ipvlan l3 and l3s mode for kubernetes CNI plugin,
      I ran into the issue that while l3 mode is working fine, l3s mode
      does not have any connectivity to kube-apiserver and hence all pods
      end up in Error state as well. The ipvlan master device sits on
      top of a bond device and hostns traffic to kube-apiserver (also running
      in hostns) is DNATed from 10.152.183.1:443 to 139.178.29.207:37573
      where the latter is the address of the bond0. While in l3 mode, a
      curl to https://10.152.183.1:443 or to https://139.178.29.207:37573
      works fine from hostns, neither of them do in case of l3s. In the
      latter only a curl to https://127.0.0.1:37573 appeared to work where
      for local addresses of bond0 I saw kernel suddenly starting to emit
      ARP requests to query HW address of bond0 which remained unanswered
      and neighbor entries in INCOMPLETE state. These ARP requests only
      happen while in l3s.
      
      Debugging this further, I found the issue is that l3s mode is piggy-
      backing on l3 master device, and in this case local routes are using
      l3mdev_master_dev_rcu(dev) instead of net->loopback_dev as per commit
      f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev
      if relevant") and 5f02ce24 ("net: l3mdev: Allow the l3mdev to be
      a loopback"). I found that reverting them back into using the
      net->loopback_dev fixed ipvlan l3s connectivity and got everything
      working for the CNI.
      
      Now judging from 4fbae7d8 ("ipvlan: Introduce l3s mode") and the
      l3mdev paper in [0] the only sole reason why ipvlan l3s is relying
      on l3 master device is to get the l3mdev_ip_rcv() receive hook for
      setting the dst entry of the input route without adding its own
      ipvlan specific hacks into the receive path, however, any l3 domain
      semantics beyond just that are breaking l3s operation. Note that
      ipvlan also has the ability to dynamically switch its internal
      operation from l3 to l3s for all ports via ipvlan_set_port_mode()
      at runtime. In any case, l3 vs l3s soley distinguishes itself by
      'de-confusing' netfilter through switching skb->dev to ipvlan slave
      device late in NF_INET_LOCAL_IN before handing the skb to L4.
      
      Minimal fix taken here is to add a IFF_L3MDEV_RX_HANDLER flag which,
      if set from ipvlan setup, gets us only the wanted l3mdev_l3_rcv() hook
      without any additional l3mdev semantics on top. This should also have
      minimal impact since dev->priv_flags is already hot in cache. With
      this set, l3s mode is working fine and I also get things like
      masquerading pod traffic on the ipvlan master properly working.
      
        [0] https://netdevconf.org/1.2/papers/ahern-what-is-l3mdev-paper.pdf
      
      Fixes: f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
      Fixes: 5f02ce24 ("net: l3mdev: Allow the l3mdev to be a loopback")
      Fixes: 4fbae7d8 ("ipvlan: Introduce l3s mode")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: David Ahern <dsa@cumulusnetworks.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Martynas Pumputis <m@lambda.lt>
      Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d5256083
  9. 30 1月, 2019 7 次提交
  10. 29 1月, 2019 1 次提交
    • D
      net: tls: Save iv in tls_rec for async crypto requests · 32eb67b9
      Dave Watson 提交于
      aead_request_set_crypt takes an iv pointer, and we change the iv
      soon after setting it.  Some async crypto algorithms don't save the iv,
      so we need to save it in the tls_rec for async requests.
      
      Found by hardcoding x64 aesni to use async crypto manager (to test the async
      codepath), however I don't think this combination can happen in the wild.
      Presumably other hardware offloads will need this fix, but there have been
      no user reports.
      
      Fixes: a42055e8 ("Add support for async encryption of records...")
      Signed-off-by: NDave Watson <davejwatson@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32eb67b9
  11. 28 1月, 2019 3 次提交
  12. 27 1月, 2019 1 次提交