1. 17 January 2019, 17 commits
    • Merge branch 'nfp-flower-improve-flower-resilience' · 159882f4
      David S. Miller committed
      Jakub Kicinski says:
      
      ====================
      nfp: flower: improve flower resilience
      
      This series contains mostly changes which improve nfp flower
      offload's resilience, but are too large or risky to push into net.
      
      Fred makes the driver's waits for flower FW responses uninterruptible,
      and a little longer (~40ms).
      
      Pieter adds support for cards with multiple rule memories.
      
      John reworks the MAC offloads.  He says:
      > When potential tunnel end-point MACs are offloaded, they are assigned an
      > index. This index may be associated with a port number meaning that if a
      > packet matches an offloaded MAC address on the card, then the ingress
      > port for that MAC can also be verified. In the case of shared MACs (e.g.
      > on a linux bond) there may be situations where this index maps to only
      > one of the ports that share the MAC.
      >
      > The idea of 'global' MAC indexes is supported; these bypass the check on
      > ingress port on the NFP. The patchset tracks shared MACs and assigns
      > global indexes to these. It also ensures that port based indexes are
      > re-applied if a single port becomes the only user of an offloaded MAC.
      >
      > Other patches in the set aim to tidy code without changing functionality.
      > There is also a delete offload message introduced to ensure that MACs no
      > longer in use in kernel space are removed from the firmware lookup tables.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: flower: enable MAC address sharing for offloadable devs · 20cce886
      John Hurley committed
      A MAC address is not necessarily a unique identifier for a netdev. Drivers
      such as Linux bonds, for example, can apply the same MAC address to the
      upper layer device and all lower layer devices.
      
      NFP MAC offload for tunnel decap includes port verification for reprs but
      also supports the offload of non-repr MAC addresses by assigning 'global'
      indexes to these. This means that the FW will not verify the incoming port
      of a packet matching this destination MAC.
      
      Modify the MAC offload logic to assign global indexes based on MAC address
      instead of net device (as it currently does). Use this to allow multiple
      devices to share the same MAC. In other words, if a repr shares its MAC
      address with another device then give the offloaded MAC a global index
      rather than associate it with an ingress port. Track this so that changes
      can be reverted as MACs stop being shared.
      
      Implement this by removing the current list based assignment of global
      indexes and replacing it with an rhashtable that maps an offloaded MAC
      address to the number of devices sharing it, distributing global indexes
      based on this.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
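
      A minimal userspace sketch of the refcounting idea above (illustrative
      only, not code from the patch; the names and the fixed-size table are
      assumptions standing in for the driver's rhashtable):

      /* Track how many devices share an offloaded MAC and hand out a
       * "global" index once more than one device uses it. */
      #include <stdio.h>
      #include <string.h>

      #define MAX_MACS     16
      #define GLOBAL_INDEX 0xffff   /* placeholder: FW skips the ingress-port check */

      struct mac_entry {
          unsigned char addr[6];
          int ref;                  /* number of devices sharing this MAC */
      };

      static struct mac_entry table[MAX_MACS];

      static struct mac_entry *lookup(const unsigned char *addr)
      {
          for (int i = 0; i < MAX_MACS; i++)
              if (table[i].ref && !memcmp(table[i].addr, addr, 6))
                  return &table[i];
          return NULL;
      }

      /* Return the index to offload: port based while unique, global when shared. */
      static int offload_mac(const unsigned char *addr, int port_index)
      {
          struct mac_entry *e = lookup(addr);

          if (!e) {
              for (int i = 0; i < MAX_MACS; i++) {
                  if (!table[i].ref) {
                      e = &table[i];
                      memcpy(e->addr, addr, 6);
                      break;
                  }
              }
          }
          if (!e)
              return -1;            /* table full */
          e->ref++;
          return e->ref > 1 ? GLOBAL_INDEX : port_index;
      }

      int main(void)
      {
          unsigned char mac[6] = { 0x02, 0, 0, 0, 0, 1 };

          printf("first user  -> index %#x\n", offload_mac(mac, 4));  /* port index */
          printf("second user -> index %#x\n", offload_mac(mac, 7));  /* global index */
          return 0;
      }

      The real driver additionally rewrites an already-offloaded port-based
      entry as global when a second user appears, and reverts it once the MAC
      stops being shared, as the commit message describes.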
    • nfp: flower: ensure MAC cleanup on address change · 13cf7103
      John Hurley committed
      It is possible to receive a MAC address change notification without the
      net device being down (e.g. when an OvS bridge is assigned the same MAC as
      a port added to it). This means that an offloaded MAC address may not be
      removed if its device gets a new address.
      
      Maintain a record of the offloaded MAC addresses for each repr and netdev
      assigned a MAC offload index. Use this to delete the (now expired) MAC if
      a change of address event occurs. Only handle change address events if the
      device is already up - if not then the netdev up event will handle it.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
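
      A toy sketch of the bookkeeping above (hypothetical types and helpers;
      the driver keeps this state in repr/non-repr private data):

      #include <stdbool.h>
      #include <stdio.h>
      #include <string.h>

      struct tracked_dev {
          bool up;
          bool mac_offloaded;
          unsigned char offloaded_mac[6];   /* last address pushed to FW */
      };

      static void fw_mac_del(const unsigned char *mac) { printf("FW: del %02x:..\n", mac[0]); }
      static void fw_mac_add(const unsigned char *mac) { printf("FW: add %02x:..\n", mac[0]); }

      static void handle_changeaddr(struct tracked_dev *dev, const unsigned char *new_mac)
      {
          if (!dev->up)
              return;                       /* the UP handler will offload it later */
          if (dev->mac_offloaded)
              fw_mac_del(dev->offloaded_mac);   /* drop the now-stale address */
          fw_mac_add(new_mac);
          memcpy(dev->offloaded_mac, new_mac, 6);
          dev->mac_offloaded = true;
      }

      int main(void)
      {
          struct tracked_dev dev = { .up = true };
          unsigned char a[6] = { 0xaa }, b[6] = { 0xbb };

          handle_changeaddr(&dev, a);
          handle_changeaddr(&dev, b);       /* deletes aa:.. before adding bb:.. */
          return 0;
      }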
    • nfp: flower: add infrastructure for non-repr priv data · 05d2bee6
      John Hurley committed
      NFP repr netdevs contain private data that can store per port information.
      In certain cases, the NFP driver offloads information from non-repr ports
      (e.g. tunnel ports). As the driver does not have control over non-repr
      netdevs, it cannot add/track private data directly to the netdev struct.
      
      Add infrastructure to store private information on any non-repr netdev that
      is offloaded at a given time. This is used in a following patch to track
      offloaded MAC addresses for non-reprs and enable correct housekeeping on
      address changes.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
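
      A simplified sketch of such a lookup structure keyed by the netdev
      pointer (hypothetical names and a flat array; the driver uses proper
      kernel data structures and locking):

      #include <string.h>

      struct non_repr_priv {
          const void *netdev;           /* key: pointer to the foreign netdev */
          int ref;                      /* offload users referencing this entry */
          unsigned char offloaded_mac[6];
      };

      #define MAX_PRIV 32
      static struct non_repr_priv priv_tab[MAX_PRIV];

      static struct non_repr_priv *non_repr_priv_get(const void *netdev)
      {
          struct non_repr_priv *free_slot = NULL;

          for (int i = 0; i < MAX_PRIV; i++) {
              if (priv_tab[i].ref && priv_tab[i].netdev == netdev) {
                  priv_tab[i].ref++;
                  return &priv_tab[i];
              }
              if (!priv_tab[i].ref && !free_slot)
                  free_slot = &priv_tab[i];
          }
          if (!free_slot)
              return NULL;
          memset(free_slot, 0, sizeof(*free_slot));
          free_slot->netdev = netdev;
          free_slot->ref = 1;
          return free_slot;
      }

      static void non_repr_priv_put(struct non_repr_priv *priv)
      {
          if (priv && --priv->ref == 0)
              priv->netdev = NULL;          /* slot is free again once unused */
      }

      int main(void)
      {
          int dummy_netdev;

          non_repr_priv_put(non_repr_priv_get(&dummy_netdev));
          return 0;
      }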
    • nfp: flower: ensure deletion of old offloaded MACs · 49402b0b
      John Hurley committed
      When a potential tunnel end point goes down then its MAC address should
      not be matchable on the NFP.
      
      Implement a delete message for offloaded MACs and call this on net device
      down. While at it, remove the actions on register and unregister netdev
      events. A MAC should only be offloaded if the device is up. Note that the
      netdev notifier will replay any notifications for UP devices on
      registration so NFP can still offload ports that exist before the driver
      is loaded. Similarly, devices need to go down before they can be
      unregistered so removal of offloaded MACs is only required on down events.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
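
      The event policy can be pictured with a small sketch (hypothetical
      helpers; the real driver reacts to netdev notifier events):

      #include <stdio.h>

      enum netdev_event { EV_REGISTER, EV_UP, EV_DOWN, EV_UNREGISTER };

      static void fw_mac_offload_add(const char *name) { printf("offload MAC of %s\n", name); }
      static void fw_mac_offload_del(const char *name) { printf("delete MAC of %s\n", name); }

      static void handle_event(enum netdev_event ev, const char *name)
      {
          switch (ev) {
          case EV_UP:
              fw_mac_offload_add(name);
              break;
          case EV_DOWN:
              fw_mac_offload_del(name);
              break;
          case EV_REGISTER:
          case EV_UNREGISTER:
              /* nothing to do: UP notifications are replayed for existing
               * devices, and a device must go down before unregistering */
              break;
          }
      }

      int main(void)
      {
          handle_event(EV_UP, "vxlan0");
          handle_event(EV_DOWN, "vxlan0");
          return 0;
      }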
    • nfp: flower: remove list infrastructure from MAC offload · 0115dcc3
      John Hurley committed
      Potential MAC destination addresses for tunnel end-points are offloaded to
      firmware. This was done by building a list of such MACs and writing to
      firmware as blocks of addresses.
      
      Simplify this code by removing the list format and sending a new message
      for each offloaded MAC.
      
      This is in preparation for delete MAC messages. There will be one delete
      flag per message so we cannot assume that this applies to all addresses
      in a list.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
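
      A hypothetical message layout illustrating why per-MAC messages make a
      delete flag workable (the struct fields and flag bit are assumptions,
      not the real control-message format):

      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      #define MAC_MSG_FLAG_DELETE 0x1       /* assumed flag bit */

      struct mac_offload_msg {
          uint8_t  mac[6];
          uint16_t index;                   /* port-based or global index */
          uint16_t flags;
      };

      static void send_mac_msg(const uint8_t *mac, uint16_t index, int del)
      {
          struct mac_offload_msg msg = {
              .index = index,
              .flags = del ? MAC_MSG_FLAG_DELETE : 0,
          };

          memcpy(msg.mac, mac, sizeof(msg.mac));
          printf("cmsg: %s index %u\n", del ? "del" : "add", (unsigned)msg.index);
          /* ...queue msg on the control channel... */
      }

      int main(void)
      {
          uint8_t mac[6] = { 0x02, 0, 0, 0, 0, 1 };

          send_mac_msg(mac, 4, 0);
          send_mac_msg(mac, 4, 1);
          return 0;
      }

      With a block of addresses per message, a single flag could not say which
      addresses in the block were additions and which were deletions.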
    • nfp: flower: ignore offload of VF and PF repr MAC addresses · 41da0b5e
      John Hurley committed
      Currently MAC addresses of all repr netdevs, along with selected non-NFP
      controlled netdevs, are offloaded to FW as potential tunnel end-points.
      However, the addresses of VF and PF reprs are meaningless outside of
      internal communication; only those of physical port reprs are required.
      
      Modify the MAC address offload selection code to ignore VF/PF repr devs.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
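
      A small sketch of the selection rule (the enum values are illustrative,
      not the driver's actual port types):

      #include <stdbool.h>

      enum port_type { PORT_PHYS_REPR, PORT_PF_REPR, PORT_VF_REPR, PORT_NON_REPR };

      static bool mac_offload_wanted(enum port_type type)
      {
          switch (type) {
          case PORT_PHYS_REPR:
          case PORT_NON_REPR:
              return true;                  /* reachable as tunnel end-points */
          case PORT_PF_REPR:
          case PORT_VF_REPR:
          default:
              return false;                 /* only used for internal communication */
          }
      }

      int main(void)
      {
          return mac_offload_wanted(PORT_VF_REPR) ? 1 : 0;
      }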
    • nfp: flower: tidy tunnel related private data · f3b97577
      John Hurley committed
      Recent additions to the flower app private data have grouped the variables
      of a given feature into a struct and added that struct to the main private
      data struct.
      
      In keeping with this, move all tunnel related private data to their own
      struct. This has no effect on functionality but improves readability and
      maintenance of the code.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
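
      A before/after shape of this kind of refactor, with made-up field names
      (the actual members differ):

      /* before: tunnel state scattered through the app private struct
       *     struct app_priv { int stats_ids; int tun_neigh_count; void *tun_ipv4_off; ... };
       * after: grouped into its own struct, like other features */
      struct tun_priv {
          int   neigh_off_count;
          void *ipv4_off_list;
          void *mac_off_list;
      };

      struct app_priv {
          int stats_ids;                    /* unrelated feature state */
          struct tun_priv tun;              /* all tunnel state in one place */
      };

      int main(void)
      {
          struct app_priv priv = { .tun = { .neigh_off_count = 0 } };

          return priv.stats_ids;
      }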
    • nfp: flower: support multiple memory units for filter offloads · 467322e2
      Pieter Jansen van Vuuren committed
      Adds support for multiple memory units which are used for filter
      offloads. Each filter is assigned a stats id, the MSBs of the id are
      used to determine which memory unit the filter should be offloaded
      to. The number of available memory units that could be used for filter
      offload is obtained from HW. A simple round robin technique is used to
      allocate and distribute the ids across memory units.
      Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
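
      A sketch of the id layout and round-robin distribution described above
      (the bit split and helper are assumptions, not the driver's constants):

      #include <stdint.h>
      #include <stdio.h>

      #define MEM_UNIT_SHIFT 28             /* assume the unit number lives in the MSBs */

      static uint32_t next_id;
      static uint32_t num_mem_units = 4;    /* in the driver this is read from HW */

      static uint32_t alloc_stats_id(void)
      {
          uint32_t unit = next_id % num_mem_units;   /* round robin over units */
          uint32_t idx  = next_id / num_mem_units;   /* index within that unit */

          next_id++;
          return (unit << MEM_UNIT_SHIFT) | idx;
      }

      int main(void)
      {
          for (int i = 0; i < 6; i++)
              printf("stats id %#010x\n", (unsigned)alloc_stats_id());
          return 0;
      }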
    • nfp: flower: increase cmesg reply timeout · 96439889
      Fred Lotter committed
      QA tests report occasional timeouts on REIFY message replies. Profiling
      of the two cmesg reply types under burst conditions, with a 12-core host
      under heavy cpu and io load (stress --cpu 12 --io 12), shows both PHY MTU
      change and REIFY replies can exceed the 10ms timeout. The maximum MTU
      reply wait under burst is 16ms, while the maximum REIFY wait under a 40 VF
      burst is 12ms. Using a 4 VF REIFY burst results in an 8ms maximum wait.
      A larger VF burst does increase the delay, but not in a linear enough
      way to justify a scaled REIFY delay. The worst-case values for MTU and
      REIFY appear close enough to justify a common timeout. Pick a
      conservative 40ms to make a safer, future-proof common reply timeout. The
      delay only affects the failure case.
      
      Change the REIFY timeout mechanism to use wait_event_timeout() instead
      of wait_event_interruptible_timeout(), to match the MTU code. In the
      current implementation, theoretically, a signal could interrupt the
      REIFY waiting period, with a return code of ERESTARTSYS. However, this is
      caught under the general timeout error code EIO. I cannot see the benefit
      of exposing the REIFY waiting period to signals with such a short delay
      (40ms), while the MTU mechanism does not use the same logic. In the absence
      of any reply (wakeup() call), both reply types will wake up the task after
      the timeout period. The REIFY timeout applies to the entire representor
      group being instantiated (e.g. VFs), while the MTU timeout applies to a
      single PHY MTU change.
      Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
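
      A kernel-style fragment showing the shape of such a change (the wait
      queue, flag and function are hypothetical; wait_event_timeout() and
      msecs_to_jiffies() are the real kernel helpers). It is a sketch, not
      the driver's code:

      #include <linux/wait.h>
      #include <linux/jiffies.h>
      #include <linux/errno.h>
      #include <linux/types.h>

      #define REPLY_TIMEOUT_MS 40

      static DECLARE_WAIT_QUEUE_HEAD(reply_wq);
      static bool reply_received;

      static int wait_for_reply(void)
      {
          long ret;

          /* uninterruptible: signals no longer cut the (short) wait; returns
           * 0 on timeout, remaining jiffies otherwise */
          ret = wait_event_timeout(reply_wq, reply_received,
                                   msecs_to_jiffies(REPLY_TIMEOUT_MS));
          return ret ? 0 : -EIO;
      }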
    • net: sungem: fix indentation, remove a tab · bdbe8cc1
      Colin Ian King committed
      The declaration of variable 'found' is one level too deep; fix this by
      removing a tab.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers: net: atp: fix various indentation issues · eedfb223
      Colin Ian King committed
      There are various lines that have indentation issues, fix these.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: fix various indentation issues · 9fb0969f
      Colin Ian King committed
      There are lines that have indentation issues, fix these.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • networking: Documentation: fix snmp_counters.rst Sphinx warnings · ae5220c6
      Randy Dunlap committed
      Fix over 100 documentation warnings in snmp_counter.rst by
      extending the underline string lengths and inserting a blank line
      after bullet items.
      
      Examples:
      
      Documentation/networking/snmp_counter.rst:1: WARNING: Title overline too short.
      Documentation/networking/snmp_counter.rst:14: WARNING: Bullet list ends without a blank line; unexpected unindent.
      
      Fixes: 2b965472 ("add document for TCP OFO, PAWS and skip ACK counters")
      Fixes: 8e2ea53a ("add snmp counters document")
      Fixes: 712ee16c ("add documents for snmp counters")
      Fixes: 80cc4950 ("net: Add part of TCP counts explanations in snmp_counters.rst")
      Fixes: b08794a9 ("documentation of some IP/ICMP snmp counters")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: yupeng <yupeng0921@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net, decnet: use struct_size() in kzalloc() · bb3e16ad
      Gustavo A. R. Silva committed
      One of the more common cases of allocation size calculations is finding the
      size of a structure that has a zero-sized array at the end, along with memory
      for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          struct boo entry[];
      };
      
      instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can now
      use the new struct_size() helper:
      
      instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This code was detected with the help of Coccinelle.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_nve: Use struct_size() in kzalloc() · faa311e9
      Gustavo A. R. Silva committed
      One of the more common cases of allocation size calculations is finding
      the size of a structure that has a zero-sized array at the end, along
      with memory for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          struct boo entry[];
      };
      
      instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can
      now use the new struct_size() helper:
      
      instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This issue was detected with the help of Coccinelle.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_acl_bloom_filter: use struct_size() in kzalloc() · 2285ec87
      Gustavo A. R. Silva committed
      One of the more common cases of allocation size calculations is finding
      the size of a structure that has a zero-sized array at the end, along
      with memory for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          void *entry[];
      };
      
      instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can
      now use the new struct_size() helper:
      
      instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This issue was detected with the help of Coccinelle.
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 16 January 2019, 21 commits
  3. 15 January 2019, 2 commits
    • sbitmap: Protect swap_lock from hardirq · fe76fc6a
      Ming Lei committed
      Because we may call blk_mq_get_driver_tag() directly from
      blk_mq_dispatch_rq_list() without holding any lock, a hardirq may come in
      and trigger the deadlock described above.
      
      Commit ab53dcfb3e7b ("sbitmap: Protect swap_lock from hardirq") tries to
      fix this issue by using 'spin_lock_bh', which isn't enough because we
      complete requests directly from hardirq context in the multiqueue case.
      
      Cc: Clark Williams <williams@redhat.com>
      Fixes: ab53dcfb3e7b ("sbitmap: Protect swap_lock from hardirq")
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sbitmap: Protect swap_lock from softirqs · 37198768
      Steven Rostedt (VMware) committed
      The swap_lock used by sbitmap is part of a lock chain that includes locks
      taken from softirq context, but swap_lock itself is not protected from
      being preempted by softirqs.
      
      A chain exists of:
      
       sbq->ws[i].wait -> dispatch_wait_lock -> swap_lock
      
      Where the sbq->ws[i].wait lock can be taken from softirq context, which
      means all locks below it in the chain must also be protected from
      softirqs.
      Reported-by: Clark Williams <williams@redhat.com>
      Fixes: 58ab5e32 ("sbitmap: silence bogus lockdep IRQ warning")
      Fixes: ea86ea2c ("sbitmap: amortize cost of clearing bits")
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
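
      A kernel-style fragment contrasting the two levels of protection (the
      lock and data here are placeholders, not sbitmap's fields): spin_lock_bh()
      keeps softirqs away, which is enough for the chain above, while
      spin_lock_irqsave() is needed once the lock can also be taken from
      hardirq context, as the hardirq fix explains:

      #include <linux/spinlock.h>

      static DEFINE_SPINLOCK(swap_lock_example);
      static unsigned long word, cleared;

      static void swap_bh_only(void)        /* enough for softirq-only users */
      {
          spin_lock_bh(&swap_lock_example);
          word |= cleared;
          cleared = 0;
          spin_unlock_bh(&swap_lock_example);
      }

      static void swap_hardirq_safe(void)   /* required once hardirq can take the lock */
      {
          unsigned long flags;

          spin_lock_irqsave(&swap_lock_example, flags);
          word |= cleared;
          cleared = 0;
          spin_unlock_irqrestore(&swap_lock_example, flags);
      }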