1. 07 Apr, 2016 7 commits
    • mlxsw: reg: Add QoS ETS Element Configuration register · b9b7cee4
      Committed by Ido Schimmel
      We are going to introduce support for DCB, so we need to be able to
      configure the traffic selection algorithm (TSA) used by each traffic
      class (TC), as well as the bandwidth percentage allocated to each TC in
      case of ETS.
      
      Add the QoS ETS Element Configuration register, which controls the
      above parameters.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Set port's shared buffer size to 0 · d6b7c13b
      Committed by Ido Schimmel
      In addition to the priority group (PG) buffers in the headroom, the
      device enables the allocation of a headroom shared buffer, which can
      be shared between different PGs.
      
      However, we are not going to use the headroom shared buffer and instead
      allow the user to use its size for PGs or the switch's shared buffer.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: reg: Use correct PBMC register length · 7ad7cd61
      Committed by Ido Schimmel
      The last field of the PBMC register is at offset 0x64 and its size is
      0x8, so the correct register's length is 0x6C bytes.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
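The length arithmetic in the PBMC commit can be checked at compile time. The following is a minimal sketch; the macro and function names are hypothetical, and only the two numeric values (offset 0x64, size 0x8) come from the commit message.

```c
/* Hypothetical names for illustration; only the numeric values come from
 * the commit message. */
#define PBMC_LAST_FIELD_OFFSET 0x64 /* offset of the register's last field */
#define PBMC_LAST_FIELD_SIZE   0x08 /* size of that field, in bytes */
#define PBMC_LEN (PBMC_LAST_FIELD_OFFSET + PBMC_LAST_FIELD_SIZE)

/* The register must extend to the end of its last field. */
_Static_assert(PBMC_LEN == 0x6C, "PBMC register length should be 0x6C bytes");

/* Runtime accessor so the value can also be inspected by callers. */
static unsigned int pbmc_len(void)
{
	return PBMC_LEN;
}
```

A length shorter than this would cut off part of the last field, which is the kind of mistake the commit corrects.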
    • mlxsw: spectrum: Correctly configure headroom size · ff6551ec
      Committed by Ido Schimmel
      When packets ingress the switch they are assigned a switch priority and
      directed to the corresponding priority group (PG) buffer in the port's
      headroom buffer.
      
      Since we now map all switch priorities to priority group 0 (PG0) by
      default, there is no need to allocate the other priority groups during
      initialization. The only exception is PG9, which is used for control
      traffic.
      
      At minimum, the PG should be able to store the currently classified
      packet (pipeline latency isn't 0) and also the packets arriving during
      the classification time. However, an incoming packet will not be
      buffered if there is no available MTU-sized buffer space for storing it.
      
      The buffer needed to accommodate pipeline latency is variable and
      needs to take into account both the current link speed and current
      latency of the pipeline, which is time-dependent. Testing showed that
      setting the PG's size to twice the current MTU is optimal.
      
      Since PG9 is used strictly for control packets and is not subject to flow
      control, we are not going to resize it according to user configuration,
      so we simply set it according to the worst-case scenario, which is twice
      the maximum MTU.
      
      In any case, later patches in the series will allow a user to direct
      lossless flows to PGs other than PG0 and set their size to accommodate
      round-trip propagation delay.
      
      The above change also requires us to resize the PG buffer whenever the
      port's MTU is changed.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
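The sizing rule described in the headroom commit (twice the current MTU for a data PG, twice the maximum MTU for PG9) could be sketched as below. This is an illustration under stated assumptions, not the driver's code: the function names are hypothetical and the `MAX_MTU` value is an assumed placeholder, since the commit message does not give the maximum MTU.

```c
#include <stdint.h>

/* Assumed placeholder; the commit message does not state the maximum MTU. */
#define MAX_MTU 10000u /* bytes */

/* Size of a data PG such as PG0: twice the port's current MTU.
 * It has to be recomputed whenever the MTU changes. */
static uint32_t pg_size_bytes(uint32_t mtu)
{
	return 2 * mtu;
}

/* Size of PG9 (control traffic): fixed at twice the maximum MTU,
 * independent of the port's current MTU. */
static uint32_t pg9_size_bytes(void)
{
	return 2 * MAX_MTU;
}
```

With these assumptions, a port with a 1500-byte MTU would get a 3000-byte PG0, while PG9 stays at the fixed worst-case size.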
    • mlxsw: spectrum: Add bytes to cells helper · 1a198449
      Committed by Ido Schimmel
      Buffers in the switch store packets in units called buffer cells. Add a
      helper to convert from bytes to cells, so that the actual number of
      cells required (the result is rounded up) is returned.
      
      Also, drop the SB (shared buffer) acronym from the BYTES_PER_CELL macro,
      as this unit is also used in the ports' buffers and not only the
      switch's shared buffer.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
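A bytes-to-cells conversion that rounds up, as the helper commit describes, might look like the sketch below. The cell size of 96 bytes is an assumption for illustration (the commit message does not state it), and the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed cell size for illustration; not stated in the commit message. */
#define BYTES_PER_CELL 96u

/* Convert a byte count to buffer cells, rounding up so that a partially
 * filled cell still counts as a whole cell. */
static uint32_t bytes_to_cells(uint32_t bytes)
{
	return (bytes + BYTES_PER_CELL - 1) / BYTES_PER_CELL;
}
```

Rounding up matters because a buffer sized by truncating division could be one cell too small to hold the requested number of bytes.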
    • mlxsw: spectrum: Map all switch priorities to priority group 0 · dd6cb0f9
      Committed by Ido Schimmel
      During transmission, the skb's priority is used to map the skb to a
      traffic class, where the idea is to group priorities with similar
      characteristics (e.g. lossy, lossless) to the same traffic class. By
      default, all priorities are mapped to traffic class 0.
      
      In the device, we model the skb's priority as the switch priority, which
      is assigned to a packet according to its PCP value and ingress port
      (untagged packets are assigned the port's default switch priority - 0).
      
      At ingress, the packet is directed to a priority group (PG) buffer in
      the port's headroom buffer according to the packet's switch priority and
      switch priority to buffer mapping.
      
      While it's possible to configure the egress mapping between skb's
      priority (switch priority) and traffic class, there is no mechanism to
      configure the ingress mapping to a PG.
      
      In order to keep things simple and since grouping certain priorities into
      a traffic class at egress also implies they should be grouped the same
      at ingress, treat a PG as the ingress counterpart of an egress traffic
      class.
      
      Having established the above, during initialization map all the switch
      priorities to PG0 in accordance with the Linux defaults for traffic
      class mapping.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
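The default mapping described in this commit, where every switch priority is directed to PG0, can be pictured as a small lookup table. The sketch below is illustrative only: the struct, the function names, and the count of eight switch priorities are assumptions, not taken from the commit message.

```c
#include <stdint.h>

/* Assumed number of switch priorities for this sketch. */
#define NUM_SWITCH_PRIOS 8

/* Hypothetical per-port table: index is the switch priority, value is
 * the priority group (PG) buffer that packets of that priority use. */
struct prio_to_pg_map {
	uint8_t pg[NUM_SWITCH_PRIOS];
};

/* Initialization default: every switch priority lands in PG0. */
static void prio_to_pg_map_init(struct prio_to_pg_map *map)
{
	for (int prio = 0; prio < NUM_SWITCH_PRIOS; prio++)
		map->pg[prio] = 0;
}

/* Look up the PG for a switch priority; prio must be below
 * NUM_SWITCH_PRIOS. */
static uint8_t prio_to_pg(const struct prio_to_pg_map *map, uint8_t prio)
{
	return map->pg[prio];
}
```

This mirrors the egress default mentioned in the message, where all priorities map to traffic class 0.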
    • mlxsw: reg: Add Port Prio To Buffer register · b98ff151
      Committed by Ido Schimmel
      When packets ingress the switch they are assigned a switch priority
      number that dictates the packet's priority group (PG) buffer in the
      port's headroom buffer.
      
      Add the Port Prio To Buffer (PPTB) register, which configures the switch
      priority to PG mapping.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 06 Apr, 2016 2 commits
  3. 22 Mar, 2016 5 commits
  4. 21 Mar, 2016 1 commit
  5. 19 Mar, 2016 1 commit
  6. 18 Mar, 2016 1 commit
    • mm: introduce page reference manipulation functions · fe896d18
      Committed by Joonsoo Kim
      The success of CMA allocation largely depends on the success of
      migration, and a key factor in that is the page reference count.  Until
      now, the page reference has been manipulated by calling atomic functions
      directly, so we cannot track who manipulates it and where.  That makes it
      hard to find the actual reason for a CMA allocation failure.  CMA
      allocation should be guaranteed to succeed, so finding the offending
      place is really important.
      
      In this patch, call sites where the page reference is manipulated are
      converted to the newly introduced wrapper functions.  This is a
      preparation step for adding a tracepoint to each page reference
      manipulation function.  With this facility, we can easily find the reason
      for a CMA allocation failure.  There is no functional change in this
      patch.
      
      In addition, this patch also converts reference read sites.  This will
      help a second step that renames page._count to something else and
      prevents later attempts to access it directly (suggested by Andrew).
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
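The idea of routing every reference count access through wrapper functions can be illustrated in user space with C11 atomics. This is a simplified sketch, not the kernel's implementation: `struct page` here is a one-field stand-in, and the function names are chosen for illustration.

```c
#include <stdatomic.h>

/* Simplified stand-in for the kernel's struct page; illustration only. */
struct page {
	atomic_int _count;
};

/* Read the reference count through a wrapper instead of touching
 * the field directly. */
static inline int page_ref_count(struct page *page)
{
	return atomic_load(&page->_count);
}

/* Take a reference. Because every caller goes through this function,
 * a tracepoint could later be added here in a single place. */
static inline void page_ref_inc(struct page *page)
{
	atomic_fetch_add(&page->_count, 1);
}

/* Drop a reference; returns nonzero when the count reaches zero. */
static inline int page_ref_dec_and_test(struct page *page)
{
	return atomic_fetch_sub(&page->_count, 1) == 1;
}
```

Once no call site accesses `_count` directly, the field can also be renamed without touching its users, which is the second step the message mentions.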
  7. 15 Mar, 2016 1 commit
    • mlx4: add missing braces in verify_qp_parameters · baefd701
      Committed by Arnd Bergmann
      The implementation of QP paravirtualization back in linux-3.7 included
      some code that looks very dubious, and gcc-6 has grown smart enough
      to warn about it:
      
      drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'verify_qp_parameters':
      drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3154:5: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]
           if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
           ^~
      drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3144:4: note: ...this 'if' clause, but it is not
          if (slave != mlx4_master_func_num(dev))
      
      From looking at the context, I'm reasonably sure that the indentation
      is correct but that it should have contained curly braces from the
      start, as the update_gid() function in the same patch correctly does.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 54679e14 ("mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 14 Mar, 2016 3 commits
  9. 12 Mar, 2016 1 commit
  10. 11 Mar, 2016 6 commits
  11. 10 Mar, 2016 2 commits
  12. 08 Mar, 2016 2 commits
    • mlxsw: pci: Correctly determine if descriptor queue is full · 5091730d
      Committed by Ido Schimmel
      The descriptor queues for sending (SDQs) and receiving (RDQs) packets
      are managed by two counters - producer and consumer - which are both
      16-bit in size. A queue is considered full when the difference between
      the two equals the queue's maximum number of descriptors.
      
      However, if the producer counter overflows, then it's possible for the
      full queue check to fail, as it doesn't take the overflow into account.
      In such a case, descriptors already passed to the device - but for which
      a completion has yet to be posted - will be overwritten, thereby causing
      undefined behavior. The above can be achieved under heavy load (~30
      netperf instances).
      
      Fix that by casting the subtraction result to u16, preventing it from
      being treated as a signed integer.
      
      Fixes: eda6500a ("mlxsw: Add PCI bus implementation")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
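The overflow problem in the full-queue check can be reproduced in a few lines. The sketch below uses a hypothetical queue struct and function names; it only illustrates why casting the difference of two 16-bit counters back to a 16-bit unsigned type gives the right answer after the producer wraps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue state; field names are illustrative. */
struct desc_queue {
	uint16_t producer; /* descriptors posted to the device */
	uint16_t consumer; /* descriptors completed by the device */
	uint16_t count;    /* maximum number of descriptors */
};

/* Without the cast, both operands are promoted to int, so the difference
 * goes negative once the producer has wrapped past 0xFFFF and never
 * equals count. */
static bool queue_full_buggy(const struct desc_queue *q)
{
	return q->producer - q->consumer == q->count;
}

/* Casting to a 16-bit unsigned type makes the difference wrap the same
 * way the counters do. */
static bool queue_full_fixed(const struct desc_queue *q)
{
	return (uint16_t)(q->producer - q->consumer) == q->count;
}
```

For example, with `producer = 5`, `consumer = 0xFFF5` and `count = 16`, the uncast difference is -65520, so the buggy check reports the queue as not full, while the cast version yields 16 and correctly reports it as full.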
    • mlxsw: spectrum: Always decrement bridge's ref count · 912b1c89
      Committed by Ido Schimmel
      Since we only support one VLAN filtering bridge we need to associate a
      reference count with it, so that when the last port netdev leaves it, we
      would know that a different bridge can be offloaded to hardware.
      
      When a LAG device is a member of a bridge and port netdevs are leaving
      the LAG, we should always decrement the bridge's reference count, as it's
      incremented for any port in the LAG.
      
      Fixes: 4dc236c3 ("mlxsw: spectrum: Handle port leaving LAG while bridged")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 04 Mar, 2016 2 commits
    • net: mellanox: add DEVLINK dependencies · 3d1cbe83
      Committed by Arnd Bergmann
      The new NET_DEVLINK infrastructure can be a loadable module, but the drivers
      using it might be built-in, which causes link errors like:
      
      drivers/net/built-in.o: In function `mlx4_load_one':
      :(.text+0x2fbfda): undefined reference to `devlink_port_register'
      :(.text+0x2fc084): undefined reference to `devlink_port_unregister'
      drivers/net/built-in.o: In function `mlxsw_sx_port_remove':
      :(.text+0x33a03a): undefined reference to `devlink_port_type_clear'
      :(.text+0x33a04e): undefined reference to `devlink_port_unregister'
      
      There are multiple ways to avoid this:
      
      a) add 'depends on NET_DEVLINK || !NET_DEVLINK' dependencies
         for each user
      b) use 'select NET_DEVLINK' from each driver that uses it
         and hide the symbol in Kconfig.
      c) make NET_DEVLINK a 'bool' option so we don't have to
         list it as a dependency, and rely on the APIs to be
         stubbed out when it is disabled
      d) use IS_REACHABLE() rather than IS_ENABLED() to check for
         NET_DEVLINK in include/net/devlink.h
      
      This implements a variation of approach a) by adding an
      intermediate symbol that drivers can depend on, and changes
      the three drivers using it.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 09d4d087 ("mlx4: Implement devlink interface")
      Fixes: c4745500 ("mlxsw: Implement devlink interface")
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: relax setup_tc ndo op handle restriction · 5eb4dce3
      Committed by John Fastabend
      I added this check in setup_tc to multiple drivers,
      
       if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
      
      Unfortunately, restricting to TC_H_ROOT like this breaks the old
      instantiation of mqprio to set up a hardware qdisc. This patch
      relaxes the test to only check the type to make it equivalent
      to the check before I broke it. With this the old instantiation
      continues to work.
      
      A good smoke test is to set up mqprio with:
      
      # tc qdisc add dev eth4 root mqprio num_tc 8 \
        map 0 1 2 3 4 5 6 7 \
        queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7
      
      Fixes: e4c6734e ("net: rework ndo tc op to consume additional qdisc handle paramete")
      Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
      Reported-by: Jake Keller <jacob.e.keller@intel.com>
      CC: Murali Karicheri <m-karicheri2@ti.com>
      CC: Shradha Shah <sshah@solarflare.com>
      CC: Or Gerlitz <ogerlitz@mellanox.com>
      CC: Ariel Elior <ariel.elior@qlogic.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: Bruce Allan <bruce.w.allan@intel.com>
      CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
      CC: Don Skidmore <donald.c.skidmore@intel.com>
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
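The difference between the strict and the relaxed setup_tc check can be sketched outside the kernel as follows. The stand-in definitions (`struct tc_to_netdev` with a single field, the `TC_SETUP_OTHER` enumerator, and both function names) are assumptions made so the example compiles on its own; only the two conditions themselves come from the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in definitions so the sketch compiles outside the kernel. */
#define TC_H_ROOT 0xFFFFFFFFu
enum { TC_SETUP_MQPRIO, TC_SETUP_OTHER /* hypothetical second type */ };

struct tc_to_netdev {
	unsigned int type;
};

/* Check as quoted in the commit message: any non-root handle is rejected. */
static int setup_tc_check_strict(uint32_t handle,
				 const struct tc_to_netdev *tc)
{
	if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
		return -EINVAL;
	return 0;
}

/* Relaxed check: only the setup type is tested, the handle is ignored. */
static int setup_tc_check_relaxed(uint32_t handle,
				  const struct tc_to_netdev *tc)
{
	(void)handle;
	if (tc->type != TC_SETUP_MQPRIO)
		return -EINVAL;
	return 0;
}
```

With the relaxed form, an mqprio request arriving with a non-root handle is accepted again, while requests of any other type are still rejected.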
  14. 03 Mar, 2016 6 commits