1. 18 Jan, 2017 7 commits
    • D
      Merge branch 'mpls-packet-stats' · e60a4263
      David S. Miller committed
      Robert Shearman says:
      
      ====================
      mpls: Packet stats
      
      This patchset records per-interface packet stats in the MPLS
      forwarding path and exports them using a nest of attributes rooted at a
      new IFLA_STATS_AF_SPEC attribute as part of RTM_GETSTATS messages:
      
      [IFLA_STATS_AF_SPEC]
       -> [AF_MPLS]
        -> [MPLS_STATS_LINK]
         -> struct mpls_link_stats
      
      The first patch adds the rtnl infrastructure for this, including new
      per-AF ops callbacks, fill_stats_af and get_stats_af_size. The
      second patch records MPLS stats and makes use of the infrastructure to
      export them. The rtnl infrastructure could also be used to export IPv6
      stats in the future.
      
      Changes in v2:
       - make incrementing IPv6 stats in mpls_stats_inc_outucastpkts
         conditional on CONFIG_IPV6 to fix build with CONFIG_IPV6=n
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • R
      mpls: Packet stats · 27d69105
      Robert Shearman committed
      Having MPLS packet stats is useful for observing network operation and
      for diagnosing network problems. In the absence of anything better,
      RFC2863 and RFC3813 are used for guidance for which stats to expose
      and the semantics of them. In particular, rx_noroutes maps to "in
      unknown protos" (ifInUnknownProtos) in RFC2863. The stats are exposed to userspace via
      AF_MPLS attributes embedded in the IFLA_STATS_AF_SPEC attribute of
      RTM_GETSTATS messages.
      
      All the introduced fields are 64-bit, even error ones, to ensure no
      overflow with long uptimes. Per-CPU counters are used to avoid
      cache-line contention on the commonly used fields. The other fields
      have also been made per-CPU for code to avoid performance problems in
      error conditions on the assumption that on some platforms the cost of
      atomic operations could be more expensive than sending the packet
      (which is what would be done in the success case). If that's not the
      case, we could instead not use per-CPU counters for these fields.
      
      Only unicast and non-fragment are exposed at the moment, but other
      counters can be exposed in the future either by adding to the end of
      struct mpls_link_stats or by additional netlink attributes in the
      AF_MPLS IFLA_STATS_AF_SPEC nested attribute.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
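As a rough illustration of the per-CPU counter idea described in this commit, the following userspace sketch keeps one set of 64-bit counters per CPU and sums them only when the stats are read. All names here (`sketch_link_stats`, `SKETCH_NR_CPUS`, the field names) are hypothetical and are not taken from the patch; the kernel has its own per-CPU helpers, which this sketch does not use.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Hypothetical subset of counters; every field is 64-bit. */
struct sketch_link_stats {
    uint64_t rx_packets;
    uint64_t rx_bytes;
    uint64_t rx_noroute;
};

/* One private copy per CPU, so the hot path never shares a cache line. */
static struct sketch_link_stats sketch_percpu[SKETCH_NR_CPUS];

/* Success path: bump only the counters that belong to the given CPU. */
static void sketch_count_rx(unsigned int cpu, uint64_t len)
{
    assert(cpu < SKETCH_NR_CPUS);
    sketch_percpu[cpu].rx_packets++;
    sketch_percpu[cpu].rx_bytes += len;
}

/* Error path: also per-CPU, so no atomic operation is needed. */
static void sketch_count_noroute(unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    sketch_percpu[cpu].rx_noroute++;
}

/* Read path: fold all per-CPU copies into one total. */
static struct sketch_link_stats sketch_sum(void)
{
    struct sketch_link_stats total = {0, 0, 0};

    for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        total.rx_packets += sketch_percpu[cpu].rx_packets;
        total.rx_bytes += sketch_percpu[cpu].rx_bytes;
        total.rx_noroute += sketch_percpu[cpu].rx_noroute;
    }
    return total;
}
```

The trade-off is that writers stay cheap and uncontended, while a reader pays for a loop over all CPUs, which is acceptable when stats are read far less often than packets are forwarded.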
    • R
      net: AF-specific RTM_GETSTATS attributes · aefb4d4a
      Robert Shearman committed
      Add the functionality for including address-family-specific per-link
      stats in RTM_GETSTATS messages. This is done through adding a new
      IFLA_STATS_AF_SPEC attribute under which address family attributes are
      nested and then the AF-specific attributes can be further nested. This
      follows the model of IFLA_AF_SPEC on RTM_*LINK messages and it has the
      advantage of presenting an easily extended hierarchy. The rtnl_af_ops
      structure is extended to provide AFs with the opportunity to fill and
      provide the size of their stats attributes.
      
      One alternative would have been to provide AFs with the ability to add
      attributes directly into the RTM_GETSTATS message without a nested
      hierarchy. I discounted this approach as it increases the rate at
      which the 32 attribute number space is used up and it makes
      implementation a little more tricky for stats dump resuming (at the
      moment the order in which attributes are added to the message has to
      match the numeric order of the attributes).
      
      Another alternative would have been to register per-AF RTM_GETSTATS
      handlers. I discounted this approach as I perceived a common use-case
      to be getting all the stats for an interface and this approach would
      necessitate multiple requests/dumps to retrieve them all.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
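The nested hierarchy described in this commit can be pictured as length-prefixed attributes inside attributes. The sketch below is a minimal userspace encoder, assuming the usual netlink attribute layout (a 16-bit length, a 16-bit type, payload padded to 4 bytes). The helper names and the numeric type values are placeholders for illustration only and are not the kernel's API or constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_HDRLEN 4u                     /* 2 bytes length + 2 bytes type */
#define SK_ALIGN(n) (((n) + 3u) & ~3u)   /* pad to a 4-byte boundary */

/* Placeholder type numbers, chosen only for this sketch. */
#define SK_TYPE_STATS_AF_SPEC 1
#define SK_TYPE_AF            2
#define SK_TYPE_LINK          3

struct sk_buf {
    uint8_t data[256];
    size_t len;
};

/* Open a nest: write a header whose length is patched in sk_nest_end(). */
static size_t sk_nest_start(struct sk_buf *b, uint16_t type)
{
    size_t off = b->len;
    uint16_t zero = 0;

    assert(off + SK_HDRLEN <= sizeof(b->data));
    memcpy(b->data + off, &zero, 2);
    memcpy(b->data + off + 2, &type, 2);
    b->len += SK_HDRLEN;
    return off;
}

/* Close a nest: its length covers the header and everything added since. */
static void sk_nest_end(struct sk_buf *b, size_t off)
{
    uint16_t len = (uint16_t)(b->len - off);

    memcpy(b->data + off, &len, 2);
}

/* Append a leaf attribute carrying an opaque payload. */
static void sk_put(struct sk_buf *b, uint16_t type, const void *payload,
                   uint16_t plen)
{
    uint16_t len = (uint16_t)(SK_HDRLEN + plen);
    size_t padded = SK_ALIGN(len);

    assert(b->len + padded <= sizeof(b->data));
    memcpy(b->data + b->len, &len, 2);
    memcpy(b->data + b->len + 2, &type, 2);
    memcpy(b->data + b->len + SK_HDRLEN, payload, plen);
    memset(b->data + b->len + len, 0, padded - len);
    b->len += padded;
}
```

Opening the outer nest, then the per-AF nest, then adding the stats leaf and closing both nests in reverse order yields the three-level layout. A new address family only needs a new type inside the outer nest, so the top-level attribute number space is left untouched.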
    • P
      net: marvell: sky2: use new api ethtool_{get|set}_link_ksettings · 55f78fcd
      Philippe Reynes committed
      The ethtool API {get|set}_settings is deprecated.
      We move this driver to the new API {get|set}_link_ksettings.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      net: marvell: skge: use new api ethtool_{get|set}_link_ksettings · 0f826385
      Philippe Reynes committed
      The ethtool API {get|set}_settings is deprecated.
      We move this driver to the new API {get|set}_link_ksettings.
      
      The callback set_link_ksettings no longer updates the value
      of advertising, as the struct ethtool_link_ksettings is
      defined as const.
      
      As I don't have the hardware, I'd be very pleased if
      someone could test this patch.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      net: jme: use new api ethtool_{get|set}_link_ksettings · c523838c
      Philippe Reynes committed
      The ethtool API {get|set}_settings is deprecated.
      We move this driver to the new API {get|set}_link_ksettings.
      
      As I don't have the hardware, I'd be very pleased if
      someone could test this patch.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      net: korina: use new api ethtool_{get|set}_link_ksettings · af473688
      Philippe Reynes committed
      The ethtool API {get|set}_settings is deprecated.
      We move this driver to the new API {get|set}_link_ksettings.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 17 Jan, 2017 16 commits
  3. 16 Jan, 2017 2 commits
  4. 15 Jan, 2017 2 commits
    • D
      Merge tag 'mac80211-next-for-davem-2017-01-13' of... · bb60b8b3
      David S. Miller committed
      Merge tag 'mac80211-next-for-davem-2017-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
      
      Johannes Berg says:
      
      ====================
      For 4.11, we seem to have more than in the past few releases:
       * socket owner support for connections, so when the wifi
         manager (e.g. wpa_supplicant) is killed, connections are
         torn down - wpa_supplicant is critical to managing certain
         operations, and can opt in to this where applicable
       * minstrel & minstrel_ht updates to be more efficient (time and space)
       * set wifi_acked/wifi_acked_valid for skb->destructor use in the
         kernel, which was already available to userspace
       * don't indicate new mesh peers that might be used if there's no
         room to add them
       * multicast-to-unicast support in mac80211, for better medium usage
         (since unicast frames can use *much* higher rates, by ~3 orders of
         magnitude)
       * add API to read channel (frequency) limitations from DT
       * add infrastructure to allow randomizing public action frames for
         MAC address privacy (still requires driver support)
       * many cleanups and small improvements/fixes across the board
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      cxgb4: Remove redundant memset before memcpy · ca4b5eb8
      Shyam Saini committed
      The region set by the call to memset is immediately overwritten by
      the subsequent call to memcpy, which makes the memset redundant.
      
      Also remove the memset(&info, 0, sizeof(info)) on line 398 because
      info is memcpy()'ed to before being used in the loop and it isn't
      used outside of the loop.
      Signed-off-by: Shyam Saini <mayhs11saini@gmail.com>
      Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
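The redundancy this commit removes can be shown with a small standalone sketch. The struct and function names below are hypothetical and are not taken from the cxgb4 driver. The point is only that when memcpy overwrites the full size of the destination, a preceding memset of the same region has no observable effect.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the structure copied in the driver. */
struct sk_info {
    int id;
    char name[8];
};

/* Before: zero the destination, then overwrite every byte of it. */
static void sk_copy_with_memset(struct sk_info *dst, const struct sk_info *src)
{
    memset(dst, 0, sizeof(*dst));   /* every byte written here ... */
    memcpy(dst, src, sizeof(*dst)); /* ... is overwritten here */
}

/* After: the memcpy alone produces the same destination contents. */
static void sk_copy_without_memset(struct sk_info *dst,
                                   const struct sk_info *src)
{
    memcpy(dst, src, sizeof(*dst));
}
```

The equivalence holds only because both calls cover the same full `sizeof(*dst)` range. If the memcpy copied fewer bytes than the memset cleared, the memset would still matter.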
  5. 14 Jan, 2017 13 commits
    • G
      cxgb4: Fix misleading packet/frame count stats. · f750e82e
      Ganesh Goudar committed
      Do not count pause frames as part of general TX/RX frame
      counters.
      
      Based on the original work of Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'bnxt_en-next' · 4b89aa3c
      David S. Miller committed
      Michael Chan says:
      
      ====================
      bnxt_en: Misc. updates for net-next.
      
      Miscellaneous updates including firmware spec update, ethtool -p blinking
      LED support, RDMA SRIOV config callback, and minor fixes.
      
      v2: Dropped the DCBX RoCE app TLV patch until the ETH_P_IBOE RDMA patch
      is merged.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      bnxt_en: Add the ulp_sriov_cfg hooks for bnxt_re RDMA driver. · 2f593846
      Michael Chan committed
      Add the ulp_sriov_cfg callbacks when the number of VFs is changing.  This
      allows the RDMA driver to provision RDMA resources for the VFs.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      bnxt_en: Add support for ethtool -p. · 5ad2cbee
      Michael Chan committed
      Add LED blinking code to support ethtool -p on the PF.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
    • M
      bnxt_en: Clear TPA flags when BNXT_FLAG_NO_AGG_RINGS is set. · 341138c3
      Michael Chan committed
      Commit bdbd1eb5 ("bnxt_en: Handle no aggregation ring gracefully.")
      introduced the BNXT_FLAG_NO_AGG_RINGS flag.  For consistency,
      bnxt_set_tpa_flags() should also clear TPA flags when there are no
      aggregation rings.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      bnxt_en: Fix compiler warnings when CONFIG_RFS_ACCEL is not defined. · b7429954
      Michael Chan committed
      CC [M]  drivers/net/ethernet/broadcom/bnxt/bnxt.o
      drivers/net/ethernet/broadcom/bnxt/bnxt.c:4947:21: warning: ‘bnxt_get_max_func_rss_ctxs’ defined but not used [-Wunused-function]
       static unsigned int bnxt_get_max_func_rss_ctxs(struct bnxt *bp)
                           ^
        CC [M]  drivers/net/ethernet/broadcom/bnxt/bnxt.o
      drivers/net/ethernet/broadcom/bnxt/bnxt.c:4956:21: warning: ‘bnxt_get_max_func_vnics’ defined but not used [-Wunused-function]
       static unsigned int bnxt_get_max_func_vnics(struct bnxt *bp)
                           ^
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'tcp-RACK-fast-recovery' · 718e14bb
      David S. Miller committed
      Yuchung Cheng says:
      
      ====================
      tcp: RACK fast recovery
      
      The patch set enables RACK loss detection (draft-ietf-tcpm-rack-01)
      to trigger fast recovery with a reordering timer.
      
      Previously RACK has been running in auxiliary mode where it is
      used to detect packet losses once the recovery has been triggered by
      other algorithms (e.g., FACK). By inspecting packet timestamps,
      RACK can start ACK-driven repairs in a timely manner. A few similar heuristics
      are no longer needed and are either removed or disabled to reduce
      the complexity of the Linux TCP loss recovery engine:
      
        1. FACK (Forward Acknowledgement)
        2. Early Retransmit (RFC5827)
        3. thin_dupack (fast recovery on single DUPACK for thin-streams)
        4. NCR (Non-Congestion Robustness) (RFC4653)
        5. Forward Retransmit
      
      After this change, Linux's loss recovery algorithms consist of
        1. Conventional DUPACK threshold approach (RFC6675)
        2. RACK and Tail Loss Probe (draft-ietf-tcpm-rack-01)
        3. RTO plus F-RTO extension (RFC5682)
      
      The patch set has been tested on Google servers extensively and
      presented in several IETF meetings. The data suggests that RACK
      successfully improves recovery performance:
      https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-draft-ietf-tcpm-rack-01.pdf
      https://www.ietf.org/proceedings/96/slides/slides-96-tcpm-3.pdf
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Y
      tcp: disable fack by default · 94bdc978
      Yuchung Cheng committed
      This patch disables FACK by default as RACK is the successor of FACK
      (inspired by the insights behind FACK).
      
      FACK[1] in Linux works as follows: a packet P is deemed lost,
      if packet Q of higher sequence is s/acked and P and Q are distant
      by at least dupthresh number of packets in sequence space.
      
      FACK is more aggressive than the IETF recommended recovery for SACK
      (RFC3517 A Conservative Selective Acknowledgment (SACK)-based Loss
       Recovery Algorithm for TCP), because a single SACK may trigger
      fast recovery. This obviously won't work well with reordering so
      FACK is dynamically disabled upon detecting reordering.
      
      RACK supersedes FACK by using time distance instead of sequence
      distance. On reordering, RACK waits for a quarter of RTT after receiving
      a single SACK before starting recovery. (The timer can be made more
      adaptive in the future by measuring reordering distance in time,
      but currently RTT/4 seems to work well.) Once the recovery starts,
      RACK behaves almost like FACK because it reduces the reordering
      window to 1ms, so it fast retransmits quickly. In addition RACK
      can detect loss retransmission as it does not care about the packet
      sequences (being repeated or not), which is extremely useful when
      the connection is going through a traffic policer.
      
      Google server experiments indicate that disabling FACK after enabling
      RACK has negligible impact on the overall loss recovery performance
      with more reordering events detected.  But we still keep the FACK
      implementation as a backup in case RACK has bugs and needs to be disabled.
      
      [1] M. Mathis, J. Mahdavi, "Forward Acknowledgment: Refining
      TCP Congestion Control," In Proceedings of SIGCOMM '96, August 1996.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
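The contrast between sequence distance (FACK) and time distance (RACK) drawn in this commit can be sketched as two small predicates. This is a simplified reading of the commit message, not the kernel implementation: packets are identified by a plain index rather than TCP sequence numbers, and all function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * FACK-style rule: packet P is deemed lost if a packet of higher sequence
 * has been s/acked and is at least dupthresh packets ahead of P.
 */
static bool sk_fack_lost(uint32_t pkt_idx, uint32_t highest_sacked_idx,
                         uint32_t dupthresh)
{
    return highest_sacked_idx > pkt_idx &&
           highest_sacked_idx - pkt_idx >= dupthresh;
}

/* Reordering window used before recovery starts: a quarter of the RTT. */
static uint64_t sk_reo_wnd_us(uint64_t rtt_us)
{
    return rtt_us / 4;
}

/*
 * RACK-style rule: packet P is deemed lost if some packet sent more than
 * the reordering window after P has already been delivered. Only send
 * times matter, so a retransmission is judged the same way as an
 * original transmission.
 */
static bool sk_rack_lost(uint64_t pkt_sent_us,
                         uint64_t newest_delivered_sent_us,
                         uint64_t reo_wnd_us)
{
    return newest_delivered_sent_us > pkt_sent_us &&
           newest_delivered_sent_us - pkt_sent_us > reo_wnd_us;
}
```

With an RTT of 40 ms the window is 10 ms, so a packet sent 11 ms before the newest delivered packet is marked lost, while one sent only 4 ms before it is given more time to arrive.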
    • Y
      tcp: remove thin_dupack feature · 4a7f6009
      Yuchung Cheng committed
      Thin stream DUPACK is to start fast recovery on only one DUPACK
      provided the connection is a thin stream (i.e., low inflight).  But
      this older feature is now subsumed by RACK. If a connection
      receives only a single DUPACK, RACK would arm a reordering timer
      and soon start fast recovery instead of timeout if no further
      ACKs are received.
      
      The socket option (THIN_DUPACK) is kept as a nop for compatibility.
      Note that this patch does not change another thin-stream feature
      which enables linear RTO. Although it might be good to generalize
      that in the future (i.e., linear RTO for the first say 3 retries).
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Y
      tcp: remove RFC4653 NCR · ac229dca
      Yuchung Cheng committed
      This patch removes the (partial) implementation of the aggressive
      limited transmit in RFC4653 TCP Non-Congestion Robustness (NCR).
      
      NCR is a mitigation to the problem created by the dynamic
      DUPACK threshold.  The current adaptive DUPACK threshold
      (tp->reordering) could cause timeouts by preventing fast recovery.
      For example, if the last packet of a cwnd burst was reordered, the
      threshold will be set to the size of cwnd. But if next application
      burst is smaller than threshold and has drops instead of reorderings,
      the sender would not trigger fast recovery but instead resorts to a
      timeout recovery.
      
      NCR mitigates this issue by additionally checking the number of DUPACKs
      against the current flight size. The technique is similar to
      the early retransmit RFC.
      
      With RACK loss detection, this mitigation is not needed, because RACK
      does not use DUPACK threshold to detect losses. RACK arms a reordering
      timer to fire at most a quarter RTT later to start fast recovery.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
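The timeout scenario described in this commit comes down to simple arithmetic: a burst cannot elicit more DUPACKs than the number of its packets that actually arrive, so a threshold inflated by earlier reordering may be unreachable. The sketch below models only that bound, under the assumption that each packet arriving after a hole triggers one DUPACK. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound on DUPACKs for a burst: at most one per delivered packet. */
static unsigned int sk_max_dupacks(unsigned int burst, unsigned int lost)
{
    assert(lost <= burst);
    return burst - lost;
}

/*
 * A pure DUPACK-threshold sender can enter fast recovery only if the
 * burst can produce at least dupthresh DUPACKs; otherwise it has to
 * wait for the retransmission timeout.
 */
static bool sk_fast_recovery_possible(unsigned int burst, unsigned int lost,
                                      unsigned int dupthresh)
{
    return lost > 0 && sk_max_dupacks(burst, lost) >= dupthresh;
}
```

For example, if reordering raised the threshold to 10 and the next burst is 5 packets with 1 drop, at most 4 DUPACKs can arrive and the threshold is never met. A time-based detector such as RACK does not depend on this count.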
    • Y
      tcp: remove early retransmit · bec41a11
      Yuchung Cheng committed
      This patch removes the support of RFC5827 early retransmit (i.e.,
      fast recovery on small inflight with <3 dupacks) because it is
      subsumed by the new RACK loss detection. More specifically when
      RACK receives DUPACKs, it'll arm a reordering timer to start fast
      recovery after a quarter of (min)RTT, hence it covers the early
      retransmit except RACK does not limit itself to specific inflight
      or dupack numbers.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Y
      tcp: remove forward retransmit feature · 840a3cbe
      Yuchung Cheng committed
      Forward retransmit is an esoteric feature in RFC3517 (condition(3)
      in the NextSeg()). Basically if a packet is not considered lost by
      the current criteria (# of dupacks etc), but the congestion window
      has room for more packets, then retransmit this packet.
      
      However it actually conflicts with the rest of recovery design. For
      example, when reordering is detected we want to be conservative
      in retransmitting packets but forward-retransmit feature would
      break that to force more retransmission. Also the implementation is
      fairly complicated inside the retransmission logic inducing extra
      iterations in the write queue. With RACK, losses are detected in a
      timely manner and this heuristic is no longer necessary. Therefore this
      patch removes the feature.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>