1. 22 May, 2017 16 commits
    • net: add new control message for incoming HW-timestamped packets · aad9c8c4
      Miroslav Lichvar committed
      Add SOF_TIMESTAMPING_OPT_PKTINFO option to request a new control message
      for incoming packets with hardware timestamps. It contains the index of
      the real interface which received the packet and the length of the
      packet at layer 2.
      
      The index is useful with bonding, bridges and other interfaces, where
      IP_PKTINFO doesn't allow applications to determine which PHC made the
      timestamp. With the L2 length (and link speed) it is possible to
      transpose preamble timestamps to trailer timestamps, which are used in
      the NTP protocol.
      
      While this information could be provided by two new socket options
      independently from timestamping, it doesn't look like they would be very
      useful. With this option any performance impact is limited to hardware
      timestamping.
      
      Use dev_get_by_napi_id() to get the device and its index. On kernels
      with disabled CONFIG_NET_RX_BUSY_POLL or drivers not using NAPI, a zero
      index will be returned in the control message.
      
      CC: Richard Cochran <richardcochran@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
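The preamble-to-trailer transposition mentioned in the message above can be sketched in a few lines of C. This is an illustration only, not code from the patch: the helper name `preamble_to_trailer_ns` is hypothetical, the link speed is assumed to be known in Mb/s, and the L2 length is assumed to count exactly the bytes serialized between the two timestamp points (whether the FCS has to be added on top is hardware-dependent and not stated in the message).

```c
#include <stdint.h>

/*
 * Hypothetical helper: move a timestamp taken at the start of a frame
 * to the end of the frame by adding the serialization time.
 * Assumption: l2_len_bytes covers every byte sent between the two points.
 */
static int64_t preamble_to_trailer_ns(int64_t preamble_ts_ns,
                                      uint32_t l2_len_bytes,
                                      uint32_t link_speed_mbps)
{
    if (link_speed_mbps == 0)
        return preamble_ts_ns;  /* unknown speed: leave the timestamp as-is */

    /* bits / (Mb/s) yields microseconds; multiply by 1000 for nanoseconds. */
    int64_t bits = (int64_t)l2_len_bytes * 8;
    return preamble_ts_ns + bits * 1000 / link_speed_mbps;
}
```

For example, a 100-byte frame on a 1000 Mb/s link takes 800 ns to serialize, so the helper shifts the timestamp forward by 800 ns.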
    • net: add function to retrieve original skb device using NAPI ID · 90b602f8
      Miroslav Lichvar committed
      Since commit b6858177 ("net: Make skb->skb_iif always track
      skb->dev") skbs don't have the original index of the interface which
      received the packet. This information is now needed for a new control
      message related to hardware timestamping.
      
      Instead of adding a new field to skb, we can find the device by the NAPI
      ID if it is available, i.e. CONFIG_NET_RX_BUSY_POLL is enabled and the
      driver is using NAPI. Add dev_get_by_napi_id() and also skb_napi_id() to
      hide the CONFIG_NET_RX_BUSY_POLL ifdef.
      
      CC: Richard Cochran <richardcochran@gmail.com>
      Suggested-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
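The idea of hiding the CONFIG_NET_RX_BUSY_POLL ifdef behind an accessor, as described in the message above, might look roughly like the following userspace sketch. The `mock_skb` structure and `mock_skb_napi_id` name are stand-ins invented for this illustration; they are not the kernel's `struct sk_buff` or the actual helper added by the commit.

```c
/* Mock of the single field of interest; not the kernel's struct sk_buff. */
struct mock_skb {
#ifdef CONFIG_NET_RX_BUSY_POLL
    unsigned int napi_id;   /* only present when busy polling is built in */
#endif
    int unused;             /* placeholder so the struct is never empty */
};

/*
 * Hypothetical accessor: callers never need their own ifdef.
 * A return value of 0 stands for "no NAPI ID available".
 */
static unsigned int mock_skb_napi_id(const struct mock_skb *skb)
{
#ifdef CONFIG_NET_RX_BUSY_POLL
    return skb->napi_id;
#else
    (void)skb;
    return 0;
#endif
}
```

With an accessor of this shape, a caller can treat 0 as "device unknown", which matches the zero interface index the message says is reported when NAPI IDs are unavailable.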
    • net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL · e3412575
      Miroslav Lichvar committed
      Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid
      filter and update drivers which can timestamp all packets, or which
      explicitly list unsupported filters instead of using a default case, to
      handle the filter.
      
      CC: Richard Cochran <richardcochran@gmail.com>
      CC: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: define receive timestamp filter for NTP · b8210a9e
      Miroslav Lichvar committed
      Add HWTSTAMP_FILTER_NTP_ALL to the hwtstamp_rx_filters enum for
      timestamping of NTP packets. There is currently only one driver
      (phyter) that could support it directly.
      
      CC: Richard Cochran <richardcochran@gmail.com>
      CC: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4 : retrieve port information from firmware · 2061ec3f
      Ganesh Goudar committed
      Issue the get port information command to the firmware to retrieve port
      information and update it if it differs from what was last
      recorded. Also add indication for supported link modes for
      firmware port types FW_PORT_TYPE_SFP28, FW_PORT_TYPE_KR_SFP28,
      FW_PORT_TYPE_CR4_QSFP.
      
      Based on the original work by Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmveth: Support to enable LSO/CSO for Trunk VEA. · 66aa0678
      Sivakumar Krishnasamy committed
      The current largesend and checksum offload features in the ibmveth driver work as follows:
       - Source VM sends the TCP packets with ip_summed field set as
         CHECKSUM_PARTIAL and TCP pseudo header checksum is placed in
         checksum field
       - CHECKSUM_PARTIAL flag in SKB will enable ibmveth driver to mark
         "no checksum" and "checksum good" bits in transmit buffer descriptor
         before the packet is delivered to pseries PowerVM Hypervisor
       - If ibmveth has largesend capability enabled, transmit buffer descriptors
         are marked accordingly before the packet is delivered to the Hypervisor
         (along with mss value for packets with length > MSS)
       - Destination VM's ibmveth driver receives the packet with "checksum good"
         bit set and so, SKB's ip_summed field is set with CHECKSUM_UNNECESSARY
       - If "largesend" bit was on, mss value is copied from receive descriptor
         into SKB's gso_size and other flags are appropriately set for
         packets > MSS size
       - The packet is now successfully delivered up the stack in destination VM
      
      The offloads described above work fine for TCP communication among VMs in
      the same pseries server ( VM A <=> PowerVM Hypervisor <=> VM B )
      
      We are now enabling support for OVS in pseries PowerVM environment. One of
      our requirements is to have ibmveth driver configured in "Trunk" mode, when
      they are used with OVS. This is because the PowerVM Hypervisor will no longer
      bridge the packets between VMs; instead the packets are delivered to the
      IO Server which hosts OVS to bridge them between VMs or to external
      networks (flow shown below),
        VM A <=> PowerVM Hypervisor <=> IO Server(OVS) <=> PowerVM Hypervisor
                                                                         <=> VM B
      In the IO Server the packet is received by the inbound Trunk ibmveth and
      delivered to OVS, which then bridges it to the outbound Trunk ibmveth (shown
      below),
              Inbound Trunk ibmveth <=> OVS <=> Outbound Trunk ibmveth
      
      In this model, we hit the following issues which impacted the VM
      communication performance,
      
       - Issue 1: ibmveth doesn't support largesend and checksum offload features
         when configured as "Trunk". Driver has explicit checks to prevent
         enabling these offloads.
      
       - Issue 2: SYN packet drops seen at destination VM. When the packet
         originates, it has CHECKSUM_PARTIAL flag set and as it gets delivered to
         IO server's inbound Trunk ibmveth, on validating "checksum good" bits
         in ibmveth receive routine, SKB's ip_summed field is set with
         CHECKSUM_UNNECESSARY flag. This packet is then bridged by OVS (or Linux
         Bridge) and delivered to outbound Trunk ibmveth. At this point the
         outbound ibmveth transmit routine will not set "no checksum" and
         "checksum good" bits in transmit buffer descriptor, as it does so only
         when the ip_summed field is CHECKSUM_PARTIAL. When this packet gets
         delivered to destination VM, TCP layer receives the packet with checksum
         value of 0 and with no checksum related flags in ip_summed field. This
         leads to packet drops, so TCP connections never go through.
      
       - Issue 3: First packet of a TCP connection will be dropped, if there is
         no OVS flow cached in datapath. OVS while trying to identify the flow,
         computes the checksum. The computed checksum will be invalid at the
         receiving end, as ibmveth transmit routine zeroes out the pseudo
         checksum value in the packet. This leads to packet drop.
      
       - Issue 4: ibmveth driver doesn't have support for SKB's with frag_list.
         When Physical NIC has GRO enabled and when OVS bridges these packets,
         OVS vport send code will end up calling dev_queue_xmit, which in turn
         calls validate_xmit_skb.
         In the validate_xmit_skb routine, the larger packets will get segmented into
         MSS sized segments if the SKB has a frag_list and the driver to which
         they are delivered doesn't support the NETIF_F_FRAGLIST feature.
      
      This patch addresses the above four issues, thereby enabling end to end
      largesend and checksum offload support for better performance.
      
       - Fix for Issue 1 : Remove checks which prevent enabling TCP largesend and
         checksum offloads.
       - Fix for Issue 2 : When ibmveth receives a packet with "checksum good"
         bit set and if it's configured in Trunk mode, set appropriate SKB fields
         using skb_partial_csum_set (ip_summed field is set with
         CHECKSUM_PARTIAL)
       - Fix for Issue 3: Recompute the pseudo header checksum before sending the
         SKB up the stack.
       - Fix for Issue 4: Linearize the SKBs with frag_list. Though we end up
         allocating buffers and copying data, this fix gives
         up to 4X throughput increase.
      
      Note: All these fixes need to be applied together, as fixing just one of
      them will lead to other issues immediately (especially for Issues 1, 2 & 3).
      Signed-off-by: Sivakumar Krishnasamy <ksiva@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
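The message above relies on the CHECKSUM_PARTIAL convention that the TCP checksum field holds the pseudo header sum, which is what Fix 3 recomputes. As a rough userspace illustration of that arithmetic for IPv4 (not the driver's code; the helper names `csum_fold16` and `tcp_pseudo_hdr_sum` are made up here, and addresses are passed as plain host-order integers for readability):

```c
#include <stdint.h>

/* Fold a 32-bit running sum into 16 bits with end-around carry. */
static uint16_t csum_fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Hypothetical helper: ones'-complement sum of the IPv4 TCP pseudo header
 * (source address, destination address, protocol 6, TCP length).
 * The result is NOT inverted; addresses are host-order integers,
 * e.g. 192.168.0.1 is 0xC0A80001.
 */
static uint16_t tcp_pseudo_hdr_sum(uint32_t saddr, uint32_t daddr,
                                   uint16_t tcp_len)
{
    uint32_t sum = 0;

    sum += saddr >> 16;
    sum += saddr & 0xffff;
    sum += daddr >> 16;
    sum += daddr & 0xffff;
    sum += 6;        /* IPPROTO_TCP */
    sum += tcp_len;  /* TCP header plus payload, in bytes */
    return csum_fold16(sum);
}
```

For 192.168.0.1 to 192.168.0.2 with a 20-byte TCP segment, the 16-bit words add up to 0x1816D, which folds to 0x816E.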
    • Merge branch 'qed-next' · 76e7f31d
      David S. Miller committed
      Yuval Mintz says:
      
      ====================
      qed/qede updates
      
      This series contains some general minor fixes and enhancements:
      
       - #1, #2 and #9 correct small missing ethtool functionality.
       - #3, #6  and #8 correct minor issues in driver, but those are either
         print-related or unexposed in existing code.
       - #4 adds proper support to TLB mode bonding.
       - #10 is meant to improve performance on varying cache-line sizes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Support 1G advertisment. · 9ac4c546
      Sudarsana Reddy Kalluru committed
      Some adapter variants support the 1G speed capability. Allow
      configuring 1G speed if the adapter supports it.
      Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix setting of Management bitfields · b19601bb
      Tomer Tayar committed
      The management firmware HSI contains masks which are already
      shifted to their right place, so QED_MFW_SET_FIELD() is clearing
      incorrect fields by shifting the mask by the offset.
      
      Luckily, today we set the fields in an incrementing order [so we're
      not erasing any previously set fields], but this still needs fixing.
      Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
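The problem described in the message above, shifting a mask that is already in place, can be reproduced with a small standalone sketch. The field layout and the function names below are hypothetical and chosen only for illustration; they are not the HSI definitions or the actual QED_MFW_SET_FIELD() macro.

```c
#include <stdint.h>

/* Hypothetical field layout; the masks are already shifted into place. */
#define FIELD_B_MASK   0x0000ff00u
#define FIELD_B_SHIFT  8
#define FIELD_C_MASK   0x00ff0000u
#define FIELD_C_SHIFT  16

/* Buggy variant: shifts a pre-shifted mask again, so it clears the wrong bits. */
static uint32_t set_field_buggy(uint32_t reg, uint32_t mask,
                                unsigned int shift, uint32_t val)
{
    reg &= ~(mask << shift);
    reg |= (val << shift) & mask;
    return reg;
}

/* Fixed variant: clears exactly the bits covered by the mask. */
static uint32_t set_field_fixed(uint32_t reg, uint32_t mask,
                                unsigned int shift, uint32_t val)
{
    reg &= ~mask;
    reg |= (val << shift) & mask;
    return reg;
}
```

In this sketch, setting field B after field C with the buggy variant wipes C, because the doubly shifted B mask lands on C's bits. Setting fields in increasing order avoids that, which is consistent with the message's remark that no previously set fields are erased today. The buggy variant also fails to clear a stale value in the field being written.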
    • qede: qedr closure after setting state · 2e7022d6
      Mintz, Yuval committed
      This is benign, but it makes more sense to start the close sequence
      only after changing the internal state [in case it would once care].
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Correct print in iscsi error-flow · 88fa9527
      Mintz, Yuval committed
      If too many CQs are requested, qed would print the available
      number as if it's a resource and not a feature, leading to the
      wrong print.
      
      Fixes: 08737a3f ("qed: Inform qedi the number of possible CQs")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Revise alloc/setup/free flow · 3587cb87
      Tomer Tayar committed
      Re-organize the logic that allocates and frees memory of various
      sub-components of the hw-function -
      
       a. No need to pass pointers to said structure as parameters;
          The internal logic knows exactly where to find/set the data.
      
       b. Nullify pointers after cleanup to prevent possible errors in
          re-entrant code.
      Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Don't use an internal MAC field · 492a1d98
      Mintz, Yuval committed
      Driver maintains its primary MAC in a private field which
      gets updated when ndo_dev_set_mac() gets called.
      
      However, there are flows where the primary MAC of the device can change
      without said NDO being called [bond device in TLB mode configuring
      slaves' addresses], resulting in a configuration where there's a mismatch
      between what's apparent to user [the netdevice's value] and what's
      configured in the HW [the private value].
      
      As we don't have any real motivation for maintaining this
      private field, simply remove it and start using the netdevice's
      field instead.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Add missing Status-block free · 71851ea5
      Sudarsana Reddy Kalluru committed
      When destroying the datapath channels, qede doesn't notify qed of the
      released status blocks which were acquired during the initialization.
      Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Honor user request for Tx buffers · 5a052d62
      Sudarsana Reddy Kalluru committed
      The driver always allocates the maximal number of tx-buffers irrespective of
      the actual Tx ring config.
      Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Allow WoL to activate by default · ba798b5b
      Mintz, Yuval committed
      When management firmware declares that the device is WoL-capable,
      the default driver behavior would be to allow the management firmware
      to take the decision of whether it's actually needed or not.
      
      The problem is that the ethtool interface doesn't have a 'default' kind
      of option, and the user would see the interface WoL as disabled,
      which doesn't accurately reflect the actual configuration.
      Moreover, if the user actually wants to explicitly disable WoL he'd have
      to first enable it [otherwise ethtool would block the command].
      
      Instead of allowing management to make the decision, enable WoL by
      default on all devices capable of it.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 20 May, 2017 12 commits
  3. 19 May, 2017 12 commits