1. 23 Jul, 2015 10 commits
  2. 22 Jul, 2015 30 commits
    • net: track success and failure of TCP PMTU probing · b56ea298
      Committed by Rick Jones
      Track success and failure of TCP PMTU probing.
      Signed-off-by: Rick Jones <rick.jones2@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Add debugfs entry to enable backdoor access · 0b2c2a93
      Committed by Hariprasad Shenai
      Add debugfs entry 'use_backdoor' to enable backdoor access to read sge
      context. By default, we read sge contexts via firmware. In case of FW
      issues, one can enable backdoor access via debugfs to dump sge context
      for debugging purposes.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mpls: make RTA_OIF optional · 01faef2c
      Committed by Roopa Prabhu
      If the user did not specify an oif, try to get it from the via address.
      If the device cannot be found, return -ENODEV.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sfc-filter-chaining' · fd36ef60
      Committed by David S. Miller
      Edward Cree says:
      
      ====================
      sfc: support for cascaded multicast filtering
      
      Recent versions of firmware for SFC9100 adapters add support for filter
       chaining, in which packets matching multiple filters are delivered to all
       filters' recipients, rather than only the highest match-priority filter as was
       previously the case.
      This patch series enables this feature and redesigns the filter handling code
       to make use of it; in particular, subscribing to a multicast address on one
       function no longer prevents traffic to that address reaching another function
       which is in promiscuous or allmulti mode.
      If the firmware does not support filter chaining, the driver will fall back to
       the old behaviour.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode · 12fb0da4
      Committed by Edward Cree
      Separate functions for inserting individual and promisc filters; explicit
       fallback logic in efx_ef10_filter_sync_rx_mode(), in order not to overload
       the 'promisc' flag as also meaning "fall back to promisc".
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: support cascaded multicast filters · ab8b1f7c
      Committed by Daniel Pieczko
      If the workaround to support cascaded multicast filters ("workaround_26807") is
      enabled, the broadcast filter and individual multicast filters are not inserted
      when in promiscuous or allmulti mode.
      
      There is a race while inserting and removing filters when entering and leaving
      promiscuous mode.  When changing promiscuous state with cascaded multicast
      filters, the old multicast filters are removed before inserting the new filters
      to avoid duplicating packets; this can lead to dropped packets until all
      filters have been inserted.
      
      The efx_nic:mc_promisc flag is added to record the presence of a multicast
      promiscuous filter; this gives a simple way to tell if the promiscuous state is
      changing.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: re-factor efx_ef10_filter_sync_rx_mode() · 822b96f8
      Committed by Daniel Pieczko
      This change is only re-factoring; there are no changes to functionality
       except for a slight elaboration of an error message (on mismatch filter
       insertion failure).
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: Insert multicast filters as well as mismatch filters in promiscuous mode · b6f568e2
      Committed by Jon Cooper
      If a function is in promiscuous mode and another function has a broadcast or
       multicast filter inserted, the function in promiscuous mode won't see that
       broadcast or multicast traffic.
      Most notably this breaks broadcast, which means ARP doesn't work. Less
       show-stoppingly, a function listening on a multicast address that's also in
       promiscuous mode will not see that multicast traffic if another function is
       also listening on that multicast address.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: warn if other functions have been reset by MCFW · 5a55a72a
      Committed by Daniel Pieczko
      When enabling the workaround for cascaded multicast filters, the MC
       can reset other functions if they have already inserted filters.
       In that case, the workaround has been enabled, but print an info
       message in the log recording that other functions had to be reset.
      
      As other functions were reset, the MC will have incremented its boot
       count, so also increment the warm_boot_count on the function which
       enabled the workaround, as that function won't have received an MC
       reboot event and does not need to reset.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: add output flag decoding to efx_mcdi_set_workaround · 34ccfe6f
      Committed by Daniel Pieczko
      The initial use of this will be to check a flag reporting if an FLR was
      performed on other functions when enabling cascaded multicast filters.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: cope with ENOSYS from efx_mcdi_get_workarounds() · 832dc9ed
      Committed by Edward Cree
      GET_WORKAROUNDS was only introduced in May 2014, so not all firmware
       will have it.  Call sites therefore need to handle ENOSYS.
      In this case we're probing the bug26807 workaround, which is not
       implemented in any firmware that doesn't have GET_WORKAROUNDS.
       So interpret ENOSYS as 'false'.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
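The ENOSYS handling described in this commit can be sketched roughly as follows. This is a minimal, self-contained illustration and not the driver's code: `fake_get_workarounds()`, `probe_bug26807()` and the `WORKAROUND_BUG26807` bit are hypothetical names standing in for the MCDI query and its flag, and the firmware is modelled by two plain parameters.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bit for the bug26807 workaround (assumption). */
#define WORKAROUND_BUG26807 (1u << 0)

/* Hypothetical stand-in for the firmware query. Returns 0 and fills
 * *enabled on success, or -ENOSYS when the (modelled) firmware predates
 * the GET_WORKAROUNDS command. */
static int fake_get_workarounds(bool fw_has_cmd, unsigned int fw_mask,
                                unsigned int *enabled)
{
    if (!fw_has_cmd)
        return -ENOSYS;
    *enabled = fw_mask;
    return 0;
}

/* Probe the workaround state. -ENOSYS is interpreted as "not active",
 * since firmware without the query cannot implement the workaround.
 * Any other error is passed back to the caller. */
static int probe_bug26807(bool fw_has_cmd, unsigned int fw_mask, bool *active)
{
    unsigned int enabled = 0;
    int rc = fake_get_workarounds(fw_has_cmd, fw_mask, &enabled);

    if (rc == -ENOSYS) {
        *active = false;
        return 0;
    }
    if (rc)
        return rc;

    *active = (enabled & WORKAROUND_BUG26807) != 0;
    return 0;
}
```

The point of the design is that a missing command is not treated as a probe failure: the caller continues with the workaround recorded as disabled.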
    • sfc: enable cascaded multicast filters in MCFW · 46e612b0
      Committed by Daniel Pieczko
      After creating event queue 0, check to see if the workaround is enabled,
       and enable it if necessary.  This will be called during PCI probe and
       also when coming back up after a reset.  The nic_data->workaround_26807
       will be used in the future to control the filter insertion behaviour
       based on this workaround.
      
      Only the primary PF can enable this workaround, so tolerate an EPERM
       error and continue.  Otherwise, if any step in the checking and enabling
       of the workaround fails, the event queue must be removed.
      
      We check that the workaround is implemented before trying to enable it,
       and store the current workaround setting before trying to change it.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: update MCDI protocol definitions · a9196bb0
      Committed by Edward Cree
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix compatibility bug · 16040894
      Committed by Jon Paul Maloy
      In commit d999297c
      ("tipc: reduce locking scope during packet reception") we introduced
      a new function tipc_link_proto_rcv(). This function contains a bug
      that sometimes causes it to send out a non-zero link priority value
      in the protocol messages it creates.

      The bug may lead to an extra link reset during initial link establishment
      with older nodes. This will never happen more than once, after which
      the link will work as intended.
      
      We fix this bug in this commit.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'explicit-inbound-link-state' · 67b2914b
      Committed by David S. Miller
      Florian Fainelli says:
      
      ====================
      net: enable inband link state negotiation only when explicitly requested
      
      Changes in v5:
      
      - removed an invalid use of the link_update callback in the SF2 driver
        that appeared after merging "net: phy: fixed_phy: handle link-down case"
      
      - reworded the commit message for patch 2 to make it clear what it fixes and
        why this is required
      
      Initial cover letter from Stas:
      
      Hello.
      
      Currently the link status auto-negotiation is enabled
      for any SGMII link with fixed-link DT binding.
      The regression was reported:
      https://lkml.org/lkml/2015/7/8/865
      Apparently not all HW that implements the SGMII protocol generates the
      inband status needed for the auto-negotiation to work.
      More details here:
      https://lkml.org/lkml/2015/7/10/206

      The following patches revert to the old behavior by default,
      which is to not enable the auto-negotiation for fixed-link.
      A new DT property is added that allows the auto-negotiation to be
      requested explicitly.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mvneta: use inband status only when explicitly enabled · f8af8e6e
      Committed by Stas Sergeev
      The commit 898b2970 ("mvneta: implement SGMII-based in-band link state
      signaling") implemented the link parameters auto-negotiation unconditionally.
      Unfortunately it appears that some HW that implements the SGMII protocol
      doesn't generate the inband status, so it is not possible to auto-negotiate
      anything with such HW.
      
      This patch enables the auto-negotiation only if explicitly requested with
      the 'managed' DT property.
      
      This patch fixes the following regression:
      https://lkml.org/lkml/2015/7/8/865

      Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>

      CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      CC: netdev@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • of_mdio: add new DT property 'managed' to specify the PHY management type · 4cba5c21
      Committed by Stas Sergeev
      Currently the PHY management type is selected arbitrarily by the MAC driver.
      The decision is based on the presence of the "fixed-link" node and on the
      preference of the driver's authors.
      This recently caused a regression, when the mvneta driver suddenly started
      to use the in-band status for auto-negotiation on fixed links.
      It appears the auto-negotiation may not work when expected by the MAC driver.
      Sebastien Rannou explains:
      << Yes, I confirm that my HW does not generate an in-band status. AFAIK, it's
      a PHY that aggregates 4xSGMIIs to 1xQSGMII ; the MAC side of the PHY (with
      inband status) is connected to the switch through QSGMII, and in this context
      we are on the media side of the PHY. >>
      https://lkml.org/lkml/2015/7/10/206
      
      This patch introduces the new string property 'managed' that allows
      the user to set the management type explicitly.
      The supported values are:
      "auto" - default. Uses either MDIO or nothing, depending on the presence
      of the fixed-link node
      "in-band-status" - use in-band status
      Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
      
      CC: Rob Herring <robh+dt@kernel.org>
      CC: Pawel Moll <pawel.moll@arm.com>
      CC: Mark Rutland <mark.rutland@arm.com>
      CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
      CC: Kumar Gala <galak@codeaurora.org>
      CC: Florian Fainelli <f.fainelli@gmail.com>
      CC: Grant Likely <grant.likely@linaro.org>
      CC: devicetree@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      CC: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
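The two values of the 'managed' property listed in this commit could be mapped to a management type along these lines. This is only a sketch under stated assumptions: `enum phy_mgmt` and `parse_managed()` are hypothetical names, an absent property is modelled as a NULL string, and the real kernel reads the property through its device tree helpers rather than taking a raw string.

```c
#include <string.h>

/* Hypothetical management types, one per documented property value. */
enum phy_mgmt {
    PHY_MGMT_AUTO,            /* "auto": MDIO or nothing, per fixed-link */
    PHY_MGMT_IN_BAND_STATUS,  /* "in-band-status" */
    PHY_MGMT_INVALID          /* any other string */
};

/* Map the property string to a management type.
 * value == NULL models a missing property, which falls back to the
 * documented default "auto". */
static enum phy_mgmt parse_managed(const char *value)
{
    if (value == NULL || strcmp(value, "auto") == 0)
        return PHY_MGMT_AUTO;
    if (strcmp(value, "in-band-status") == 0)
        return PHY_MGMT_IN_BAND_STATUS;
    return PHY_MGMT_INVALID;
}
```

Keeping "auto" as the result for a missing property preserves the old behaviour for existing device trees, which is what the regression fix requires.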
    • net: phy: fixed_phy: handle link-down case · 868a4215
      Committed by Stas Sergeev
      fixed_phy_register() currently hardcodes the fixed PHY link to 1, and
      expects to find a "speed" parameter to provide correct information
      towards the fixed PHY consumer.
      
      In a subsequent change, where we allow "managed" (e.g: (RS)GMII in-band
      status auto-negotiation) fixed PHYs, none of these parameters can be
      provided since they will be auto-negotiated, hence, we just provide a
      zero-initialized fixed_phy_status to fixed_phy_register() which makes it
      fail when we call fixed_phy_update_regs() since status.speed = 0 which
      makes us hit the "default" label and error out.
      
      Without this change, we would also see potentially inconsistent
      speed/duplex parameters for fixed PHYs when the link is DOWN.
      
      CC: netdev@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
      [florian: add more background to why this is correct and desirable]
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
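The failure mode described above (a zero-initialized status hitting the "default" label) and the link-down handling can be illustrated with a simplified model. It is not the kernel's fixed_phy_update_regs(): `struct fixed_status` and `fixed_status_check()` are hypothetical, the accepted speeds 10/100/1000 are an assumption, and the real function also programs emulated MII registers, which is omitted here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced version of a fixed PHY status. */
struct fixed_status {
    bool link;    /* true when the link is up */
    int speed;    /* Mb/s, only meaningful when link is up */
    bool duplex;  /* full duplex, only meaningful when link is up */
};

/* Validate a status before it is used.
 * When the link is down, speed and duplex are not inspected, so a
 * zero-initialized status is accepted instead of being rejected. */
static int fixed_status_check(const struct fixed_status *s)
{
    if (!s->link)
        return 0;

    switch (s->speed) {
    case 10:
    case 100:
    case 1000:
        return 0;
    default:
        /* Unknown speed on an active link is still an error. */
        return -EINVAL;
    }
}
```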
    • net: dsa: bcm_sf2: Do not override speed settings · d2eac98f
      Committed by Florian Fainelli
      The SF2 driver currently overrides speed settings for its port
      configured using a fixed PHY. This is both unnecessary and incorrect,
      because we keep feeding back to the hardware the parameters that we read
      from the PHY device, which in the case of a fixed PHY cannot possibly
      change speed.
      
      This is a required change to allow the fixed PHY code to allow
      registering a PHY with a link configured as DOWN by default and avoid
      some sort of circular dependency where we require the link_update
      callback to run to program the hardware, and we then utilize the fixed
      PHY parameters to program the hardware with the same settings.
      
      Fixes: 246d7f77 ("net: dsa: add Broadcom SF2 switch driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: #ifdefify sk_classid member of struct sock · e181a543
      Committed by Mathias Krause
      The sk_classid member is only required when CONFIG_CGROUP_NET_CLASSID is
      enabled. #ifdefify it to reduce the size of struct sock on 32 bit
      systems, at least.
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
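The pattern of guarding a member with its config option looks roughly like this. `struct mini_sock` and `mini_sock_classid()` are hypothetical and heavily reduced; the real struct sock has many more members, and only the `CONFIG_CGROUP_NET_CLASSID` symbol and the `sk_classid` name come from the commit message.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced socket structure. Build with
 * -DCONFIG_CGROUP_NET_CLASSID to include the guarded member. */
struct mini_sock {
    uint32_t sk_mark;
#ifdef CONFIG_CGROUP_NET_CLASSID
    uint32_t sk_classid;  /* only present when the option is enabled */
#endif
};

/* Accessor that hides the conditional member from callers, so code
 * using it compiles with or without the option. */
static inline uint32_t mini_sock_classid(const struct mini_sock *sk)
{
#ifdef CONFIG_CGROUP_NET_CLASSID
    return sk->sk_classid;
#else
    (void)sk;
    return 0;
#endif
}
```

Without the option defined, the structure in this sketch shrinks by one 32-bit member, which is the kind of saving the commit is after.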
    • Merge branch 'lwtunnel' · e69724f3
      Committed by David S. Miller
      Thomas Graf says:
      
      ====================
      Lightweight & flow based encapsulation
      
      This series combines the work previously posted by Roopa, Robert and
      myself. It's according to what we discussed at NFWS. The motivation
      of this series is to:
      
       * Consolidate code between OVS and the rest of the kernel and get
         rid of OVS vports and instead represent them as pure net_devices.
       * Introduce a lightweight tunneling mechanism which enables flow
         based encapsulation to improve scalability on both RX and TX.
       * Do the above in an encapsulation unspecific way so that the
         encapsulation type is eventually abstracted away from the user.
       * Use the same forwarding decision for both native forwarding and
         encapsulation thus allowing to switch between native IPv6 and
         UDP encapsulation based on endpoint without requiring additional
         logic
      
      The fundamental changes introduced in this series are:
       * A new RTA_ENCAP Netlink attribute for routes carrying encapsulation
         instructions. Depending on the specified type, the instructions
         apply to UDP encapsulations, MPLS and possibly others in the future.
       * Depending on the encapsulation type, the output function of the
         dst is directly overwritten or the dst merely attaches metadata and
         relies on a subsequent net_device to apply it to the packet. The
         latter is typically used if an inner and outer IP header exist which
         require two subsequent routing lookups to be performed.
       * A new metadata_dst structure which can be attached to skbs to
         carry metadata in between subsystems. This new metadata transport
         is used to provide a single interface for VXLAN, routing and OVS
         to communicate through metadata.
      
      The OVS interfaces remain as-is but will transparently create a real
      VXLAN net_device in the background. iproute2 is extended with new
      use cases:
      
        VXLAN:
        ip route add 40.1.1.1/32 encap vxlan id 10 dst 50.1.1.2 dev vxlan0
      
        MPLS:
        ip route add 10.1.1.0/30 encap mpls 200 via inet 10.1.1.1 dev swp1
      
      Performance implications:
        The additional memory allocation in the receive path should have
        performance implications although it is not observable in standard
        throughput tests if GRO is properly done. The correct net_device
        model outweighs the additional cost of the allocation. Furthermore,
        this implication can be relaxed by reintroducing a direct unqueued
        path from a software device to a consumer like bridge or OVS if
        needed.
      
          $ netperf  -t TCP_STREAM -H 15.1.1.201
          MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
          15.1.1.201 (15.1.1.201) port 0 AF_INET : demo
          Recv   Send    Send
          Socket Socket  Message  Elapsed
          Size   Size    Size     Time     Throughput
          bytes  bytes   bytes    secs.    10^6bits/sec
      
           87380  16384  16384    10.00    9118.17
      
      Changes since v1:
       * Properly initialize tun_id as reported by Julian
       * Drop duplicate netif_keep_dst() as reported by Alexei
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Use regular VXLAN net_device device · 614732ea
      Committed by Thomas Graf
      This gets rid of all OVS specific VXLAN code in the receive and
      transmit path by using a VXLAN net_device to represent the vport.
      Only a small shim layer remains which takes care of handling the
      VXLAN specific OVS Netlink configuration.
      
      Unexports vxlan_sock_add(), vxlan_sock_release(), vxlan_xmit_skb()
      since they are no longer needed.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Abstract vport name through ovs_vport_name() · c9db965c
      Committed by Thomas Graf
      This makes it possible to get rid of the get_name() vport ops later on.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Move dev pointer into vport itself · be4ace6e
      Committed by Thomas Graf
      This is the first step in representing all OVS vports as regular
      struct net_devices. Move the net_device pointer into the vport
      structure itself to get rid of struct vport_netdev.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Make tunnel set action attach a metadata dst · 34ae932a
      Committed by Thomas Graf
      Utilize the new metadata dst to attach encapsulation instructions to
      the skb. The existing egress_tun_info via the OVS_CB() is left in
      place until all tunnel vports have been converted to the new method.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Factor out device configuration · 0dfbdf41
      Committed by Thomas Graf
      This factors the device configuration out of the RTNL newlink
      API, which allows for in-kernel creation of VXLAN net_devices.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib: Add fib rule match on tunnel id · e7030878
      Committed by Thomas Graf
      This adds the ability to select a routing table based on the tunnel
      id, which makes it possible to maintain separate routing tables for each
      virtual tunnel network.
      
      ip rule add from all tunnel-id 100 lookup 100
      ip rule add from all tunnel-id 200 lookup 200
      
      A new static key controls the collection of metadata at tunnel level
      upon demand.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
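The effect of the two `ip rule` lines in this commit, selecting a routing table by tunnel id, can be modelled in a few lines. This is a conceptual sketch only: `struct tun_rule` and `select_table()` are hypothetical and do not mirror the kernel's fib rules structures, and the fallback table number is an arbitrary parameter chosen by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: packets carrying tun_id are looked up in table. */
struct tun_rule {
    uint64_t tun_id;
    uint32_t table;
};

/* Return the table of the first rule whose tunnel id matches the
 * packet's tunnel id, or fallback_table when no rule matches. */
static uint32_t select_table(const struct tun_rule *rules, size_t n,
                             uint64_t pkt_tun_id, uint32_t fallback_table)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].tun_id == pkt_tun_id)
            return rules[i].table;
    }
    return fallback_table;
}
```

With rules for tunnel ids 100 and 200 as in the commit message, each virtual network resolves to its own table and all other traffic uses the fallback.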
    • route: Per route IP tunnel metadata via lightweight tunnel · 3093fbe7
      Committed by Thomas Graf
      This introduces a new IP tunnel lightweight tunnel type which allows
      IP tunnel instructions to be specified per route. Only IPv4 is supported
      at this point.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • route: Extend flow representation with tunnel key · 1b7179d3
      Committed by Thomas Graf
      Add a new flowi_tunnel structure which is a subset of ip_tunnel_key to
      allow routes to match on tunnel metadata. For now, the tunnel id is
      added to flowi_tunnel which allows for routes to be bound to specific
      virtual tunnels.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Flow based tunneling · ee122c79
      Committed by Thomas Graf
      Allows putting a VXLAN device into a new flow-based mode in which
      skbs with an ip_tunnel_info dst metadata attached will be encapsulated
      according to the instructions stored in there, with the VXLAN device
      defaults taken into consideration.
      
      Similarly, on the receive side, if the VXLAN_F_COLLECT_METADATA flag is
      set, the packet processing will populate an ip_tunnel_info struct for
      each packet received and attach it to the skb using the new metadata
      dst.  The metadata structure will contain the outer header and tunnel
      header fields which have been stripped off. Layers further up in the
      stack such as routing, tc or netfilter can later match on these fields
      and perform forwarding. It is the responsibility of upper layers to
      ensure that the flag is set if the metadata is needed. The flag limits
      the additional cost of metadata collection to cases where it is needed.
      
      This prepares the VXLAN device to be steered by the routing and other
      subsystems, which makes it possible to support encapsulation for a large
      number of tunnel endpoints and tunnel ids through a single net_device,
      improving scalability.
      
      It also allows for OVS to leverage this mode which in turn allows for
      the removal of the OVS specific VXLAN code.
      
      Because the skb is currently scrubbed in vxlan_rcv(), the attachment of
      the new dst metadata is postponed until after scrubbing, which requires
      the temporary addition of a new member to vxlan_metadata. This member
      is removed again in a later commit after the indirect VXLAN receive API
      has been removed.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>