1. 28 Feb, 2015 10 commits
    • team: allow TSO being set on master · 247f6d0f
      Committed by Jiri Pirko
      This patch allows TSO being set/unset on the master, so that GSO
      segmentation is done after team layer.
      
      Similar patch is present for bonding:
      	b0ce3508 ("bonding: allow TSO being set on bonding master")
      and bridge:
      	f902e881 ("bridge: Add ability to enable TSO")
      Suggested-by: Jiri Prochazka <jprochaz@redhat.com>
      Signed-off-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'fib_trie_remove_leaf_info' · 7eb60345
      Committed by David S. Miller
      Alexander Duyck says:
      
      ====================
      fib_trie: Remove leaf_info structure
      
      This patch set removes the leaf_info structure from the IPv4 fib_trie.  The
      general idea is that the leaf_info structure itself only held about 6
      actual bits of data, beyond that it was mostly just waste.  As such we can
      drop the structure, move the 1 byte representing the prefix/suffix length
      into the fib_alias and just link it all into one list.
      
      My testing shows that this saves somewhere between 4 and 10 ns depending on
      the type of test performed.  I suspect that this represents 1 to 2 L1
      cache misses saved per look-up.
      
      One side effect of this change is that semantic_match_miss will now only
      increment once per leaf instead of once per leaf_info miss.  However the
      stat is already skewed now that we perform a preliminary check on the leaf
      as a part of the look-up.
      
      I also have gone through and addressed a number of ordering issues in the
      first patch since I had misread the behavior of list_add_tail.
      
      I have since run some additional testing and verified the resulting lists
      are in the same order when combining multiple prefix length and tos values
      in a single leaf.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Remove leaf_info · 79e5ad2c
      Committed by Alexander Duyck
      At this point the leaf_info hash is redundant.  By adding the suffix length
      to the fib_alias hash list we no longer have need of leaf_info as we can
      determine the prefix length from fa_slen.  So we can compress things by
      dropping the leaf_info structure from fib_trie and instead directly connect
      the leaves to the fib_alias hash list.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Add slen to fib alias · 9b6ebad5
      Committed by Alexander Duyck
      Make use of an empty spot in the alias to store the suffix length so that
      we don't need to pull that information from the leaf_info structure.
      
      This patch also makes a slight change to the user statistics.  Instead of
      incrementing semantic_match_miss once per leaf_info miss we now just
      increment it once per leaf if a match was not found.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Replace plen with slen in leaf_info · 5786ec60
      Committed by Alexander Duyck
      This replaces the prefix length variable in the leaf_info structure with a
      suffix length value, or host identifier length in bits.  This makes
      sorting easier, since the tnodes and leaves already carry this value and
      it is compatible with the ->pos field in tnodes.
      
      I also cleaned up one spot that had some list manipulation that could be
      simplified.  I basically updated it so that we just use hlist_add_head_rcu
      instead of calling hlist_add_before_rcu on the first node in the list.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
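The relationship between prefix length and suffix length described in this commit can be illustrated with a minimal sketch. It assumes 32-bit IPv4 keys; the names `KEYLENGTH`, `plen_to_slen`, `slen_to_plen` and `host_mask` are hypothetical helpers chosen for illustration and are not taken from the patch.

```python
KEYLENGTH = 32  # assumption: number of bits in an IPv4 key


def plen_to_slen(plen):
    """Return the suffix (host identifier) length in bits for a prefix length."""
    if not 0 <= plen <= KEYLENGTH:
        raise ValueError("prefix length out of range")
    return KEYLENGTH - plen


def slen_to_plen(slen):
    """Recover the prefix length from a stored suffix length."""
    if not 0 <= slen <= KEYLENGTH:
        raise ValueError("suffix length out of range")
    return KEYLENGTH - slen


def host_mask(slen):
    """Bit mask covering the host (suffix) bits of a key."""
    return (1 << slen) - 1
```

For example, a /24 route has a suffix length of 8, and the prefix length can always be recovered from the stored suffix length, which is why only one of the two values needs to be kept.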
    • fib_trie: Convert fib_alias to hlist from list · 56315f9e
      Committed by Alexander Duyck
      There isn't any advantage to having it as a list, and by making it an hlist
      we make the fib_alias more compatible with the leaf_info in terms of the
      type of list used.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ip_level_multicast_join_leave' · 7705f730
      Committed by David S. Miller
      Madhu Challa says:
      
      ====================
      Multicast group join/leave at ip level
      
      This series enables configuring multicast group join/leave at ip level
      by extending the "ip address" command.
      
      It adds a new control socket mc_autojoin_sock and ifa_flag IFA_F_MCAUTOJOIN
      to invoke the corresponding igmp group join/leave api.
      
      Since the igmp group join/leave api takes the rtnl_lock the code had to
      be refactored by adding a shim layer prefixed by __ that can be invoked
      by code that already has the rtnl_lock. This way we avoid proliferation of
      work queues.
      
      The first patch in this series does the refactoring for igmp v6.
      It is based on the igmp v4 changes that were added by Eric Dumazet.
      
      The second patch in this series does the group join/leave based on the
      setting of the IFA_F_MCAUTOJOIN flag.
      
      v5:
      - addressed comments from Daniel Borkmann.
       - removed blank line in patch 1/2
       - removed unused variable, const arg in patch 2/2
      v4:
      - addressed comments from Yoshifuji Hideaki.
       - Remove WARN_ON not needed because we return a value from v2.
      - addressed comments from Daniel Borkmann.
       - rename sock to mc_autojoin_sk
       - ip_mc_config() pass ifa so it needs one less argument.
       - igmp_net_{init|destroy}() use inet_ctl_sock_{create|destroy}
       - inet_rtm_newaddr() change scope of ret.
       - igmp_net_init() no need to initialize sock to NULL.
      v3:
      - addressed comments from David Miller.
       - fixed indentation and local variable order.
      v2:
      - addressed comments from Eric Dumazet.
       - removed workqueue and call __ip_mc_{join|leave}_group or
         __ipv6_sock_mc_{join|drop}
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • multicast: Extend ip address command to enable multicast group join/leave on · 93a714d6
      Committed by Madhu Challa
      Joining multicast group on ethernet level via "ip maddr" command would
      not work if we have an Ethernet switch that does igmp snooping since
      the switch would not replicate multicast packets on ports that did not
      have IGMP reports for the multicast addresses.
      
      Linux vxlan interfaces created via "ip link add vxlan" have the group option
      that enables them to do the required join.
      
      By extending ip address command with option "autojoin" we can get similar
      functionality for openvswitch vxlan interfaces as well as other tunneling
      mechanisms that need to receive multicast traffic. The kernel code is
      structured similar to how the vxlan driver does a group join / leave.
      
      example:
      ip address add 224.1.1.10/24 dev eth5 autojoin
      ip address del 224.1.1.10/24 dev eth5
      Signed-off-by: Madhu Challa <challa@noironetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • igmp v6: add __ipv6_sock_mc_join and __ipv6_sock_mc_drop · 46a4dee0
      Committed by Madhu Challa
      Based on the igmp v4 changes from Eric Dumazet.
      959d10f6 ("igmp: add __ip_mc_{join|leave}_group()")
      
      These changes are needed to perform igmp v6 join/leave while
      RTNL is held.
      
      Make ipv6_sock_mc_join and ipv6_sock_mc_drop wrappers around
      __ipv6_sock_mc_join and __ipv6_sock_mc_drop to avoid
      proliferation of work queues.
      Signed-off-by: Madhu Challa <challa@noironetworks.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
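The wrapper arrangement described in this commit, where a public function takes the lock and delegates to a variant that expects the lock to be held already, can be modelled with a minimal sketch. This is not the kernel code: `rtnl_lock` is a plain `threading.Lock` standing in for RTNL, and `mc_join`, `mc_join_locked` and `configure_address` are hypothetical names used only for illustration.

```python
import threading

rtnl_lock = threading.Lock()  # assumption: stand-in for the kernel's RTNL lock
joined_groups = set()         # hypothetical record of joined multicast groups


def mc_join_locked(group):
    """Join a group. The caller must already hold rtnl_lock."""
    assert rtnl_lock.locked(), "caller must hold rtnl_lock"
    joined_groups.add(group)


def mc_join(group):
    """Public entry point: take the lock, then call the locked variant."""
    with rtnl_lock:
        mc_join_locked(group)


def configure_address(addr, autojoin):
    """A path that already holds the lock calls the locked variant directly.

    Calling mc_join() here would deadlock, because the lock is not reentrant.
    """
    with rtnl_lock:
        if autojoin:
            mc_join_locked(addr)
```

Because the lock-holding path can call the locked variant synchronously, it does not need to defer the join to a work queue.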
    • udp: In udp_flow_src_port use random hash value if skb_get_hash fails · 723b8e46
      Committed by Tom Herbert
      In the unlikely event that skb_get_hash is unable to deduce a hash
      in udp_flow_src_port we use a consistent random value instead.
      This is specified in GRE/UDP draft section 3.2.1:
      https://tools.ietf.org/html/draft-ietf-tsvwg-gre-in-udp-encap-04
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
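The fallback described in this commit can be sketched roughly as follows. This is a simplified model, not the kernel implementation: it assumes a 32-bit flow hash where 0 means "no hash available", and the names `FLOW_HASHRND` and `flow_src_port` as well as the port range used in the usage note are hypothetical.

```python
import random

# Assumption: one random value chosen once and reused, so that flows
# without a hash always map to the same source port.
FLOW_HASHRND = random.getrandbits(32) or 1


def flow_src_port(flow_hash, port_min, port_max):
    """Map a 32-bit flow hash onto the port range [port_min, port_max)."""
    if flow_hash == 0:
        # No hash could be deduced: fall back to the consistent random value.
        flow_hash = FLOW_HASHRND
    # Scale the 32-bit hash into the range without using a modulo.
    return port_min + ((flow_hash * (port_max - port_min)) >> 32)
```

With an illustrative range of 32768 to 61000, repeated calls with a zero hash return the same port each time, while distinct non-zero hashes spread across the range.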
  2. 27 Feb, 2015 5 commits
  3. 26 Feb, 2015 6 commits
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · 009f33ed
      Committed by David S. Miller
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2015-02-24
      
      This series contains updates to i40e and i40evf only, which bumps their
      versions to i40e 1.2.9 and i40evf 1.2.3.
      
      Paul fixes i40e_debug_aq() for big endian machines by adding the
      appropriate LExx_TO_CPU wrappers.
      
      Catherine adds a requested speed variable to the link_status to store the
      last speeds we requested from the firmware and use the advertised speed
      settings in get_settings in ethtool now that we have it.  Due to the
      new code addition, she also refactors get_settings to improve readability
      and to accommodate some of the longer lines of code by adding two
      functions i40e_get_settings_link_up() and i40e_get_settings_link_down().
      
      Carolyn adds a struct to the VSI struct to keep track of RXNFC settings
      done via ethtool.  Adds more information to the interrupt vector
      names, specifically to the VF misc vector name so that we can distinguish
      between all the interrupts.
      
      Ashish enables the i40evf driver to enable debug prints via ethtool.
      
      Mitch updates i40e to enable packet split only when IOMMU is in use,
      since it shows a distinct advantage over the single-buffer path
      because it minimizes DMA mapping and unmapping.  Also adds the receive
      routine in use to the features log message to be able to print the
      receive packet split status.
      
      Greg adds the ability to get, set and commit permanently the NPAR
      partition BW configuration through configfs.  Enables an application
      to query the i40e driver's private flags to get the status of NPAR
      enablement via ethtool.
      
      Neerav adds support for bridge offload ndo_ops getlink and setlink
      to enable bridge hardware mode as per the mode set via IFLA_BRIDGE_MODE.
      The support is only enabled in the case of a PF VSI and not available for
      any other VSI type.
      
      Kevin fixes i40e by ensuring the BUF and FLAG_RD flags are set for
      indirect admin queue command.
      
      Vasu updates the driver to setup FCoE netdev device type as "fcoe", so that
      it shows up in sysfs as FCoE device.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: Introduce dsa_is_port_initialized · d79d2107
      Committed by Guenter Roeck
      To avoid race conditions when using the ds->ports[] array,
      we need to check if the accessed port has been initialized.
      Introduce the helper function dsa_is_port_initialized
      for that purpose and use it where needed.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sf2_hwbridge' · bb66be1c
      Committed by David S. Miller
      Florian Fainelli says:
      
      ====================
      net: dsa: integration with SWITCHDEV for HW bridging
      
      This patch set provides the DSA and SWITCHDEV integration bits together and
      modifies the bcm_sf2 driver accordingly such that it works properly with HW
      bridging.
      
      Changes in v3:
      
      - add back the null pointer check in dsa_slave_br_port_mask from Guenter
      - slightly rework patch 1 commit message not to mention the function name
        we add in patch 2
      
      Changes in v2:
      
      - avoid a race condition in how DSA network devices are created, patch from
        Guenter Roeck
      - provide a consistent and working STP state once a port leaves the bridge
      - retain a bridge device pointer to properly flag port/bridge membership
      - properly flush the ARL (Address Resolution Logic) in bcm_sf2.c
      - properly retain port membership when individually bringing devices up/down
        while they are members of a bridge
      
      We discussed on the mailing list the possibility of standardizing a "fdb_flush"
      operation for DSA switch drivers. Looking at the Marvell and Broadcom switches,
      I am not convinced this is practical or desirable, as the terminologies vary
      here, but there is nothing preventing us from doing it later.
      
      Many thanks to Guenter and Andrew for both testing and providing feedback.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: add HW bridging support · 12f460f2
      Committed by Florian Fainelli
      Implement the bridge join, leave and set_stp callbacks by making sure that
      we do the following:
      
      - when a port joins the bridge, all existing ports in the bridge get
        their VLAN control register updated with that joining port
      - the joining port includes all existing bridge ports in its own
        VLAN control register
      
      The leave operation is fairly similar; special care must be taken to
      make sure that the port leaving the bridge does not remove itself from
      its own VLAN control register.
      
      Since the various BR_* states apply directly to our HW semantics, we
      just need to translate these constants into their corresponding HW
      settings, and voila!
      
      We make sure to trigger a fast-ageing process for ports that are
      joining/leaving the bridge and transition from incompatible states, this
      is equivalent to triggering an ARL flush for that port.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
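The join and leave bookkeeping described in this commit can be modelled with a short sketch. It is an illustration under stated assumptions, not the driver code: each port's VLAN control register is represented as an integer bit mask in a dictionary, every port is assumed to start with only its own bit set, and `join_bridge` and `leave_bridge` are hypothetical names.

```python
def join_bridge(vlan_ctl, members, port):
    """Add `port` to the bridge and update the per-port bit masks."""
    for other in members:
        vlan_ctl[other] |= 1 << port   # existing members learn the new port
        vlan_ctl[port] |= 1 << other   # the new port learns existing members
    members.add(port)


def leave_bridge(vlan_ctl, members, port):
    """Remove `port` from the bridge without clearing its own bit."""
    members.discard(port)
    for other in members:
        vlan_ctl[other] &= ~(1 << port)  # remaining members forget the port
        vlan_ctl[port] &= ~(1 << other)  # the port forgets remaining members
    # The leaving port's own bit in vlan_ctl[port] is deliberately untouched.
```

Starting from `vlan_ctl = {p: 1 << p for p in range(4)}`, joining ports 0, 1 and 2 gives each of them the mask `0b111`; after port 1 leaves, ports 0 and 2 hold `0b101` and port 1 is back to `0b010`.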
    • net: dsa: integrate with SWITCHDEV for HW bridging · b73adef6
      Committed by Florian Fainelli
      In order to support bridging offloads in DSA switch drivers, select
      NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
      NDOs that we are required to implement.
      
      To facilitate the integration at the DSA driver level, we implement 3
      types of operations:
      
      - port_join_bridge
      - port_leave_bridge
      - port_stp_update
      
      DSA will resolve which switch ports are currently bridge port
      members, as some switch hardware/drivers need to know about that to limit
      the register programming to just the relevant registers (especially for
      slow MDIO buses).
      
      We also take care of setting the correct STP state when slave network
      devices are brought up/down while being bridge members.
      
      Finally, when a port is leaving the bridge, we make sure we set it to the
      BR_STATE_FORWARDING state, otherwise the bridge layer would leave it
      disabled as a result of having left the bridge.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: Ensure that port array elements are initialized before being used · d87d6f44
      Committed by Guenter Roeck
      A network device notifier can be called for one or more of the created
      slave devices before all slave devices have been registered. This can
      result in a mismatch between ds->phys_port_mask and the registered devices
      by the time the call is made, and it can result in a slave device being
      added to a bridge before its entry in ds->ports[] has been initialized.
      
      Rework the initialization code to initialize entries in ds->ports[] in
      dsa_slave_create. With this change, dsa_slave_create no longer needs
      to return slave_dev but can return an error code instead.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 25 Feb, 2015 19 commits