1. 18 Dec, 2013 5 commits
    • sctp: Reorder 'struct association' members to reduce its size · be78cfcb
      wangweidong committed
      Members of 'struct association' are not ordered so as to
      reuse compiler-added padding on 64-bit architectures. This patch
      reorders those struct members and reduces the size of the
      structure from 2776 bytes to 2720 bytes on 64-bit architectures.
      Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
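The effect of member ordering on structure size can be sketched with a toy example. The two structs below are hypothetical and unrelated to the real SCTP association layout; they only show how grouping small members lets them share one padded slot. The exact number of bytes saved depends on the ABI's alignment rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: small members interleaved with 8-byte members. */
struct loose_order {
    uint8_t  state;    /* 1 byte, then padding before cookie */
    uint64_t cookie;
    uint8_t  flags;    /* 1 byte, then padding before timeout */
    uint64_t timeout;
    uint16_t port;     /* 2 bytes, then tail padding */
};

/* Same members, largest first, so the small ones share one padded slot. */
struct tight_order {
    uint64_t cookie;
    uint64_t timeout;
    uint16_t port;
    uint8_t  state;
    uint8_t  flags;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
size_t reorder_savings(void)
{
    assert(sizeof(struct tight_order) <= sizeof(struct loose_order));
    return sizeof(struct loose_order) - sizeof(struct tight_order);
}
```

For a real kernel structure, a tool such as pahole can list the holes the compiler inserted, which is usually the starting point for this kind of reordering.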
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · e4379310
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates
      
      This series contains updates to i40e only (again).
      
      Jesse provides a fix for when tx_rings structure is NULL and we do not want
      to panic. Then refactors the flow control set up and disables L2 flow control
      by default.  Provides some trivial fixes as well as prevent compiler warnings.
      Then to align to similar behaviour in ixgbe, use the total number of CPUs in
      the system to suggest the number of transmit and receive queue pairs.
      
      Shannon provides an i40e ethtool fix so that more reasonable information is
      reported back out to ethtool.  In addition, fixes PF reset after the offline
      test by reordering the tests to put the register test last, as it is the
      only one that needs a reset, and waiting to trigger the reset until after we
      clear the testing bit.  Lastly provides basic support for handling suspend
      and resume for now; Wake-On-LAN support will be added later.
      
      Anjali provides changes to tell the stack about our actual number of queues
      in order for RFS/RPS/XPS to work correctly.  Then provides several patches to
      implement dynamically changing the queue count for the main VSI.  Adds
      basic support for get/set channels for RSS so that the number of receive and
      transmit queue pairs can be changed via ethtool.  Cleans up the use of
      rtnl_lock in the reset patch since it runs from a work item.
      
      Neerav Parikh cleans up the VF interface to remove FCoE code as this
      feature will not be supported on VF interfaces.
      
      v2:
        - submitted patch 1 to net (since it was a fix needed for net), so dropped
          from this series (this patch will get added to net-next when Dave syncs
          his trees)
        - Dropped patches 4 & 11 from previous submission because of feedback
          received from Ben Hutchings and Sergei Shtylyov.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ovs_hash' · bc4d0f61
      David S. Miller committed
      Francesco Fusco says:
      
      ====================
      ovs: introduce arch-specific fast hashing improvements
      
      From: Daniel Borkmann <dborkman@redhat.com>
      
      We are introducing a fast hash function (see patch1) that can be
      used in the context of OpenVSwitch to reduce the hashing footprint
      (patch2). For details, please see individual patches!
      
      v1->v2:
       - Make hash generic and place it under lib
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ovs: use CRC32 accelerated flow hash if available · 500f8087
      Francesco Fusco committed
      Currently OVS uses jhash2() for calculating flow hashes in its
      internal flow_hash() function. The performance of the flow_hash()
      function is critical, as the input data can be hundreds of bytes
      long.
      
      OVS is largely deployed in x86_64 based datacenters.  Therefore,
      we argue that the performance critical fast path of OVS should
      exploit underlying CPU features in order to reduce the per packet
      processing costs. We replace jhash2() with the hash implementation
      provided by the kernel hash lib, which exploits the crc32l
      instruction to achieve high performance.
      
      Our patch greatly reduces the hash footprint from ~200 cycles for
      jhash2() to ~90 cycles for ovs_flow_hash_crc()
      (measured with rdtsc over maximum length flow keys on an i7 Intel
      CPU).
      
      Additionally, we wrote a microbenchmark to stress the flow table
      performance. The benchmark inserts random flows into the flow
      hash and then performs lookups. Our hash deployed on a CRC32
      capable CPU reduces the lookup for 1000 flows, 100 masks from
      ~10,100us to ~6,700us, for example.
      
      Thus, simply use the newly introduced arch_fast_hash2() as a
      drop-in replacement.
      Signed-off-by: Francesco Fusco <ffusco@redhat.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Thomas Graf <tgraf@redhat.com>
      Acked-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lib: introduce arch optimized hash library · 71ae8aac
      Francesco Fusco committed
      We introduce a new hashing library that is meant to be used in
      contexts where speed is more important than uniformity of the
      hashed values. The hash library leverages architecture-specific
      implementations to achieve high performance and falls back to
      jhash() for the generic case.
      
      On Intel-based x86 architectures, the library can exploit the crc32l
      instruction, part of the Intel SSE4.2 instruction set, if the
      instruction is supported by the processor. This implementation
      is twice as fast as the jhash() implementation on an i7 processor.
      
      Additional architectures, such as Arm64, provide instructions for
      accelerating the computation of CRC, so they could be added as well
      in follow-up work.
      Signed-off-by: Francesco Fusco <ffusco@redhat.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Thomas Graf <tgraf@redhat.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
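The idea of an architecture-accelerated hash with a generic fallback can be sketched in portable C. This is not the kernel's arch_fast_hash2(): all names below are hypothetical, the CRC32C step is computed bit by bit in software as a stand-in for the hardware crc32l instruction, and the fallback is a simple multiplicative mix rather than jhash(). The caller passes the CPU capability as a flag, where real code would query the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* One byte of CRC32C, bit by bit (software stand-in for the instruction). */
static uint32_t crc32c_byte(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & (0u - (crc & 1u)));
    return crc;
}

/* Fold one 32-bit word into the CRC, least significant byte first. */
static uint32_t crc32c_word(uint32_t crc, uint32_t word)
{
    for (int i = 0; i < 4; i++)
        crc = crc32c_byte(crc, (uint8_t)(word >> (8 * i)));
    return crc;
}

/* Conventional CRC32C of a byte buffer (initial and final inversion). */
uint32_t crc32c_buf(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    while (len--)
        crc = crc32c_byte(crc, *p++);
    return crc ^ 0xFFFFFFFFu;
}

typedef uint32_t (*hash2_fn)(const uint32_t *key, size_t words, uint32_t seed);

/* "Accelerated" variant: chain the CRC step over the key, seeded by seed. */
uint32_t hash2_crc(const uint32_t *key, size_t words, uint32_t seed)
{
    uint32_t h = seed;

    for (size_t i = 0; i < words; i++)
        h = crc32c_word(h, key[i]);
    return h;
}

/* Generic fallback: a simple multiplicative mix (not jhash). */
uint32_t hash2_generic(const uint32_t *key, size_t words, uint32_t seed)
{
    uint32_t h = seed;

    for (size_t i = 0; i < words; i++) {
        h ^= key[i];
        h *= 0x9E3779B1u;
        h ^= h >> 15;
    }
    return h;
}

/* Pick the implementation once; callers then use it as a drop-in hash. */
hash2_fn select_hash2(int cpu_has_crc32)
{
    return cpu_has_crc32 ? hash2_crc : hash2_generic;
}
```

Selecting the function pointer once keeps the per-packet path free of capability checks, which is the property a fast-path user such as a flow table would care about.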
  2. 17 Dec, 2013 5 commits
  3. 16 Dec, 2013 13 commits
  4. 15 Dec, 2013 1 commit
  5. 14 Dec, 2013 16 commits
    • ipv6: fix compiler warning in ipv6_exthdrs_len · f52d81dc
      Hannes Frederic Sowa committed
      Commit 299603e8 ("net-gro: Prepare GRO
      stack for the upcoming tunneling support") used an uninitialized variable
      which leads to the following compiler warning:
      
      net/ipv6/ip6_offload.c: In function ‘ipv6_gro_complete’:
      net/ipv6/ip6_offload.c:178:24: warning: ‘optlen’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          opth = (void *)opth + optlen;
                              ^
      net/ipv6/ip6_offload.c:164:22: note: ‘optlen’ was declared here
        int len = 0, proto, optlen;
                            ^
      Fix it up.
      
      Cc: Jerry Chu <hkchu@google.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
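The warning pattern quoted above, where a length variable is used to advance a pointer before the compiler can prove it was assigned, can be sketched outside the kernel. The packet layout, constants and function name below are hypothetical and this is not the code in net/ipv6/ip6_offload.c; the sketch only shows one way to make the first use well defined, by initializing the length to the size of the fixed header that the first advance must skip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_HDR_LEN 8u  /* hypothetical fixed header size in bytes */
#define PROTO_OPT     0u  /* hypothetical value: an option header follows */

/*
 * Hypothetical layout: byte 0 of every header names the next protocol;
 * byte 1 of an option header is its length in 8-byte units, minus one.
 * Returns the total length of the option headers after the fixed header.
 */
size_t opt_chain_len(const uint8_t *pkt, size_t pktlen)
{
    const uint8_t *hdr = pkt;
    size_t len = 0;
    /* Initialized so the first "hdr += optlen" skips the fixed header. */
    size_t optlen = FIXED_HDR_LEN;
    uint8_t proto;

    assert(pkt != NULL);
    if (pktlen < FIXED_HDR_LEN)
        return 0;

    proto = pkt[0];
    while (proto == PROTO_OPT) {
        hdr += optlen;                       /* well defined on every pass */
        if ((size_t)(hdr - pkt) + 2 > pktlen)
            break;                           /* truncated option header */
        optlen = ((size_t)hdr[1] + 1) * 8;
        if ((size_t)(hdr - pkt) + optlen > pktlen)
            break;                           /* option runs past the packet */
        len += optlen;
        proto = hdr[0];
    }
    return len;
}
```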
    • Merge branch 'bonding_rcu' · df012169
      David S. Miller committed
      Ding Tianhong says:
      
      ====================
      bonding: rebuild the lock use for bond monitor
      
      The bond slave list is no longer protected by the bond lock, only by RTNL,
      but the monitors still take the bond lock to protect the slave list,
      which is useless. According to Veaceslav, there are three ways to fix
      the protection problem:
      
      1. Call bond_master_upper_dev_link() and bond_upper_dev_unlink()
         under bond->lock, but it is unsafe to call call_netdevice_notifiers()
         while holding a write lock.
      2. Remove the useless bond->lock from the monitor functions and rely only
         on the existing rtnl_lock(), which costs performance in the fast path.
      3. Use RCU to protect the slave list, which performs better, although
         the gain does not matter in the slow path.
      
      Solution 1 obviously does not fit here, so I considered solutions 2 and 3.
      My principle is simple: in the fast path RCU is better, while in the
      slow path either works. According to Jay Vosburgh, the monitors would
      lose performance if RTNL protected the whole slave list, so remove the
      bond lock and replace it with RCU.
      
      The second problem is the bond's curr_slave_lock, which is outdated and
      unnecessary in many places, because curr_active_slave can only be
      changed in 3 places:
      
      1. enslave slave.
      2. release slave.
      3. change active slave.
      
      All of the above already hold the bond lock, RTNL and curr_slave_lock
      together, which is tedious; there is no need for so many locks. To change
      curr_active_slave you have to hold RTNL and curr_slave_lock together;
      to read curr_active_slave, holding either RTNL or curr_slave_lock
      is enough.
      
      For stability, I did not change the logic of the monitors; all changes
      are clear and simple. I have tested the patch set with lockdep and it
      works well and is stable.
      
      v2. Accepted Jay Vosburgh's opinion: remove the RTNL and replace it with RCU;
          also add some RCU functions for bond use, so the patch set reaches 10 patches.
      
      v3. Accepted Nikolay Aleksandrov's opinion: remove the unneeded bond_has_slave_rcu(),
          add protection for several 3ad mode handler functions and current_arp_slave,
          and rebuild bond_first_slave_rcu() to make it clearer.
      
      v4. Because struct netdev_adjacent should not exist in netdevice.h, I have
          to make a new function to support the macro bond_first_slave_rcu().
          Also add a new patch to simplify bond_resend_igmp_join_requests_delayed().
      
      v5. According to Jay Vosburgh, in patches 2 and 6 the peer notification rarely
          coincides with bond_xxx_commit() while monitoring is running, so the
          performance gain from folding two RTNL round trips into one is minimal and
          not needed. The reason is clear, so modify patches 2 and 6 to restore
          the peer notification under RTNL alone.
      ====================
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: rebuild the bond_resend_igmp_join_requests_delayed() · f2369109
      dingtianhong committed
      bond_resend_igmp_join_requests_delayed() and
      bond_resend_igmp_join_requests() should be merged,
      because bond_resend_igmp_join_requests_delayed() does
      nothing except call bond_resend_igmp_join_requests().
      
      The bond's igmp_retrans can only be changed in bond_change_active_slave()
      and here. bond_change_active_slave() is called under RTNL and curr_slave_lock,
      and bond_resend_igmp_join_requests() already holds RTNL, so there is no need
      to release RTNL and take curr_slave_lock again. As a small optimization,
      update igmp_retrans under RTNL and remove the curr_slave_lock.
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: remove unwanted lock for bond_store_primaryxxx() · 75ad932c
      dingtianhong committed
      bond_select_active_slave() no longer releases and reacquires the
      bond lock, so there is no need to read-lock the bond lock around it,
      and bond_store_primaryxxx() already runs under RTNL, so remove the
      unwanted lock.
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: remove unwanted lock for bond_option_active_slave_set() · 4e789fc1
      dingtianhong committed
      bond_option_active_slave_set() is always called under RTNL,
      and RTNL protects the bond slave list, so remove the unwanted
      bond lock.
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: add RCU for bond_3ad_state_machine_handler() · be79bd04
      dingtianhong committed
      bond_3ad_state_machine_handler() uses the bond lock to protect
      the bond slave list and the slave ports together, but that is not enough:
      the bond slave list is linked and unlinked under RTNL, not the bond lock,
      so I add RCU to keep slaves from leaving the list.
      
      The bond lock is still used here, because when a slave has been
      removed from the list by the time the state machine runs, it appears
      to be possible for both functions to manipulate the same aggregator->lag_ports
      by finding the aggregator via two different ports that are both members of
      that aggregator (i.e., port A of the agg is being unbound, and port B
      of the agg is running its state machine).
      
      If I removed the bond lock, nothing would serialize changes
      to aggregator->lag_ports between bond_3ad_state_machine_handler() and
      bond_3ad_unbind_slave(), so the bond lock is the simplest way to protect
      aggregator->lag_ports.
      
      Many functions need RCU protection. I had two choices
      to make them RCU-safe: (1) create new, similar functions
      that walk the bond slave list under RCU, or (2) modify the existing
      functions to run in a read-side critical section, since RCU
      read-side critical sections may be nested.
      
      I chose (2) because it avoids creating more similar functions.
      
      The notes in the function are outdated, so clean up the notes.
      Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: remove unwanted lock for bond enslave and release · c8517035
      dingtianhong committed
      bond_change_active_slave() and bond_select_active_slave()
      don't need the bond lock anymore, so remove the unwanted bond lock
      for these two functions.
      
      bond_select_active_slave() releases and reacquires
      curr_slave_lock, so curr_slave_lock must be held around
      the call.
      
      In bond enslave and bond release, the bond slave list is also
      protected by RTNL, so the bond lock is not needed; remove
      the lock and clean up the functions.
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: rebuild the lock use for bond_activebackup_arp_mon() · eb9fa4b0
      dingtianhong committed
      bond_activebackup_arp_mon() read-locks the bond lock to
      protect the slave list, which has no effect, and RTNL is only
      taken for bond_ab_arp_commit() and peer notification. For better
      performance, replace the bond lock with RCU. Because the bond slave
      list must then be walked under RCU, add a new bond_first_slave_rcu()
      to get the first slave under RCU protection.
      
      In bond_ab_arp_probe(), bond->current_arp_slave may change
      if the bond releases a slave, like this:
      
              bond_ab_arp_probe()                     bond_release()
              cpu 0                                   cpu 1
              ...
              if (bond->current_arp_slave...)         ...
              ...                             bond->current_arp_slave = NULL
              bond->current_arp_slave->dev->name      ...
      
      So current_arp_slave needs to be dereferenced inside the read-side critical section.
      Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
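The check-then-reread race in the diagram above can be illustrated in userspace C. This sketch is not RCU and not the bonding code: it uses C11 atomics, the types and function names are hypothetical, and it deliberately ignores object lifetime, which in the kernel is what the read-side critical section and grace period provide. It only shows the "load the pointer once, then use the local copy" half of the pattern that rcu_dereference() expresses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the slave and bond structures. */
struct demo_slave {
    const char *name;
};

struct demo_bond {
    _Atomic(struct demo_slave *) current_arp_slave;
};

void demo_bond_init(struct demo_bond *bond)
{
    assert(bond != NULL);
    atomic_init(&bond->current_arp_slave, NULL);
}

/* Writer side: publish a new slave, or NULL when the slave is released. */
void demo_bond_set_arp_slave(struct demo_bond *bond, struct demo_slave *slave)
{
    assert(bond != NULL);
    atomic_store_explicit(&bond->current_arp_slave, slave,
                          memory_order_release);
}

/*
 * Reader side: take one snapshot of the pointer and use only that copy.
 * Testing bond->current_arp_slave and then reading it again would let a
 * concurrent writer set it to NULL between the test and the use.
 */
const char *demo_probe_slave_name(struct demo_bond *bond)
{
    struct demo_slave *slave;

    assert(bond != NULL);
    slave = atomic_load_explicit(&bond->current_arp_slave,
                                 memory_order_acquire);
    return slave ? slave->name : NULL;
}
```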
    • bonding: create bond_first_slave_rcu() · e001bfad
      dingtianhong committed
      bond_first_slave_rcu() will be used instead of bond_first_slave()
      inside rcu_read_lock().
      
      Following Jay Vosburgh's suggestion, struct netdev_adjacent
      should be hidden from users who would otherwise use it directly, so I
      wrap it in a new function that gets the first slave of the bond.
      Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: rebuild the lock use for bond_loadbalance_arp_mon() · 2e52f4fe
      dingtianhong committed
      bond_loadbalance_arp_mon() uses the bond lock to protect the
      bond slave list, which has no effect, so either RTNL or RCU could
      replace it. Considering the performance impact, RCU is the
      better choice here, so replace the bond lock with RCU.
      
      bond_select_active_slave() needs RTNL and curr_slave_lock
      together, but RTNL is not held here, so add an rtnl_trylock().
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: rebuild the lock use for bond_alb_monitor() · 733ab639
      dingtianhong committed
      bond_alb_monitor() uses the bond lock to protect the bond slave list,
      which has no effect here; RTNL or RCU must replace the bond lock.
      bond_alb_monitor() is called 10 times per second, so RTNL may cost
      performance here. I therefore replace the bond lock with RCU to protect
      the bond slave list. The existing RTNL use is preserved and the logic
      of the monitor is unchanged.
      Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: rebuild the lock use for bond_mii_monitor() · 4cb4f97b
      dingtianhong committed
      bond_mii_monitor() still uses the bond lock to protect the bond slave list,
      which has no effect. There are 2 ways to fix the problem: move RTNL to the
      top of the function, or add RCU to protect the bond slave list.
      According to Jay Vosburgh, taking RTNL 10 times per second to protect
      the whole monitor is a truly big performance loss,
      so I take the advice and use RCU to protect the bond slave list.
      
      bond_has_slave() is not protected by anything. Nothing bad happens
      if the slave list changes, unless the bond is freed, but that
      cannot happen before the monitor stops, because the bond is closed
      before it is freed.
      
      The peer notification for the bond uses curr_active_slave, so
      dereference the slave once to make sure we keep accessing the same slave
      if curr_active_slave changes. Because rcu_dereference() must be used in a
      read-side critical section and bond_change_active_slave() calls
      it without holding RCU, do the peer notification under rcu_read_lock(),
      which will be nested in the monitor.
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: remove the no effect lock for bond_select_active_slave() · b2e7aceb
      dingtianhong committed
      The bond slave list is no longer protected by the bond lock, only
      by RTNL or RCU, so any place that uses the bond lock to protect
      the slave list is meaningless.
      
      Remove the release and reacquire of the bond lock around bond_select_active_slave().
      
      curr_active_slave can only be changed in 3 places:
      
      1. enslave slave.
      2. release slave.
      3. change_active_slave.
      
      All of these places held the bond lock, RTNL and curr_slave_lock
      together, which is tedious and meaningless. The bond lock is clearly
      not needed here, but RTNL or curr_slave_lock is. To access the
      active slave you have to hold one lock, RTNL or curr_slave_lock:
      if RTNL is already held there is no need to add curr_slave_lock;
      otherwise curr_slave_lock is better for performance.
      
      Several places call bond_select_active_slave() and
      bond_change_active_slave(); as the next step I will clean up these
      places and remove the no-effect locks.
      
      Some documentation is updated together with the functions.
      Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: set root qdisc before change() in attach_default_qdiscs() · e57a784d
      Eric Dumazet committed
      After commit 95dc1929 ("pkt_sched: give visibility to mq slave
      qdiscs") we call qdisc_list_add() while the device qdisc might be
      the noop_qdisc one.
      
      This shows up as duplicates in "tc qdisc show", as all inactive devices
      point to noop_qdisc.
      
      Fix this by setting dev->qdisc to the new qdisc before calling
      ops->change() in attach_default_qdiscs().
      
      Add a WARN_ON_ONCE() to catch any future similar problem.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next · 59bcaed5
      David S. Miller committed
      Ben Hutchings says:
      
      ====================
      An assortment of changes for Linux 3.14:
      
      1. Merge the sfc fixes that you have already merged into net.git.
         (The branch point for those was such that this does not bring in any
         other changes.)
      2. Reduce log level for a generally useless warning message, from
         Robert Stonehouse.
      3. Include BISTs in ethtool offline self-test for EF10 and recover from
         BISTs initiated through other functions, from Jon Cooper.
      4. Improve a sanity check on RX completions.
      5. Avoid incrementing RX dropped count while the interface is down, from
         Jon Cooper.
      6. Improve hardware sensor naming and log messages, from Edward Cree.
      7. Log all unexpected errors returned by firmware, from Edward Cree.
      8. Expose another NVRAM partition to userland.
      9. Some refactoring of the PTP code in preparation for EF10 support.
      10. Various minor cleanups.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bonding_netlink' · 0aac68f7
      David S. Miller committed
      Scott Feldman says:
      
      ====================
      bonding: add more netlink attributes
      
      v2:
      
      Addressed v1 review comments.  In particular, Jay's concern about
      current sysfs ordering limitations carrying over to iproute.  Netlink
      attributes are processed in a priority order in
      bond_netlink.c:bond_changelink().  Lower priority attributes can't undo
      higher priority attributes when attempting to set both with iproute
      command.  For example, this command will fail:
      
        ip link add bond1 type bond mode active-backup miimon 10 arp_interval 10
      
      Because we're trying to create a new bond to use incompatible miimon
      and ARP interval attributes.  However, if attributes are applied
      one-at-a-time, previously applied attributes can be overridden:
      
        ip link add bond1 type bond mode active-backup miimon 10
        ip link set dev bond1 type bond arp_interval 10
      
      These two commands succeed.  The bond is first created to use miimon.
      Next, the bond is converted to use ARP interval, which undoes miimon.
      
      v1:
      
      Following Jiri Pirko's lead, add more bonding netlink attributes.  Sending
      matching iproute2 patch separately.  sysfs access to attributes is
      retained.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
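The priority-ordered attribute handling described in the cover letter above can be modelled with a small sketch. This is not bond_netlink.c: the structures and function are hypothetical, only the two attributes from the example are modelled, and applying the request to a copy so that a rejected request changes nothing is a simplification chosen for the sketch. It assumes miimon is processed before arp_interval and that a non-zero value of one monitor disables the other.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical persistent bond settings; 0 means "monitor disabled". */
struct demo_bond_cfg {
    int miimon;
    int arp_interval;
};

/* Hypothetical request: which attributes are present in one message. */
struct demo_change_req {
    bool has_miimon;
    int  miimon;
    bool has_arp_interval;
    int  arp_interval;
};

/* Apply one request in priority order; returns 0 or a negative errno. */
int demo_apply_change(struct demo_bond_cfg *cfg,
                      const struct demo_change_req *req)
{
    struct demo_bond_cfg next;
    int req_miimon = 0;  /* miimon value set by this same request, if any */

    assert(cfg != NULL && req != NULL);
    next = *cfg;         /* work on a copy; commit only on success */

    /* Higher priority: miimon is applied first and turns ARP monitoring off. */
    if (req->has_miimon) {
        req_miimon = req->miimon;
        next.miimon = req->miimon;
        if (req->miimon)
            next.arp_interval = 0;
    }

    /* Lower priority: may override older state, but not this request. */
    if (req->has_arp_interval) {
        if (req->arp_interval && req_miimon)
            return -EINVAL;
        next.arp_interval = req->arp_interval;
        if (req->arp_interval)
            next.miimon = 0;
    }

    *cfg = next;
    return 0;
}
```

Under these assumptions, a single request carrying both miimon and arp_interval is rejected, while sending miimon first and arp_interval in a second request succeeds and leaves only ARP monitoring enabled, matching the two ip link examples.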