1. 21 Oct, 2015 5 commits
    • tcp: add tcp_tsopt_ecr_before helper · 77c63127
      Yuchung Cheng committed
      Add a helper to prepare for the main RACK patch.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      77c63127
    • tcp: remove tcp_mark_lost_retrans() · af82f4e8
      Yuchung Cheng committed
      Remove the existing lost retransmit detection because RACK subsumes
      it completely. This also stops overloading the ack_seq field of
      the skb control block.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      af82f4e8
    • tcp: track min RTT using windowed min-filter · f6722583
      Yuchung Cheng committed
      Add Kathleen Nichols' algorithm for tracking the minimum RTT of a
      data stream over some measurement window. It uses constant space
      and constant time per update, yet it almost always delivers
      the same minimum as an implementation that has to keep all
      the data in the window. The measurement window is tunable via
      the sysctl net.ipv4.tcp_min_rtt_wlen, with a default value of 5 minutes.
      
      The algorithm keeps track of the best, 2nd best & 3rd best min
      values, maintaining an invariant that the measurement time of
      the n'th best >= n-1'th best. It also makes sure that the three
      values are widely separated in the time window since that bounds
      the worst-case error when the data is monotonically increasing
      over the window.
      
      Upon getting a new min, we can forget everything earlier because
      it has no value - the new min is less than everything else in the
      window by definition and it's the most recent. So we restart fresh
      on every new min and overwrite the 2nd & 3rd choices. The same
      property holds for the 2nd & 3rd best.
      
      Therefore we have to maintain two invariants to maximize the
      information in the samples, one on values (1st.v <= 2nd.v <=
      3rd.v) and the other on times (now-win <= 1st.t <= 2nd.t <= 3rd.t <=
      now). These invariants determine the structure of the code.
      
      The RTT input to the windowed filter is the minimum RTT measured
      from ACK or SACK, or as the last resort from TCP timestamps.
      
      The accessor tcp_min_rtt() returns the minimum RTT seen in the
      window. ~0U indicates it is not available. The minimum is 1usec
      even if the true RTT is below that.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6722583
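The three-slot filter described in this commit message can be sketched in user-space C roughly as follows. This is a minimal illustration under stated assumptions, not the kernel code: the names `rtt_sample`, `min_filter`, `min_filter_init`, `min_filter_update` and `min_filter_get` are hypothetical, the window length is passed in by the caller instead of being read from the sysctl, and the quarter-window and half-window refresh rules are one assumed way of keeping the three samples widely separated in time.

```c
#include <stdint.h>

/* One RTT sample: the value in microseconds and the time it was taken. */
struct rtt_sample {
	uint32_t rtt_us;
	uint32_t ts;
};

/* Hypothetical filter state: m[0] is the best (smallest) sample,
 * m[1] the 2nd best, m[2] the 3rd best.
 */
struct min_filter {
	struct rtt_sample m[3];
	uint32_t wlen;	/* window length, in the same unit as ts */
};

static void min_filter_init(struct min_filter *f, uint32_t wlen)
{
	for (int i = 0; i < 3; i++) {
		f->m[i].rtt_us = UINT32_MAX;	/* ~0U: no sample available */
		f->m[i].ts = 0;
	}
	f->wlen = wlen;
}

/* Current windowed minimum, or UINT32_MAX if nothing was recorded. */
static uint32_t min_filter_get(const struct min_filter *f)
{
	return f->m[0].rtt_us;
}

static void min_filter_update(struct min_filter *f, uint32_t rtt_us,
			      uint32_t now)
{
	/* Clamp to 1 usec so that 0 is never stored as a minimum. */
	struct rtt_sample s = { rtt_us ? rtt_us : 1, now };
	struct rtt_sample *m = f->m;
	uint32_t elapsed;

	/* A new best (or 2nd/3rd best) overwrites all later choices. */
	if (s.rtt_us <= m[0].rtt_us)
		m[0] = m[1] = m[2] = s;
	else if (s.rtt_us <= m[1].rtt_us)
		m[1] = m[2] = s;
	else if (s.rtt_us <= m[2].rtt_us)
		m[2] = s;

	elapsed = now - m[0].ts;
	if (elapsed > f->wlen) {
		/* The best sample aged out: promote the 2nd and 3rd choices. */
		m[0] = m[1];
		m[1] = m[2];
		m[2] = s;
		if (now - m[0].ts > f->wlen) {
			m[0] = m[1];
			m[1] = s;
			if (now - m[0].ts > f->wlen)
				m[0] = s;
		}
	} else if (m[1].ts == m[0].ts && elapsed > f->wlen / 4) {
		/* A quarter window passed without a distinct 2nd choice. */
		m[2] = m[1] = s;
	} else if (m[2].ts == m[1].ts && elapsed > f->wlen / 2) {
		/* Half a window passed without a distinct 3rd choice. */
		m[2] = s;
	}
}
```

With a window of 100 time units, feeding (50, t=0), (80, t=30), (70, t=60) keeps the minimum at 50; a further sample (90, t=120) expires the first sample and promotes 70 to the minimum.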
    • tcp: apply Kern's check on RTTs used for congestion control · 9e45a3e3
      Yuchung Cheng committed
      Currently ca_seq_rtt_us does not use Karn's check. Fix that by
      checking if any packet acked is a retransmit, for both the RTT used
      for RTT estimation and the RTT used for congestion control.
      
      Fixes: 5b08e47c ("tcp: prefer packet timing to TS-ECR for RTT")
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9e45a3e3
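The check matters because an ACK that covers a retransmitted packet is ambiguous: it cannot be known which transmission it acknowledges, so a timing sample taken from it may be wrong. A minimal user-space sketch of the idea, assuming hypothetical names (`acked_pkt`, `rtt_samples`, `rtt_from_ack`) and assuming that the estimation sample is taken from the oldest acked packet and the congestion-control sample from the newest one:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one packet covered by an incoming ACK. */
struct acked_pkt {
	uint64_t sent_us;	/* time of the (last) transmission */
	bool retransmitted;	/* was this packet ever retransmitted? */
};

/* Both samples are -1 when no valid sample can be taken. */
struct rtt_samples {
	int64_t seq_rtt_us;	/* sample for RTT estimation */
	int64_t ca_rtt_us;	/* sample for congestion control */
};

static struct rtt_samples rtt_from_ack(const struct acked_pkt *pkts,
				       size_t n, uint64_t now_us)
{
	struct rtt_samples r = { -1, -1 };
	size_t i;

	if (n == 0)
		return r;

	/* Karn's check: if any acked packet was retransmitted, the ACK is
	 * ambiguous, so discard both samples.
	 */
	for (i = 0; i < n; i++)
		if (pkts[i].retransmitted)
			return r;

	r.seq_rtt_us = (int64_t)(now_us - pkts[0].sent_us);
	r.ca_rtt_us = (int64_t)(now_us - pkts[n - 1].sent_us);
	return r;
}
```

The point of the sketch is that one check gates both samples, which is the behavior the commit message asks for.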
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · c8fdc324
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2015-10-19
      
      This series contains updates to i40e and i40evf only.
      
      Kiran adds a spinlock around code accessing VSI MAC filter list to
      ensure that we are synchronizing access to the filter list, otherwise
      we can end up with multiple accesses at the same time which can cause
      the VSI MAC filter list to get in an unstable or corrupted state.
      
      Jesse fixes overlong BIT defines, where the RSS enabling call was
      mistakenly missed.  Also fixes a bug where the enable function was
      enabling the interrupt twice while trying to update the two interrupt
      throttle rate thresholds for Rx and Tx, while refactoring the IRQ
      enable function to simplify reading the flow.  Also addresses the high
      CPU utilization of some small streaming workloads, for which the driver
      should reduce CPU usage.

      Anjali fixes two X722 issues with respect to EEPROM checksum verify and
      reading NVM version info.  Also fixes a case where a mask value was
      accidentally replaced with a bit mask, causing Flow Director sideband to
      be broken.
      
      Alex Duyck fixes areas of the drivers which run from hard interrupt
      context or with interrupts already disabled in netpoll, so use
      napi_schedule_irqoff() instead of napi_schedule().
      
      Mitch fixes the VF drivers to not easily give up when they are not able
      to communicate with the PF driver.

      Carolyn fixes a problem where our tools' MAC loopback test would fail
      after driver unbind because the hardware was configured for multiqueue and
      the unbind operation did not clear this configuration.  Also fixes an issue
      where the NVMUpdate tool gets bad data from the PHY when using the PHY
      NVM feature because of contention on the MDIO interface from getting
      PHY capability calls from the driver during regular operations.

      Catherine fixes an issue where we were checking if autoneg was allowed
      to change before checking if autoneg was changing; these checks need to
      be in the reverse order.

      Jean Sacren fixes up a function header comment to align the kernel-docs
      with the actual code.
      
      v2: Cleaned up the use of spin_is_locked() in patch 1 based on feedback
          from David Miller, since it always evaluates to zero on uni-processor
          builds
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c8fdc324
  2. 20 Oct, 2015 19 commits
  3. 19 Oct, 2015 16 commits
    • net: bcmgenet: Fix early link interrupt enabling · 37850e37
      Florian Fainelli committed
      Link interrupts are enabled in init_umac(), which is too early for us to
      process them since we do not yet have a valid PHY device pointer. On
      BCM7425 chips for instance, we will crash calling phy_mac_interrupt()
      because phydev is NULL.
      
      Fix this by moving the enabling of link interrupts into
      bcmgenet_netif_start(), under a dedicated function,
      bcmgenet_link_intr_enable(), and while at it, update the comments
      surrounding the code.
      
      Fixes: 6cc8e6d4 ("net: bcmgenet: Delay PHY initialization to bcmgenet_open()")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      37850e37
    • Merge tag 'wireless-drivers-for-davem-2015-10-17' of... · afc050dd
      David S. Miller committed
      Merge tag 'wireless-drivers-for-davem-2015-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      iwlwifi:
      
      * mvm: flush fw_dump_wk when mvm fails to start
      * mvm: init card correctly on ctkill exit check
      * pci: add a few more PCI subvendor IDs for the 7265 series
      * fix firmware filename for 3160
      * mvm: clear csa countdown when AP is stopped
      * mvm: fix D3 firmware PN programming
      * dvm: fix D3 firmware PN programming
      * mvm: fix D3 CCMP TX PN assignment
      
      rtlwifi:
      
      * rtl8821ae: Fix system lockups on boot
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      afc050dd
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 371f1c7e
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for your net-next
      tree. Most relevantly, updates for the nfnetlink_log to integrate with
      conntrack, fixes for cttimeout and improvements for nf_queue core, they are:
      
      1) Remove useless ifdef around static inline function in IPVS, from
         Eric W. Biederman.
      
      2) Simplify the conntrack support for nfnetlink_queue: Merge
         nfnetlink_queue_ct.c file into nfnetlink_queue_core.c, then rename it back
         to nfnetlink_queue.c
      
      3) Use y2038 safe timestamp from nfnetlink_queue.
      
      4) Get rid of dead function definition in nf_conntrack, from Flavio
         Leitner.
      
      5) Attach conntrack support for nfnetlink_log.c, from Ken-ichirou MATSUZAWA.
         This adds a new NETFILTER_NETLINK_GLUE_CT Kconfig switch that
         controls enabling both nfqueue and nflog integration with conntrack.
         The userspace application can request this via NFULNL_CFG_F_CONNTRACK
         configuration flag.
      
      6) Remove unused netns variables in IPVS, from Eric W. Biederman and
         Simon Horman.
      
      7) Don't put back the refcount on the cttimeout object from xt_CT on success.
      
      8) Fix crash on cttimeout policy object removal. We have to flush out
         the cttimeout extension area of the conntrack so that it does not refer
         to a nonexistent object that was just removed.

      9) Make sure rcu callbacks complete before nfnetlink_cttimeout
         module removal.
      
      10) Fix compilation warning in br_netfilter when no nf_defrag_ipv4 and
          nf_defrag_ipv6 are enabled. Patch from Arnd Bergmann.
      
      11) Autoload ctnetlink dependencies when NFULNL_CFG_F_CONNTRACK is
          requested. Again from Ken-ichirou MATSUZAWA.
      
      12) Don't use pointer to previous hook when reinjecting traffic via
          nf_queue with NF_REPEAT verdict since it may already be gone. This
          also avoids a dead loop if the userspace application keeps returning
          NF_REPEAT.
      
      13) A bunch of cleanups for netfilter IPv4 and IPv6 code from Ian Morris.
      
      14) Consolidate logger instance existence check in nfulnl_recv_config().
      
      15) Fix broken atomicity when applying configuration updates to logger
          instances in nfnetlink_log.
      
      16) Get rid of the .owner attribute in our hook object. We don't need
          this anymore since we're dropping pending packets that have escaped
          from the kernel when unregistering the hook. Patch from Florian Westphal.
      
      17) Remove unnecessary rcu_read_lock() from nf_reinject code, we always
          assume RCU read side lock from .call_rcu in nfnetlink. Also from Florian.
      
      18) Use static inline function instead of macros to define NF_HOOK() and
          NF_HOOK_COND() when no netfilter support is on, from Arnd Bergmann.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      371f1c7e
    • RDS: fix rds-ping deadlock over TCP transport · 7b4b0009
      santosh.shilimkar@oracle.com committed
      Sowmini found a hang with rds-ping while testing RDS over TCP. It's
      a corner case and doesn't always happen. The issue is not reproducible
      with IB transport. It's clear from the dump below why we see it with RDS TCP.
      
       [<ffffffff8153b7e5>] do_tcp_setsockopt+0xb5/0x740
       [<ffffffff8153bec4>] tcp_setsockopt+0x24/0x30
       [<ffffffff814d57d4>] sock_common_setsockopt+0x14/0x20
       [<ffffffffa096071d>] rds_tcp_xmit_prepare+0x5d/0x70 [rds_tcp]
       [<ffffffffa093b5f7>] rds_send_xmit+0xd7/0x740 [rds]
       [<ffffffffa093bda2>] rds_send_pong+0x142/0x180 [rds]
       [<ffffffffa0939d34>] rds_recv_incoming+0x274/0x330 [rds]
       [<ffffffff810815ae>] ? ttwu_queue+0x11e/0x130
       [<ffffffff814dcacd>] ? skb_copy_bits+0x6d/0x2c0
       [<ffffffffa0960350>] rds_tcp_data_recv+0x2f0/0x3d0 [rds_tcp]
       [<ffffffff8153d836>] tcp_read_sock+0x96/0x1c0
       [<ffffffffa0960060>] ? rds_tcp_recv_init+0x40/0x40 [rds_tcp]
       [<ffffffff814d6a90>] ? sock_def_write_space+0xa0/0xa0
       [<ffffffffa09604d1>] rds_tcp_data_ready+0xa1/0xf0 [rds_tcp]
       [<ffffffff81545249>] tcp_data_queue+0x379/0x5b0
       [<ffffffffa0960cdb>] ? rds_tcp_write_space+0xbb/0x110 [rds_tcp]
       [<ffffffff81547fd2>] tcp_rcv_established+0x2e2/0x6e0
       [<ffffffff81552602>] tcp_v4_do_rcv+0x122/0x220
       [<ffffffff81553627>] tcp_v4_rcv+0x867/0x880
       [<ffffffff8152e0b3>] ip_local_deliver_finish+0xa3/0x220
      
      This happens because the rds_send_xmit() chain wants to take
      sock_lock, which is already taken by tcp_v4_rcv() on its
      way to rds_tcp_data_ready(). The inline send comes from commit
      db6526dc ("RDS: use rds_send_xmit() state instead of
      RDS_LL_SEND_FULL"), which was trying to opportunistically finish
      the send request in the same thread context.

      But because of the above recursive lock hang with RDS TCP,
      the send work from rds_send_pong() needs to be deferred to a
      worker to avoid the lockup. Given that RDS ping is more of a
      connectivity test than a performance-critical path, it should be OK
      even for transports like IB.
      Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7b4b0009
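The recursive locking problem and the deferral fix can be modeled with a toy user-space sketch. Everything here is hypothetical and heavily simplified: a boolean stands in for the non-recursive socket lock, a flag stands in for a queued work item, and the function names (`send_xmit`, `send_pong_deferred`, `run_worker`, `receive_ping`) are illustrative, not the RDS API.

```c
#include <stdbool.h>

static bool sock_locked;	/* models the non-recursive socket lock */
static bool work_pending;	/* models a work item queued for a worker */
static int pongs_sent;

/* Send path: needs the socket lock. A real lock would block forever if
 * the caller already holds it; the model reports that case as -1.
 */
static int send_xmit(void)
{
	if (sock_locked)
		return -1;
	sock_locked = true;
	pongs_sent++;
	sock_locked = false;
	return 0;
}

/* Deferred variant: only queue the work, never touch the lock here. */
static void send_pong_deferred(void)
{
	work_pending = true;
}

/* Worker context: runs later, without the receive path's lock held. */
static void run_worker(void)
{
	if (work_pending) {
		work_pending = false;
		send_xmit();
	}
}

/* Receive path: holds the socket lock while handling the ping. */
static int receive_ping(bool defer)
{
	int ret = 0;

	sock_locked = true;
	if (defer)
		send_pong_deferred();
	else
		ret = send_xmit();	/* inline send: would deadlock */
	sock_locked = false;
	return ret;
}
```

In this model, `receive_ping(false)` hits the would-deadlock case, while `receive_ping(true)` followed by `run_worker()` sends the pong once the lock has been released.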
    • tunnels: Don't require remote endpoint or ID during creation. · e277de5f
      Jesse Gross committed
      Before lightweight tunnels existed, it really didn't make sense to
      create a tunnel that was not fully specified, such as without a
      destination IP address - the resulting packets would go nowhere.
      However, with lightweight tunnels, the opposite is true - it doesn't
      make sense to require this information when it will be provided later
      on by the route. This loosens the requirements for this information.
      
      An alternative would be to allow the relaxed version only when
      COLLECT_METADATA is enabled. However, since there are several
      variations on this theme (such as NBMA tunnels in GRE), just dropping
      the restrictions seems the most consistent across tunnels and with
      the existing configuration.
      
      CC: John Linville <linville@tuxdriver.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e277de5f
    • uapi: add mpls_iptunnel.h · b3958b9e
      stephen hemminger committed
      Add missing rule to export the mpls iptunnel header needed by iproute2.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b3958b9e
    • tcp: do not set queue_mapping on SYNACK · dc6ef6be
      Eric Dumazet committed
      At the time of commit fff32699 ("tcp: reflect SYN queue_mapping into
      SYNACK packets") we had few ways to cope with SYN floods.

      We no longer need to reflect incoming skb queue mappings, and instead
      can pick a TX queue based on the CPU cooking the SYNACK, with normal XPS
      affinities.

      Note that all SYNACK retransmits were picking TX queue 0; this is no
      longer a win given that SYNACK retransmits are now distributed across all CPUs.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc6ef6be
    • openvswitch: Scrub skb between namespaces · 740dbc28
      Joe Stringer committed
      If OVS receives a packet from another namespace, then the packet should
      be scrubbed. However, people have already begun to rely on the behaviour
      that skb->mark is preserved across namespaces, so retain this one field.
      
      This is mainly to address information leakage between namespaces when
      using OVS internal ports, but by placing it in ovs_vport_receive() it is
      more generally applicable, meaning it should not be overlooked if other
      port types are allowed to be moved into namespaces in future.
      Signed-off-by: Joe Stringer <joestringer@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      740dbc28
    • Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · a5d6f7dd
      David S. Miller committed
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth 2015-10-16
      
      First of all, sorry for the late set of patches for the 4.3 cycle. We
      just finished an intensive week of testing at the Bluetooth UnPlugFest
      and discovered (and fixed) issues there. Unfortunately a few issues
      affect 4.3-rc5 in a way that they break existing Bluetooth LE mouse and
      keyboard support.
      
      The regressions result from supporting LE privacy in conjunction with
      scanning for Resolvable Private Addresses before connecting. This feature
      has been tested heavily (including automated unit tests), but sadly
      some regressions slipped in. The UnPlugFest with its multitude of test
      platforms is a good battle testing ground for uncovering every corner
      case.
      
      The patches in this pull request focus only on fixing the regressions in
      4.3-rc5. The patches look a bit larger since we also added comments in
      the critical sections of the fixes to improve clarity.
      
      I would appreciate it if we can get these regression fixes to Linus
      quickly. Please let me know if there are any issues pulling. Thanks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a5d6f7dd
    • net: hix5hd2_gmac: avoid integer overload warning · 951b5d95
      Arnd Bergmann committed
      BITS_RX_EN is an 'unsigned long' constant, so the ones complement of that
      has bits set that do not fit into a 32-bit variable on 64-bit architectures,
      which causes a harmless gcc warning:
      
      drivers/net/ethernet/hisilicon/hix5hd2_gmac.c: In function 'hix5hd2_port_disable':
      drivers/net/ethernet/hisilicon/hix5hd2_gmac.c:374:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
        writel_relaxed(~(BITS_RX_EN | BITS_TX_EN), priv->base + PORT_EN);
      
      This adds a cast to (u32) to tell gcc that the code is indeed fine.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      951b5d95
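The truncation and the cast can be reproduced in a small user-space sketch. The bit positions chosen for `BITS_RX_EN` and `BITS_TX_EN`, the `port_en_reg` variable and the `write_reg32()` helper are assumptions made for illustration; they stand in for the driver's definitions and for `writel_relaxed()`.

```c
#include <stdint.h>

/* Like the kernel's BIT(), this yields an 'unsigned long' constant. */
#define BIT(n)		(1UL << (n))

/* Hypothetical bit positions, chosen only for the example. */
#define BITS_RX_EN	BIT(1)
#define BITS_TX_EN	BIT(2)

/* Stands in for the 32-bit PORT_EN register. */
static uint32_t port_en_reg;

/* Stands in for writel_relaxed(): takes a 32-bit value. */
static void write_reg32(uint32_t val)
{
	port_en_reg = val;
}

static void port_disable(void)
{
	/* Where 'unsigned long' is 64 bits wide, ~(BITS_RX_EN | BITS_TX_EN)
	 * also has bits 32..63 set, so passing it directly would make gcc
	 * warn about implicit truncation. The explicit cast states that
	 * keeping only the low 32 bits is intended.
	 */
	write_reg32((uint32_t)~(BITS_RX_EN | BITS_TX_EN));
}
```

The value written is the same with or without the cast; the cast only makes the narrowing explicit, which is why the warning is described as harmless.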
    • net: hisilicon: add OF dependency · 876133d3
      Arnd Bergmann committed
      The HNS MDIO driver fails to build on older ARM machines that are not
      yet converted to CONFIG_OF:
      
      drivers/net/ethernet/hisilicon/hns_mdio.c: In function 'hns_mdio_bus_name':
      drivers/net/ethernet/hisilicon/hns_mdio.c:405:14: error: 'OF_BAD_ADDR' undeclared (first use in this function)
        u64 taddr = OF_BAD_ADDR;
                    ^
      drivers/net/ethernet/hisilicon/hns_mdio.c:405:14: note: each undeclared identifier is reported only once for each function it appears in
      drivers/net/ethernet/hisilicon/hns_mdio.c:409:11: error: implicit declaration of function 'of_translate_address' [-Werror=implicit-function-declaration]
         taddr = of_translate_address(np, addr);
                 ^
      
      This clarifies the dependency to ensure we don't attempt to build these
      drivers without CONFIG_OF, but also adds a COMPILE_TEST alternative to
      give us better build coverage testing.
      
      Build-tested on x86 as well to ensure this actually works.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      876133d3
    • net: hisilicon: include linux/vmalloc.h in dsaf · 119c7ad8
      Arnd Bergmann committed
      Some configurations fail to build the hns dsaf code because of
      a missing header file:
      
      ethernet/hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_init':
      ethernet/hisilicon/hns/hns_dsaf_main.c:1096:2: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration]
        priv->soft_mac_tbl = vzalloc(sizeof(*priv->soft_mac_tbl)
      
      This adds the correct #include.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      119c7ad8
    • Merge branch 'hns-fixes' · a679dbbb
      David S. Miller committed
      yankejian says:
      
      ====================
      net: hns: fixes two bugs in hns driver
      
        This patchset fixes two bugs in the hns driver:
        - fixes a timeout when a pause frame is received from the connected ports
        - allows settings to be applied with ethtool -s when the device link is down
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a679dbbb
    • net: hns: fixes a bug about timeout by pause frame · 90a505b9
      lisheng committed
      This patch fixes a bug that triggered a timeout sequence. When the
      connected ports cannot accept packets at a higher speed, they send
      pause frames to the SoC's MAC. At that time, the driver resets the relevant
      part of the SoC, so packets cannot be sent out immediately.
      This patch fixes the issue.
      Signed-off-by: yankejian <yankejian@huawei.com>
      Signed-off-by: Yisen Zhuang <yisen.zhuang@huawei.com>
      Signed-off-by: lisheng <lisheng011@huawei.com>
      Signed-off-by: lipeng <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      90a505b9
    • net: hns: fixes the issue by using ethtool -s · 20ddb1d3
      Chenny Xu committed
      Before this patch, the hns driver only permits the user to configure the
      net device with ethtool -s when the link is up. This is an unnecessary
      restriction: the settings need to be applicable whether the link is up or
      down. This patch fixes the issue.
      Signed-off-by: yankejian <yankejian@huawei.com>
      Signed-off-by: Yisen Zhuang <yisen.zhuang@huawei.com>
      Signed-off-by: lisheng <lisheng011@huawei.com>
      Signed-off-by: lipeng <lipeng321@huawei.com>
      Signed-off-by: Chenny Xu <chenny.xu@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      20ddb1d3
    • Merge branch 'hsi-fixes' · 4639a3b5
      David S. Miller committed
      huangdaode says:
      
      ====================
      net: hisilicon fix some bugs in HNS drivers
      
      This patchset fixes two bugs in the HNS driver: one removes the hnae sysfs interface
      according to the review comments from Arnd Bergmann <arnd@arndb.de>, the other
      fixes the wrong mac_id judgement bug, which was found during internal tests.
      
      change log:
      v3:
       remove the hnae sysfs interface.
      
      v2:
        1) remove first bug fix, which is fixed in another patch submitted by
           Arnd Bergmann <arnd@arndb.de>
        2) change the code style according to Joe.
      
      v1:
       initial version.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4639a3b5