1. 26 Jun, 2020 16 commits
    • sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket · 471e39df
      Marcelo Ricardo Leitner committed
      If a socket is set ipv6only, it will still send IPv4 addresses in the
      INIT and INIT_ACK packets. This potentially misleads the peer into using
      them, which then would cause association termination.
      
      The fix is to not add IPv4 addresses to ipv6only sockets.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Tested-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
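The rule described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: `struct local_addr`, `may_advertise()` and `count_advertised()` are hypothetical names, not the kernel's SCTP data structures or functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/* Hypothetical, simplified stand-in for one bound local address. */
struct local_addr {
	int family;   /* AF_INET or AF_INET6 */
};

/* Return true if the address may be listed in INIT / INIT_ACK. */
static bool may_advertise(const struct local_addr *a, bool ipv6only)
{
	/* An ipv6only socket must never offer IPv4 addresses to the peer. */
	if (ipv6only && a->family == AF_INET)
		return false;
	return true;
}

/* Count the addresses that would be advertised for this socket. */
static size_t count_advertised(const struct local_addr *addrs, size_t n,
			       bool ipv6only)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++)
		if (may_advertise(&addrs[i], ipv6only))
			kept++;
	return kept;
}
```

With two IPv4 addresses and one IPv6 address bound, an ipv6only socket would advertise only the IPv6 one in this model.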
    • tc-testing: avoid action cookies with odd length. · b6186d41
      Briana Oursler committed
      Update odd length cookie hexstrings in csum.json, tunnel_key.json and
      bpf.json to be even length to comply with check enforced in commit
      0149dabf2a1b ("tc: m_actions: check cookie hexstring len") in iproute2.
      Signed-off-by: Briana Oursler <briana.oursler@gmail.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
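A minimal sketch of the kind of check the test data now has to satisfy. `cookie_hex_ok()` is a hypothetical helper and not the iproute2 code; the hex-digit validation is an added assumption, since the commit only mentions the even-length requirement.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if s is a non-empty hex string of even length,
 * i.e. it describes a whole number of bytes. */
static bool cookie_hex_ok(const char *s)
{
	size_t len = strlen(s), i;

	if (len == 0 || len % 2 != 0)
		return false;   /* odd length: half a byte is left over */

	for (i = 0; i < len; i++)
		if (!isxdigit((unsigned char)s[i]))
			return false;   /* assumed extra check: hex digits only */
	return true;
}
```

In this model, padding an odd-length cookie such as `a1b` to `a1b2` is what makes it acceptable.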
    • Merge branch 'tcp_cubic-fix-spurious-HYSTART_DELAY-on-RTT-decrease' · 3b0e7dc0
      David S. Miller committed
      Neal Cardwell says:
      
      ====================
      tcp_cubic: fix spurious HYSTART_DELAY on RTT decrease
      
      This series fixes a long-standing bug in the TCP CUBIC
      HYSTART_DELAY mechanism recently reported by Mirja Kuehlewind. The
      code can cause a spurious exit of slow start in some particular
      cases: upon an RTT decrease that happens on the 9th or later ACK
      in a round trip. This series fixes the original Hystart code and
      also the recent BPF implementation.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT · 7d21d54d
      Neal Cardwell committed
      Apply the fix from:
       "tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT"
      to the BPF implementation of TCP CUBIC congestion control.
      
      Repeating the commit description here for completeness:
      
      Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where
      Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
      ACK when the minimum rtt of a connection goes down. From inspection it
      is clear from the existing code that this could happen in an example
      like the following:
      
      o The first 8 RTT samples in a round trip are 150ms, resulting in a
        curr_rtt of 150ms and a delay_min of 150ms.
      
      o The 9th RTT sample is 100ms. The curr_rtt does not change after the
        first 8 samples, so curr_rtt remains 150ms. But delay_min can be
        lowered at any time, so delay_min falls to 100ms. The code executes
        the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
        of 100ms, and the curr_rtt is declared far enough above delay_min to
        force a (spurious) exit of Slow start.
      
      The fix here is simple: allow every RTT sample in a round trip to
      lower the curr_rtt.
      
      Fixes: 6de4a9c4 ("bpf: tcp: Add bpf_cubic example")
      Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT · b344579c
      Neal Cardwell committed
      Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where
      Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
      ACK when the minimum rtt of a connection goes down. From inspection it
      is clear from the existing code that this could happen in an example
      like the following:
      
      o The first 8 RTT samples in a round trip are 150ms, resulting in a
        curr_rtt of 150ms and a delay_min of 150ms.
      
      o The 9th RTT sample is 100ms. The curr_rtt does not change after the
        first 8 samples, so curr_rtt remains 150ms. But delay_min can be
        lowered at any time, so delay_min falls to 100ms. The code executes
        the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
        of 100ms, and the curr_rtt is declared far enough above delay_min to
        force a (spurious) exit of Slow start.
      
      The fix here is simple: allow every RTT sample in a round trip to
      lower the curr_rtt.
      
      Fixes: ae27e98a ("[TCP] CUBIC v2.3")
      Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
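The example from the commit message can be replayed with a small userspace model. This is a sketch, not the kernel code: `struct hystart`, `on_rtt_sample()` and `replay_example()` are hypothetical names, times are in milliseconds, and the threshold of `delay_min / 8` clamped to 4..16 ms is an assumption that is not stated in the commit message.

```c
#include <limits.h>
#include <stdbool.h>

#define MIN_SAMPLES 8   /* samples collected before the delay check runs */

struct hystart {
	unsigned int curr_rtt;    /* minimum RTT seen in this round (ms) */
	unsigned int delay_min;   /* minimum RTT seen on the connection (ms) */
	unsigned int sample_cnt;  /* samples taken in this round */
};

static void round_start(struct hystart *h)
{
	h->curr_rtt = UINT_MAX;
	h->sample_cnt = 0;
}

/* Assumed threshold: delay_min / 8, clamped to [4, 16] ms. */
static unsigned int delay_thresh(unsigned int delay_min)
{
	unsigned int t = delay_min / 8;

	return t < 4 ? 4 : (t > 16 ? 16 : t);
}

/* Feed one RTT sample; returns true if slow start would be exited. */
static bool on_rtt_sample(struct hystart *h, unsigned int rtt, bool fixed)
{
	if (rtt < h->delay_min)
		h->delay_min = rtt;   /* delay_min can be lowered at any time */

	/* Fixed behaviour: every sample in the round may lower curr_rtt. */
	if (fixed && rtt < h->curr_rtt)
		h->curr_rtt = rtt;

	if (h->sample_cnt < MIN_SAMPLES) {
		/* Old behaviour: only the first samples update curr_rtt. */
		if (!fixed && rtt < h->curr_rtt)
			h->curr_rtt = rtt;
		h->sample_cnt++;
		return false;
	}
	return h->curr_rtt > h->delay_min + delay_thresh(h->delay_min);
}

/* Replay the example: eight 150 ms samples, then one 100 ms sample. */
static bool replay_example(bool fixed)
{
	struct hystart h = { .delay_min = UINT_MAX };
	bool exited = false;
	int i;

	round_start(&h);
	for (i = 0; i < 8; i++)
		exited |= on_rtt_sample(&h, 150, fixed);
	exited |= on_rtt_sample(&h, 100, fixed);
	return exited;
}
```

In this model, `replay_example(false)` reports the spurious exit (a stale curr_rtt of 150 ms is compared against a delay_min of 100 ms), while `replay_example(true)` does not, because the 9th sample also lowers curr_rtt to 100 ms.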
    • Merge branch 'Fixes-for-SJA1105-DSA-tc-gate-action' · 29a30bac
      David S. Miller committed
      Vladimir Oltean says:
      
      ====================
      Fixes for SJA1105 DSA tc-gate action
      
      This small series fixes 2 bugs in the tc-gate implementation:
      1. The TAS state machine keeps getting rescheduled even after removing
         tc-gate actions on all ports.
      2. tc-gate actions with only one gate control list entry are installed
         to hardware with an incorrect interval of zero, which makes the
         switch erroneously drop those packets (since the configuration is
         invalid).
      
      To keep the code palatable, a forward-declaration was avoided by moving
      some code around in patch 1/4. I hope that isn't too much of an issue.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: fix tc-gate schedule with single element · 43ce887c
      Vladimir Oltean committed
      The sja1105_gating_cfg_time_to_interval function does this, as per the
      comments:
      
      /* The gate entries contain absolute times in their e->interval field. Convert
       * that to proper intervals (i.e. "0, 5, 10, 15" to "5, 5, 5, 5").
       */
      
      To perform that task, it iterates over gating_cfg->entries, at each step
      updating the interval of the _previous_ entry. So one interval remains
      to be updated at the end of the loop: the last one (since it isn't
      "prev" for anyone else).
      
      But there was an erroneous check, that the last element's interval
      should not be updated if it's also the only element. I'm not quite sure
      why that check was there, but it's clearly incorrect, as a tc-gate
      schedule with a single element would get an e->interval of zero,
      regardless of the duration requested by the user. The switch wouldn't
      even consider this configuration as valid: it will just drop all traffic
      that matches the rule.
      
      Fixes: 834f8933 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
      Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
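The conversion described above can be sketched on a plain array instead of the driver's linked list. `time_to_interval()` is a hypothetical name, and the `cycle_time` parameter used to close the last interval is an assumption; the commit message does not say how the last interval is computed.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert absolute gate times in t[] to intervals, in place.
 * cycle_time (an assumed parameter) is the end of the whole schedule. */
static void time_to_interval(uint64_t *t, size_t n, uint64_t cycle_time)
{
	size_t i;

	if (n == 0)
		return;

	/* Each step updates the interval of the previous entry. */
	for (i = 1; i < n; i++)
		t[i - 1] = t[i] - t[i - 1];

	/* The last entry is nobody's "prev", so it is updated here,
	 * including when it is also the only entry. */
	t[n - 1] = cycle_time - t[n - 1];
}
```

With absolute times 0, 5, 10, 15 and an assumed cycle time of 20, this yields 5, 5, 5, 5. A schedule with a single entry at time 0 gets the full cycle time as its interval rather than zero.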
    • net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules · 82f6896a
      Vladimir Oltean committed
      Currently, tas_data->enabled would remain true even after deleting all
      tc-gate rules from the switch ports, which would cause the
      sja1105_tas_state_machine to get unnecessarily scheduled.
      
      Also, if there were any errors which would prevent the hardware from
      enabling the gating schedule, the sja1105_tas_state_machine would
      continuously detect and print that, spamming the kernel log, even if the
      rules were subsequently deleted.
      
      The rules themselves are _not_ active, because sja1105_init_scheduling
      does enough of a job to not install the gating schedule in the static
      config. But the virtual link rules themselves are still present.
      
      So call the functions that remove the tc-gate configuration from
      priv->tas_data.gating_cfg, so that tas_data->enabled can be set to
      false, and sja1105_tas_state_machine will stop being scheduled.
      
      Fixes: 834f8933 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: unconditionally free old gating config · 026bdb2b
      Vladimir Oltean committed
      Currently sja1105_compose_gating_subschedule is not prepared to be
      called for the case where we want to recompute the global tc-gate
      configuration after we've deleted those actions on a port.
      
      After deleting the tc-gate actions on the last port, max_cycle_time
      would become zero, and that would incorrectly prevent
      sja1105_free_gating_config from getting called.
      
      So move the freeing function above the check for the need to apply a new
      configuration.
      
      Fixes: 834f8933 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
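The reordering can be illustrated with a simplified model. `struct gating_config`, `free_gating_config()` and `compose_gating()` are hypothetical stand-ins for the driver's types and functions; only the order of the free and the check follows the commit message.

```c
#include <stdlib.h>

/* Hypothetical, simplified gating configuration. */
struct gating_config {
	unsigned int *entries;
	size_t num_entries;
};

static void free_gating_config(struct gating_config *cfg)
{
	free(cfg->entries);
	cfg->entries = NULL;
	cfg->num_entries = 0;
}

/* Rebuild cfg for the given cycle time. A cycle time of zero models the
 * case where no tc-gate action is left on any port.
 * Returns 0 on success, -1 on allocation failure. */
static int compose_gating(struct gating_config *cfg,
			  unsigned int max_cycle_time)
{
	/* Drop the old configuration first, unconditionally. */
	free_gating_config(cfg);

	/* Only then check whether a new configuration is needed. */
	if (max_cycle_time == 0)
		return 0;

	cfg->entries = calloc(1, sizeof(*cfg->entries));
	if (!cfg->entries)
		return -1;
	cfg->entries[0] = max_cycle_time;
	cfg->num_entries = 1;
	return 0;
}
```

If the zero check came before the free, calling `compose_gating(&cfg, 0)` after the last action was deleted would return early and leave the stale entries in place.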
    • net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top · e39109f5
      Vladimir Oltean committed
      It turns out that sja1105_compose_gating_subschedule must also be called
      from sja1105_vl_delete, to recalculate the overall tc-gate
      configuration. Currently this is not possible without introducing a
      forward declaration. So move the function to the top of the file, along
      with its dependencies.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: free resources on failure path of at91ether_open() · 33fdef24
      Claudiu Beznea committed
      DMA buffers were not freed on the failure path of at91ether_open().
      Along with the changes for freeing the DMA buffers, the enable/disable
      interrupt instructions were moved to the at91ether_start()/at91ether_stop()
      functions, and the operations in at91ether_stop() are now done in the
      reverse order of those in at91ether_start(). Before this patch, the
      operation order on the interface open path was as follows:
      1/ alloc DMA buffers
      2/ enable tx, rx
      3/ enable interrupts
      and the order on interface close path was as follows:
      1/ disable tx, rx
      2/ disable interrupts
      3/ free dma buffers.
      
      Fixes: 7897b071 ("net: macb: convert to phylink")
      Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: call pm_runtime_put_sync on failure path · 0eaf228d
      Claudiu Beznea committed
      Call pm_runtime_put_sync() on failure path of at91ether_open.
      
      Fixes: e6a41c23 ("net: macb: ensure interface is not suspended on at91rm9200")
      Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
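The two macb fixes above follow the usual C pattern of unwinding in reverse order on a failure path. The sketch below is a model only: `struct dev`, `dev_start()`, `dev_stop()` and `dev_open()` are hypothetical, boolean flags stand in for real resources, an integer stands in for the runtime PM usage count, and the `phy_fails` knob is an invented way to force a failure after start.

```c
#include <stdbool.h>

/* Hypothetical device model; each flag stands in for a real resource. */
struct dev {
	bool dma_allocated;
	bool txrx_enabled;
	bool irq_enabled;
	int pm_usage;     /* stand-in for the runtime PM usage count */
	bool phy_fails;   /* knob: force a failure after start */
};

static int dev_start(struct dev *d)
{
	d->dma_allocated = true;   /* 1. allocate DMA buffers */
	d->txrx_enabled = true;    /* 2. enable tx, rx */
	d->irq_enabled = true;     /* 3. enable interrupts */
	return 0;
}

/* Undo dev_start() in reverse order. */
static void dev_stop(struct dev *d)
{
	d->irq_enabled = false;    /* 3. disable interrupts */
	d->txrx_enabled = false;   /* 2. disable tx, rx */
	d->dma_allocated = false;  /* 1. free DMA buffers */
}

static int dev_open(struct dev *d)
{
	int ret;

	d->pm_usage++;             /* models pm_runtime_get_sync() */

	ret = dev_start(d);
	if (ret)
		goto pm_exit;

	if (d->phy_fails) {        /* hypothetical later step that fails */
		ret = -1;
		goto stop;
	}
	return 0;

stop:
	dev_stop(d);               /* releases the DMA buffers too */
pm_exit:
	d->pm_usage--;             /* models pm_runtime_put_sync() */
	return ret;
}
```

After a failed `dev_open()`, no resource flag remains set and the usage count is back to zero, which is the property both commits aim for.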
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · f4926d51
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net, they are:
      
      1) Unaligned atomic access in ipset, from Russell King.
      
      2) Missing module description, from Rob Gill.
      
      3) Patches to fix a module unload causing NULL pointer dereference in
         xtables, from David Wilder. For the record, I am posting here his cover
         letter explaining the problem:
      
          A crash happened on ppc64le when running ltp network tests triggered by
          "rmmod iptable_mangle".
      
          See previous discussion in this thread:
          https://lists.openwall.net/netdev/2020/06/03/161 .
      
          In the crash I found in iptable_mangle_hook() that
          state->net->ipv4.iptable_mangle=NULL causing a NULL pointer dereference.
          net->ipv4.iptable_mangle is set to NULL in iptable_mangle_net_exit(), which
          is called when the ip_mangle module is unloaded. A rmmod task was found running
          in the crash dump.  A 2nd crash showed the same problem when running
          "rmmod iptable_filter" (net->ipv4.iptable_filter=NULL).
      
          To fix this I added a .pre_exit hook in all iptable_foo.c. The pre_exit will
          un-register the underlying hook and exit would do the table freeing. The
          netns core does an unconditional synchronize_rcu after the pre_exit hooks,
          ensuring no packets are in flight that have picked up the pointer before
          completing the un-register.
      
          These patches include changes for both iptables and ip6tables.
      
          We tested this fix with ltp running iptables01.sh and iptables01.sh -6 in a
          loop for 72 hours.
      
      4) Add a selftest for conntrack helper assignment, from Florian Westphal.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: enforce alignment for ethernet address · 206e7323
      Thomas Martitz committed
      The eth_addr member is passed to ether_addr functions that require
      2-byte alignment, therefore the member must be properly aligned
      to avoid unaligned accesses.
      
      The problem has been present since the initial merge of multicast to unicast:
      commit 6db6f0ea ("bridge: multicast to unicast")
      
      Fixes: 6db6f0ea ("bridge: multicast to unicast")
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Thomas Martitz <t.martitz@avm.de>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
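A userspace illustration of forcing 2-byte alignment on an address member. `struct port_entry` and its `flags` field are hypothetical, and C11 `_Alignas` is used here in place of the kernel's own alignment annotation.

```c
#include <stddef.h>

#define ETH_ALEN 6

/* Hypothetical struct: the one-byte field in front would leave eth_addr
 * at an odd offset if no alignment were requested for it. */
struct port_entry {
	unsigned char flags;
	_Alignas(2) unsigned char eth_addr[ETH_ALEN];
};
```

With the alignment request, the compiler inserts one byte of padding after `flags`, so `offsetof(struct port_entry, eth_addr)` is even and the struct itself is at least 2-byte aligned. Code that reads the address as 16-bit words can then do so without unaligned accesses.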
    • tcp: don't ignore ECN CWR on pure ACK · 25702840
      Denis Kirjanov committed
      There is a problem with the CWR flag set in an incoming ACK segment:
      it leads to a situation where the ECE flag is latched forever.

      The following packetdrill script shows what happens:
      
      // Stack receives incoming segments with CE set
      +0.1 <[ect0]  . 11001:12001(1000) ack 1001 win 65535
      +0.0 <[ce]    . 12001:13001(1000) ack 1001 win 65535
      +0.0 <[ect0] P. 13001:14001(1000) ack 1001 win 65535
      
      // Stack responds with ECN ECHO
      +0.0 >[noecn]  . 1001:1001(0) ack 12001
      +0.0 >[noecn] E. 1001:1001(0) ack 13001
      +0.0 >[noecn] E. 1001:1001(0) ack 14001
      
      // Write a packet
      +0.1 write(3, ..., 1000) = 1000
      +0.0 >[ect0] PE. 1001:2001(1000) ack 14001
      
      // Pure ACK received
      +0.01 <[noecn] W. 14001:14001(0) ack 2001 win 65535
      
      // Since CWR was sent, this packet should NOT have ECE set
      
      +0.1 write(3, ..., 1000) = 1000
      +0.0 >[ect0]  P. 2001:3001(1000) ack 14001
      // but Linux will still keep ECE latched here, with packetdrill
      // flagging a missing ECE flag, expecting
      // >[ect0] PE. 2001:3001(1000) ack 14001
      // in the script
      
      In the situation above we will continue to send ECN ECHO packets
      and trigger the peer to reduce the congestion window. To avoid that
      we can check CWR on pure ACKs received.
      
      v3:
      - Add a sequence check to avoid sending an ACK to an ACK
      
      v2:
      - Adjusted the comment
      - move CWR check before checking for unacknowledged packets
      Signed-off-by: Denis Kirjanov <denis.kirjanov@suse.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
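The latching behaviour can be modelled with a single flag. This is a sketch under assumptions: `struct conn`, `ECN_DEMAND_CWR` as defined here, and the `check_pure_acks` switch are hypothetical, and the pre-fix behaviour (CWR honoured only on segments that carry data) is inferred from the commit title rather than taken from the kernel source.

```c
#include <stdbool.h>

/* Hypothetical flag: keep echoing ECE until the peer sends CWR. */
#define ECN_DEMAND_CWR 0x1u

struct conn {
	unsigned int ecn_flags;
};

/* A CE-marked segment arrived: start echoing ECE. */
static void on_ce(struct conn *c)
{
	c->ecn_flags |= ECN_DEMAND_CWR;
}

/* Process the CWR bit of an incoming segment.
 * payload_len is 0 for a pure ACK. */
static void on_segment(struct conn *c, bool cwr, unsigned int payload_len,
		       bool check_pure_acks)
{
	if (!cwr)
		return;
	/* Assumed old behaviour: CWR only honoured on data segments. */
	if (payload_len > 0 || check_pure_acks)
		c->ecn_flags &= ~ECN_DEMAND_CWR;
}

/* Would the next outgoing segment carry ECE? */
static bool next_segment_has_ece(const struct conn *c)
{
	return (c->ecn_flags & ECN_DEMAND_CWR) != 0;
}
```

In this model, a pure ACK with CWR leaves ECE latched when `check_pure_acks` is false and clears it when the flag is true, matching the sequence in the packetdrill script.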
    • net: phy: mscc: avoid skcipher API for single block AES encryption · 5a3235e5
      Ard Biesheuvel committed
      The skcipher API dynamically instantiates the transformation object
      on request that implements the requested algorithm optimally on the
      given platform. This notion of optimality only matters for cases like
      bulk network or disk encryption, where performance can be a bottleneck,
      or in cases where the algorithm itself is not known at compile time.
      
      In the mscc case, we are dealing with AES encryption of a single
      block, and so neither concern applies, and we are better off using
      the AES library interface, which is lightweight and safe for this
      kind of use.
      
      Note that the scatterlist API does not permit references to buffers
      that are located on the stack, so the existing code is incorrect in
      any case, but avoiding the skcipher and scatterlist APIs entirely is
      the most straight-forward approach to fixing this.
      
      Cc: Antoine Tenart <antoine.tenart@bootlin.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Fixes: 28c5107a ("net: phy: mscc: macsec support")
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 25 Jun, 2020 19 commits
  3. 24 Jun, 2020 5 commits