1. 13 May, 2012 15 commits
    • J
      etherdevice: Remove now unused compare_ether_addr_64bits · e550ba1a
      Committed by Joe Perches
      Move and invert the logic from the otherwise unused
      compare_ether_addr_64bits to ether_addr_equal_64bits.
      
      Neaten the logic in is_etherdev_addr.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e550ba1a
    • E
      fq_codel: Fair Queue Codel AQM · 4b549a2e
      Committed by Eric Dumazet
      Fair Queue Codel packet scheduler
      
      Principles :
      
      - Packets are classified (internal classifier or external) on flows.
      - This is a Stochastic model (as we use a hash, several flows might
                                    be hashed on same slot)
      - Each flow has a CoDel managed queue.
      - Flows are linked onto two (Round Robin) lists,
        so that new flows have priority on old ones.
      
      - For a given flow, packets are not reordered (CoDel uses a FIFO)
      - head drops only.
      - ECN capability is on by default.
      - Very low memory footprint (64 bytes per flow)
      
      tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                            [ target TIME ] [ interval TIME ] [ noecn ]
                            [ quantum BYTES ]
      
      defaults : 1024 flows, 10240 packets limit, quantum : device MTU
                 target : 5ms (CoDel default)
                 interval : 100ms (CoDel default)
      
      Impressive results on load :
      
      class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0
       Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0)
       rate 201691Kbit 28595pps backlog 0b 312p requeues 0
       lended: 33063109 borrowed: 0 giants: 0
       tokens: -912 ctokens: -912
      
      class fq_codel 10:1735 parent 10:
       (dropped 1292, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:4524 parent 10:
       (dropped 1291, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:4e74 parent 10:
       (dropped 1290, overlimits 0 requeues 0)
       backlog 6056b 4p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
      class fq_codel 10:628a parent 10:
       (dropped 1289, overlimits 0 requeues 0)
       backlog 7570b 5p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
      class fq_codel 10:a4b3 parent 10:
       (dropped 302, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:c3c2 parent 10:
       (dropped 1284, overlimits 0 requeues 0)
       backlog 13626b 9p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.9ms
      class fq_codel 10:d331 parent 10:
       (dropped 299, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.0ms
      class fq_codel 10:d526 parent 10:
       (dropped 12160, overlimits 0 requeues 0)
       backlog 35870b 211p requeues 0
        deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
      class fq_codel 10:e2c6 parent 10:
       (dropped 1288, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:eab5 parent 10:
       (dropped 1285, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.9ms
      class fq_codel 10:f220 parent 10:
       (dropped 1289, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      
      qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
       Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71)
       rate 201697Kbit 28602pps backlog 0b 260p requeues 71
      qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn
       Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0)
       rate 201697Kbit 28602pps backlog 189352b 260p requeues 0
        maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
        new_flows_len 0 old_flows_len 11
      
      PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
      64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
      64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
      64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
      64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
      64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
      64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
      64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
      64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
      64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
      64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms
      
      10 packets transmitted, 10 received, 0% packet loss, time 8999ms
      rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms
      
      Much better than SFQ because of the priority given to new flows, and
      because the fast path dirties fewer cache lines.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4b549a2e
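The stochastic classification described in the principles above can be sketched as follows. This is a minimal userspace illustration, not the kernel code: `struct flow_key`, `flow_slot()`, the `perturbation` argument and the FNV-1a hash are all assumptions chosen for the example; only the 1024-slot default comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLOWS 1024u /* default number of flow slots quoted in the commit */

/* Hypothetical flow identifier (a classic 5-tuple). */
struct flow_key {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
    uint8_t proto;
};

/* Feed the low nbytes bytes of value into a running FNV-1a hash. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t value, unsigned nbytes)
{
    for (unsigned i = 0; i < nbytes; i++) {
        h ^= (value >> (8 * i)) & 0xffu;
        h *= 16777619u; /* FNV-1a 32-bit prime */
    }
    return h;
}

/*
 * Map a flow to one of FLOWS slots. Distinct flows may share a slot,
 * in which case they also share that slot's queue.
 */
static uint32_t flow_slot(const struct flow_key *k, uint32_t perturbation)
{
    uint32_t h = 2166136261u ^ perturbation; /* FNV offset basis, salted */

    assert(k != NULL);
    h = fnv1a_mix(h, k->saddr, 4);
    h = fnv1a_mix(h, k->daddr, 4);
    h = fnv1a_mix(h, k->sport, 2);
    h = fnv1a_mix(h, k->dport, 2);
    h = fnv1a_mix(h, k->proto, 1);
    return h % FLOWS;
}
```

Because the slot index is a hash reduced modulo the table size, two unrelated flows can land in the same slot, which is what makes the model stochastic rather than exact.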
    • L
      usb/net: rndis: move bus message definition · d5543206
      Committed by Linus Walleij
      This moves the bus message definition to land together with the
      other message types. This message is not used in the kernel but
      I'm keeping it anyway.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d5543206
    • L
      usb/net: rndis: fixup a few name prefixes · e20289ed
      Committed by Linus Walleij
      This switches a horde of NDIS_*-prefixed variables to the RNDIS_*
      prefix. Most of them aren't used much, and renaming them causes no changes.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e20289ed
    • L
      usb/net: rndis: merge command codes · 51491167
      Committed by Linus Walleij
      Switch the hyperv filter and rndis gadget driver to use the same command
      enumerators as the other drivers and delete the surplus command codes.
      Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      51491167
    • L
      usb/net: rndis: move and namespace PnP defines · c80174f3
      Committed by Linus Walleij
      This moves the PnP OID definitions to the RNDIS_* namespace
      and puts them in the next falling slot in the list. The comment
      above the PnP defines referred to some obsolete or out-of-tree
      driver, so I removed it. I also removed my own comments telling where
      each header segment came from, since we have moved everything around by
      this point anyway.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c80174f3
    • L
      usb/net: rndis: delete duplicate packet types · b1019432
      Committed by Linus Walleij
      The NDIS_*-prefixed packet types have equivalent RNDIS_*-
      prefixed types, and besides, nothing in the kernel uses these defines.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b1019432
    • L
      usb/net: rndis: merge media type definitions · 17c51b6c
      Committed by Linus Walleij
      Let's have a unified table of RNDIS media. We used to have a similar
      table with NDIS_* prefix from the gadget driver, but since we're only
      using RNDIS in the kernel (IIRC NDIS, non-remote, is for the windows-
      internal network drivers so what do we care) let's prefix everything
      with RNDIS. Some of the definitions were conflicting, in one of the
      defines 0x0B is bearer "CO WAN" and in two others "BPC". Well I took
      the majority vote. Two definitions of medium 0x09 call it "wireless
      WAN" and one votes for "wireless LAN", but in this case I am sticking
      with the minority: "Wide Area Network" does not make much sense in
      this case as far as I can tell.
      
      NOTE: latin singular and plural is so screwed up in these defines
      that it makes my eyes bleed. But I will not attempt to submit a
      patch converting all use of _MEDIA_ to _MEDIUM_ while I can probably
      tell from the semantics of the code that RNDIS_MEDIA_STATE_CONNECTED
      is most probably (erroneously) referring to a singular, unless it
      can return an array of connected media. I suspect these erroneous
      plurals are used in documentation and such so I don't want to
      mess around with things for no functional change.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      17c51b6c
    • L
      usb/net: rndis: group all status codes together · 91d6aef7
      Committed by Linus Walleij
      Move all RNDIS status codes so they appear in rising order and
      in one place in the header file.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      91d6aef7
    • L
      usb/net: rndis: delete surplus defines · c3ef5eae
      Committed by Linus Walleij
      These defines are not used in the kernel, and they have duplicate
      definitions under the RNDIS_* prefix.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3ef5eae
    • L
      usb/net: rndis: merge duplicate 802_* OIDs · 4cc6c4d5
      Committed by Linus Walleij
      The 802_* network OIDs were duplicated, so let's merge them and
      use the RNDIS_*-prefixed definitions from the HyperV driver.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4cc6c4d5
    • L
      usb/net: rndis: eliminate first set of duplicate OIDs · 8cdddc3f
      Committed by Linus Walleij
      The RNDIS protocol contains a vast number of object IDs (OIDs).
      The current headers had multiple definitions of these IDs, so
      let's use the nicely RNDIS_*-prefixed defines from the HyperV
      implementation, rename everywhere they're used, and copy+rename
      the few that were missing from this list of objects.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8cdddc3f
    • L
      usb/net: rndis: remove ambigous status codes · 007e5c8e
      Committed by Linus Walleij
      The RNDIS status codes are redefined with much strange ifdeffery
      and only one of these codes was used in the hyperv driver, and
      there it is very clearly referring to the RNDIS variant, not some
      other status. So clarify this by explicitly using the RNDIS_*
      prefixed status code in the hyperv driver and delete the
      duplicate defines.
      Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      007e5c8e
    • L
      usb/net: rndis: break out <linux/rndis.h> defines · 7591157e
      Committed by Linus Walleij
      As a first step to consolidate the RNDIS implementations, break out
      a common file with all the #defines and move it to <linux/rndis.h>.
      
      This also deletes the immediate duplicated defines in the
      <linux/rndis.h> file that yield a lot of compilation warnings.
      Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7591157e
    • L
      usb/net: rndis: inline the cpu_to_le32() macro · 7390e8b0
      Committed by Linus Walleij
      The header file <linux/usb/rndis_host.h> used a number of #defines
      that included the cpu_to_le32() macro to ensure the result will be
      in LE endianness. Inlining this into the code instead of using it
      in the code definitions yields consolidation opportunities later
      on as you will see in the following patches. The individual
      drivers also used local defines - all are switched over to the
      pattern of doing the conversion at the call sites instead.
      Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7390e8b0
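The pattern of keeping constants in CPU byte order and converting at the call site can be sketched in portable userspace C. Everything named here is hypothetical: `EXAMPLE_MSG_INIT`, `struct example_hdr`, `put_le32()` and `fill_header()` are illustrations, and `put_le32()` merely stands in for the kernel's cpu_to_le32() so that the sketch builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constant, defined in plain CPU byte order (no conversion). */
#define EXAMPLE_MSG_INIT 0x00000002u

/* Hypothetical wire header whose fields are little-endian on the wire. */
struct example_hdr {
    uint8_t msg_type[4];
    uint8_t msg_len[4];
};

/* Store a 32-bit value as little-endian bytes, whatever the host order. */
static void put_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xffu);
    out[1] = (uint8_t)((v >> 8) & 0xffu);
    out[2] = (uint8_t)((v >> 16) & 0xffu);
    out[3] = (uint8_t)((v >> 24) & 0xffu);
}

/* The byte-order conversion happens here, at the call site. */
static void fill_header(struct example_hdr *h, uint32_t len)
{
    assert(h != NULL);
    put_le32(h->msg_type, EXAMPLE_MSG_INIT);
    put_le32(h->msg_len, len);
}
```

With the conversion moved out of the constant, the same plain define can be shared by code that compares CPU-order values and code that writes wire-order values.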
  2. 11 May, 2012 2 commits
    • E
      codel: Controlled Delay AQM · 76e3cc12
      Committed by Eric Dumazet
      An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson.
      
      http://queue.acm.org/detail.cfm?id=2209336
      
      The main input of this AQM is no longer the queue size in bytes or
      packets, but the delay packets experience in the (FIFO) queue.
      
      As we don't have infinite memory, we can still drop packets in enqueue()
      in case of massive load, but the point of CoDel is to drop packets in
      dequeue(), using a control law based on two simple parameters:
      
      target : target sojourn time (default 5ms)
      interval : width of moving time window (default 100ms)
      
      Based on initial work from Dave Taht.
      
      Refactored to help future codel inclusion as a plugin for other linux
      qdisc (FQ_CODEL, ...), like RED.
      
      include/net/codel.h contains the codel algorithm, kept as close as
      possible to Kathleen's reference.
      
      net/sched/sch_codel.c contains the linux qdisc specific glue.
      
      Separate structures permit a memory efficient implementation of fq_codel
      (to be sent as a separate work) : Each flow has its own struct
      codel_vars.
      
      timestamps are taken at enqueue() time with 1024 ns precision, allowing
      a range of 2199 seconds in queue, and 100Gb links support. iproute2 uses
      usec as base unit.
      
      Selected packets are dropped, unless ECN is enabled and packets can get
      ECN mark instead.
      
      Tested from 2Mb to 10Gb speeds with no particular problems, on ixgbe and
      tg3 drivers (BQL enabled).
      
      Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ]
                                [ interval TIME ] [ ecn ]
      
      qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn
       Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0)
       rate 202365Kbit 16708pps backlog 113550b 75p requeues 0
        count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us
        maxpacket 1514 ecn_mark 84399 drop_overlimit 0
      
      CoDel must be seen as a base module, and should be used keeping in mind
      there is still a FIFO queue. So a typical setup will probably need a
      hierarchy of several qdiscs and packet classifiers to be able to meet
      whatever constraints a user might have.
      
      One possible example would be to use fq_codel, which combines Fair
      Queueing and CoDel, in replacement of sfq / sfq_red.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
      Cc: Kathleen Nichols <nichols@pollere.com>
      Cc: Van Jacobson <van@pollere.net>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Matt Mathis <mattmathis@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76e3cc12
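A minimal sketch of two ideas from the message above: timestamps kept in units of 1024 ns, and the decision of whether the sojourn time has stayed above target for a whole interval. The names (`sketch_time_t`, `ns_to_units()`, `time_after()`, `above_target_for_interval()`) are assumptions for this example and the logic is simplified; the drop scheduling that follows this decision in CoDel is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TIME_SHIFT 10 /* one time unit = 2^10 ns = 1024 ns */

typedef uint32_t sketch_time_t;

/* Convert nanoseconds to 1024 ns units, truncating to 32 bits. */
static sketch_time_t ns_to_units(uint64_t ns)
{
    return (sketch_time_t)(ns >> TIME_SHIFT);
}

/* Wrap-safe "a is later than b", valid for differences below 2^31 units. */
static bool time_after(sketch_time_t a, sketch_time_t b)
{
    return (int32_t)(a - b) > 0;
}

struct sojourn_state {
    sketch_time_t first_above_time; /* 0 means "not above target" */
};

/*
 * Returns true once the sojourn time has stayed at or above target for
 * longer than interval. Simplification: 0 is used as a sentinel, so a
 * deadline that happens to wrap to exactly 0 is not handled.
 */
static bool above_target_for_interval(struct sojourn_state *s,
                                      sketch_time_t now,
                                      sketch_time_t sojourn,
                                      sketch_time_t target,
                                      sketch_time_t interval)
{
    assert(s != NULL);
    if (sojourn < target) {
        s->first_above_time = 0; /* delay is fine again: reset */
        return false;
    }
    if (s->first_above_time == 0) {
        s->first_above_time = now + interval; /* start the clock */
        return false;
    }
    return time_after(now, s->first_above_time);
}
```

With 32-bit timestamps in 1024 ns units, the signed comparison in `time_after()` covers differences of up to 2^31 units, which is about 2199 seconds and matches the range quoted in the commit message.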
    • J
      etherdevice.h: Add ether_addr_equal_64bits · baf523c9
      Committed by Joe Perches
      Add an optimized boolean function to check if
      2 ethernet addresses are the same.
      
      This is to avoid any confusion about compare_ether_addr_64bits
      returning an unsigned, and not being able to use the
      compare_ether_addr_64bits function for sorting a la memcmp.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      baf523c9
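The difference between a boolean equality test and a compare function with no ordering can be sketched in userspace C. Both helpers below are hypothetical and byte-wise; they are not the kernel implementations, which operate on wider loads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6 /* length of an Ethernet (MAC) address in bytes */

/*
 * Hypothetical XOR-style compare: returns 0 when the addresses match and
 * some nonzero unsigned value otherwise. The result has no sign, so it
 * says nothing about which address sorts first.
 */
static unsigned addr_xor_compare(const uint8_t *a, const uint8_t *b)
{
    unsigned diff = 0;

    assert(a != NULL && b != NULL);
    for (int i = 0; i < ETH_ALEN; i++)
        diff |= (unsigned)(a[i] ^ b[i]);
    return diff;
}

/* Boolean form: the return type states exactly what the caller learns. */
static bool addr_equal(const uint8_t *a, const uint8_t *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, ETH_ALEN) == 0;
}
```

Swapping the arguments of `addr_xor_compare()` yields the same value, which is why such a function cannot serve as a memcmp-style sort comparator.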
  3. 10 May, 2012 2 commits
  4. 09 May, 2012 7 commits
    • F
      netfilter: hashlimit: byte-based limit mode · 0197dee7
      Committed by Florian Westphal
      Can be used e.g. for ingress traffic policing or
      to detect when a host/port consumes more bandwidth than expected.
      
      This is done by optionally making cost mean
      "cost per 16-byte-chunk-of-data" instead of "cost per packet".
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      0197dee7
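A rough sketch of charging a packet by 16-byte chunks rather than per packet. The helper names and the credit model are hypothetical, and the rounding policy (every packet is charged for at least one chunk) is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SHIFT 4 /* 2^4 = 16-byte chunks */

/* Assumed rounding: whole chunks plus one, so no packet is free. */
static uint32_t len_to_chunks(uint32_t len)
{
    return (len >> CHUNK_SHIFT) + 1;
}

/* Byte mode: the configured cost applies per chunk, not per packet. */
static uint64_t byte_mode_cost(uint32_t len, uint32_t cost_per_chunk)
{
    return (uint64_t)len_to_chunks(len) * cost_per_chunk;
}

/* Deduct the cost if enough credit remains; otherwise the limit is hit. */
static bool charge(uint64_t *credit, uint64_t cost)
{
    assert(credit != NULL);
    if (*credit < cost)
        return false;
    *credit -= cost;
    return true;
}
```

In this model a 1500-byte packet is charged for 94 chunks, so large packets exhaust the credit much faster than small ones, which is the point of a byte-based limit.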
    • H
      netfilter: add xt_hmark target for hash-based skb marking · cf308a1f
      Committed by Hans Schillstrom
      The target allows you to create rules in the "raw" and "mangle" tables
      which set the skbuff mark by means of hash calculation within a given
      range. The nfmark can influence the routing method (see "Use netfilter
      MARK value as routing key") and can also be used by other subsystems to
      change their behaviour.
      
      [ Part of this patch has been refactored and modified by Pablo Neira Ayuso ]
      Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      cf308a1f
    • H
      netfilter: ip6_tables: add flags parameter to ipv6_find_hdr() · 84018f55
      Committed by Hans Schillstrom
      This patch adds the flags parameter to ipv6_find_hdr. These flags
      allow us to:
      
      * know if this is a fragment.
      * stop at the AH header, so the information contained in that header
        can be used for some specific packet handling.
      
      This patch also adds the offset parameter for inspection of one
      inner IPv6 header that is contained in error messages.
      Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      84018f55
    • P
      netfilter: remove ip_queue support · d16cf20e
      Committed by Pablo Neira Ayuso
      This patch removes ip_queue support which was marked as obsolete
      years ago. The nfnetlink_queue module provides a more advanced
      user-space packet queueing mechanism.
      
      This patch also removes capability code included in SELinux that
      refers to ip_queue. Otherwise, we break compilation.
      
      Several warnings have been sent regarding this to the mailing list
      in the past month without anyone raising a hand to stop this
      with some strong argument.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      d16cf20e
    • P
      netfilter: nf_conntrack: fix explicit helper attachment and NAT · 6714cf54
      Committed by Pablo Neira Ayuso
      Explicit helper attachment via the CT target is broken with NAT
      if non-standard ports are used. This problem was hidden behind
      the automatic helper assignment routine. Thus, it becomes more
      noticeable now that we can disable the automatic helper assignment
      with Eric Leblond's:
      
      9e8ac5a netfilter: nf_ct_helper: allow to disable automatic helper assignment
      
      Basically, nf_conntrack_alter_reply asks for the helper to be looked
      up again if NAT is enabled. Unfortunately, we don't have the conntrack
      template at that point anymore.
      
      Since we don't want to rely on the automatic helper assignment,
      we can skip the second look-up and stick to the helper that was
      attached by iptables. With the CT target, the user is in full
      control of helper attachment, thus, the policy is to trust what
      the user explicitly configures via iptables (no automatic magic
      anymore).
      
      Interestingly, this bug was hidden by the automatic helper look-up
      code. But it can be easily triggered if you attach the helper on
      a non-standard port, e.g.
      
      iptables -I PREROUTING -t raw -p tcp --dport 8888 \
      	-j CT --helper ftp
      
      and you have disabled the automatic helper assignment.
      
      I added the IPS_HELPER_BIT that allows us to differentiate between
      a helper that has been explicitly attached and those that have been
      automatically assigned. I didn't come up with a better solution
      (having backward compatibility in mind).
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      6714cf54
    • J
      ipvs: always update some of the flags bits in backup · cdcc5e90
      Committed by Julian Anastasov
      	As the goal is to mirror the inactconns/activeconns
      counters in the backup server, make sure the cp->flags are
      updated even if cp is still not bound to dest. If cp->flags
      are not updated ip_vs_bind_dest will rely only on the initial
      flags when updating the counters. To avoid mistakes and
      complicated checks for protocol state rely only on the
      IP_VS_CONN_F_INACTIVE bit when updating the counters.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      cdcc5e90
    • J
      etherdev.h: Convert int is_<foo>_ether_addr to bool · b44907e6
      Committed by Joe Perches
      Make the return value explicitly true or false.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b44907e6
  5. 08 May, 2012 3 commits
    • D
      netdev/of/phy: Add MDIO bus multiplexer support. · 0ca2997d
      Committed by David Daney
      This patch adds a somewhat generic framework for MDIO bus
      multiplexers.  It is modeled on the I2C multiplexer.
      
      The multiplexer is needed if there are multiple PHYs with the same
      address connected to the same MDIO bus adapter, or if there is
      insufficient electrical drive capability for all the connected PHY
      devices.
      
      Conceptually it could look something like this:
      
                         ------------------
                         | Control Signal |
                         --------+---------
                                 |
       ---------------   --------+------
       | MDIO MASTER |---| Multiplexer |
       ---------------   --+-------+----
                           |       |
                           C       C
                           h       h
                           i       i
                           l       l
                           d       d
                           |       |
           ---------       A       B   ---------
           |       |       |       |   |       |
           | PHY@1 +-------+       +---+ PHY@1 |
           |       |       |       |   |       |
           ---------       |       |   ---------
           ---------       |       |   ---------
           |       |       |       |   |       |
           | PHY@2 +-------+       +---+ PHY@2 |
           |       |                   |       |
           ---------                   ---------
      
      This framework configures the bus topology from device tree data.  The
      mechanics of switching the multiplexer are left to device specific
      drivers.
      
      The follow-on patch contains a multiplexer driven by GPIO lines.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ca2997d
    • D
      netdev/of/phy: New function: of_mdio_find_bus(). · 25106022
      Committed by David Daney
      Add of_mdio_find_bus() which allows an mii_bus to be located given its
      associated device tree node.
      
      This is needed by the follow-on patch to add a driver for MDIO bus
      multiplexers.
      
      The of_mdiobus_register() function is modified so that the device tree
      node is recorded in the mii_bus.  Then we can find it again by
      iterating over all mdio_bus_class devices.
      
      Because the OF device tree has now become an integral part of the
      kernel, this can live in mdio_bus.c (which contains the needed
      mdio_bus_class structure) instead of of_mdio.c.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      25106022
    • J
      net: compare_ether_addr[_64bits]() has no ordering · 1c430a72
      Committed by Johannes Berg
      Neither compare_ether_addr() nor compare_ether_addr_64bits()
      (as it can fall back to the former) has comparison semantics
      like memcmp() where the sign of the return value indicates sort
      order. We had a bug in the wireless code due to a blind memcmp
      replacement because of this.
      
      A cursory look suggests that the wireless bug was the only one
      due to this semantic difference.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1c430a72
  6. 07 May, 2012 1 commit
  7. 04 May, 2012 4 commits
  8. 03 May, 2012 2 commits
    • Y
      tcp: early retransmit: delayed fast retransmit · 750ea2ba
      Committed by Yuchung Cheng
      Implement the advanced early retransmit (sysctl_tcp_early_retrans==2).
      It delays the fast retransmit by an interval of RTT/4. We borrow the
      RTO timer to implement the delay. If we receive another ACK or send
      a new packet, the timer is cancelled and restored to the original RTO
      value offset by the time elapsed.  When the delayed-ER timer fires,
      we enter fast recovery and perform fast retransmit.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      750ea2ba
    • Y
      tcp: early retransmit · eed530b6
      Committed by Yuchung Cheng
      This patch implements RFC 5827 early retransmit (ER) for TCP.
      It reduces DUPACK threshold (dupthresh) if outstanding packets are
      less than 4 to recover losses by fast recovery instead of timeout.
      
      While the algorithm is simple, small but frequent network reordering
      makes this feature dangerous: the connection repeatedly enters
      false recovery and degrades performance. Therefore we implement
      a mitigation suggested in the appendix of the RFC that delays
      entering fast recovery by a small interval, i.e., RTT/4. Currently
      ER is conservative and is disabled for the rest of the connection
      after the first reordering event. A large scale web server
      experiment on the performance impact of ER is summarized in
      section 6 of the paper "Proportional Rate Reduction for TCP",
      IMC 2011. http://conferences.sigcomm.org/imc/2011/docs/p155.pdf
      
      Note that Linux has a similar feature called THIN_DUPACK. The
      differences are that THIN_DUPACK does not mitigate reordering and is
      only used after slow start. Currently ER is disabled if THIN_DUPACK is
      enabled. I would be happy to merge THIN_DUPACK feature with ER if
      people think it's a good idea.
      
      ER is enabled by sysctl_tcp_early_retrans:
        0: Disables ER
      
        1: Reduce dupthresh to packets_out - 1 when outstanding packets < 4.
      
        2: (Default) reduce dupthresh like mode 1. In addition, delay
           entering fast recovery by RTT/4.
      
      Note: mode 2 is implemented in the third part of this patch series.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eed530b6
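The three sysctl modes listed above can be sketched as two small functions. This is an illustration under stated assumptions, not the kernel code: the function names are hypothetical, the standard threshold of 3 duplicate ACKs is assumed as the fallback, and leaving the threshold untouched when fewer than 2 packets are outstanding is a choice made for this sketch.

```c
#include <assert.h>

#define DEFAULT_DUPTHRESH 3u /* standard fast-retransmit threshold (assumed) */

/* Modes mirror the sysctl_tcp_early_retrans values in the commit message. */
enum { ER_OFF = 0, ER_BASIC = 1, ER_DELAYED = 2 };

/* Duplicate-ACK threshold after applying early retransmit. */
static unsigned er_dupthresh(int mode, unsigned packets_out)
{
    assert(mode >= ER_OFF && mode <= ER_DELAYED);
    if (mode == ER_OFF || packets_out >= 4 || packets_out < 2)
        return DEFAULT_DUPTHRESH;
    return packets_out - 1; /* modes 1 and 2, fewer than 4 outstanding */
}

/* Extra delay before entering fast recovery: RTT/4 in mode 2, else none. */
static unsigned er_delay_us(int mode, unsigned rtt_us)
{
    assert(mode >= ER_OFF && mode <= ER_DELAYED);
    return mode == ER_DELAYED ? rtt_us / 4 : 0;
}
```

For example, with 3 packets outstanding and mode 1, two duplicate ACKs are enough to trigger recovery instead of the usual three.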
  9. 01 May, 2012 4 commits