1. 12 Jul, 2018 36 commits
  2. 11 Jul, 2018 4 commits
    • selftests: forwarding: mirror_lib: Tighten up VLAN capture · db560d16
      Petr Machata committed
      The function do_test_span_vlan_dir_ips() is used for testing whether
      mirrored packets are VLAN-encapsulated. But since it only considers
      VLAN encapsulation, it may end up matching unmirrored ARP traffic as
      well. One consequence is a rare failure of mirror_gre_vlan_bridge_1q's
      test_gretap_untagged_egress. Decreasing ping cadence in mirror_test()
      makes the problem easily reproducible.
      
      Therefore tighten up the match criterion to only count those 802.1q
      packets where the next header is IP.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
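The tightened criterion can be illustrated outside of tc. The following is a minimal sketch, not the selftest's actual filter: it parses a raw Ethernet frame and counts it only if it carries an 802.1q tag with the expected VLAN ID and the encapsulated protocol is IPv4. The names `is_vlan_ip` and `make_frame` are hypothetical helpers introduced here for illustration.

```python
import struct

ETH_P_8021Q = 0x8100  # 802.1q VLAN tag (TPID)
ETH_P_IP = 0x0800     # IPv4
ETH_P_ARP = 0x0806    # ARP


def is_vlan_ip(frame: bytes, vid: int) -> bool:
    """Return True only for 802.1q frames with the given VLAN ID whose
    encapsulated protocol is IPv4 (hypothetical helper)."""
    # 12 bytes of MAC addresses + 4-byte VLAN tag + 2-byte inner EtherType
    if len(frame) < 18:
        return False
    tpid, tci, inner = struct.unpack("!HHH", frame[12:18])
    # The VLAN ID is the low 12 bits of the tag control information.
    return tpid == ETH_P_8021Q and (tci & 0x0FFF) == vid and inner == ETH_P_IP


def make_frame(vid: int, inner_type: int) -> bytes:
    """Build a minimal tagged frame with dummy MACs and payload (illustration only)."""
    return b"\x00" * 12 + struct.pack("!HHH", ETH_P_8021Q, vid, inner_type) + b"\x00" * 28
```

With a check like this, a tagged ARP frame on the same VLAN is no longer counted, which is the false positive the commit describes.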
    • Merge branch 'cake-qdisc' · 5025b99c
      David S. Miller committed
      Toke Høiland-Jørgensen says:
      
      ====================
      sched: Add Common Applications Kept Enhanced (cake) qdisc
      
      This patch series adds the CAKE qdisc, and has been split up to ease
      review.
      
      I have attempted to split out each configurable feature into its own patch.
      The first commit adds the base shaper and packet scheduler, while
      subsequent commits add the optional features. The full userspace API and
      most data structures are included in this commit, but options not
      understood in the base version will be ignored.
      
      The result of applying the entire series is identical to the out-of-tree
      version that has seen extensive testing in previous deployments, most
      notably as an out-of-tree patch to OpenWrt. However, note that I have only
      compile-tested the individual patches, so the whole series should be
      considered as a unit.
      
      ---
      Changelog
      
      v19:
        - Rebase to current net-next.
        - Don't rely on the value of sch->q.qlen to break loops; fixes possible
          infinite loop on multi-queue devices.
        - Don't overwrite NAT flag when setting flow mode.
      
      v18:
        - Rework classification logic in the diffserv case to always hash if
          filter doesn't select a queue, and to run TC filters before
          selecting the diffserv tin (allowing filter to influence this).
        - Make sure we always call qdisc_watchdog_init() in cake_init(), so we
          don't crash in cake_destroy().
      
      v17:
        - Rebase to newest net-next and move the conntrack callback to
          nf_ct_hook
        - Fix a compile error when NF_CONNTRACK is unset.
      
      v16:
        - Move conntrack lookup function into conntrack core and read it via
          RCU so it is only active when the nf_conntrack module is loaded.
          This avoids the module dependency on conntrack for NAT mode. Thanks
          to Pablo for the idea.
      
      v15:
        - Handle ECN flags in ACK filter
      
      v14:
        - Handle seqno wraps and DSACKs in ACK filter
      
      v13:
        - Avoid ktime_t to scalar compares
        - Add class dumping and basic stats
        - Fail with ENOTSUPP when requesting NAT mode and conntrack is not
          available.
        - Parse all TCP options in ACK filter and make sure to only drop safe
          ones. Also handle SACK ranges properly.
      
      v12:
        - Get rid of custom time typedefs. Use ktime_t for time and u64 for
          duration instead.
      
      v11:
        - Fix overhead compensation calculation for GSO packets
        - Change configured rate to be u64 (I ran out of bits before I ran out
          of CPU when testing the effects of the above)
      
      v10:
        - Christmas tree gardening (fix variable declarations to be in reverse
          line length order)
      
      v9:
        - Remove duplicated checks around kvfree() and just call it
          unconditionally.
        - Don't pass __GFP_NOWARN when allocating memory
        - Move options in cake_dump() that are related to optional features to
          later patches implementing the features.
        - Support attaching filters to the qdisc and use the classification
          result to select flow queue.
        - Support overriding diffserv priority tin from skb->priority
      
      v8:
        - Remove inline keyword from function definitions
        - Simplify ACK filter; remove the complex state handling to make the
          logic easier to follow. This will potentially be a bit less efficient,
          but I have not been able to measure a difference.
      
      v7:
        - Split up patch into a series to ease review.
        - Constify the ACK filter.
      
      v6:
        - Fix 6in4 encapsulation checks in ACK filter code
        - Checkpatch fixes
      
      v5:
        - Refactor ACK filter code and hopefully fix the safety issues
          properly this time.
      
      v4:
        - Only split GSO packets if shaping at speeds <= 1Gbps
        - Fix overhead calculation code to also work for GSO packets
        - Don't re-implement kvzalloc()
        - Remove local header include from out-of-tree build (fixes kbuild-bot
          complaint).
        - Several fixes to the ACK filter:
          - Check pskb_may_pull() before deref of transport headers.
          - Don't run ACK filter logic on split GSO packets
          - Fix TCP sequence number compare to deal with wraparounds
      
      v3:
        - Use IS_REACHABLE() macro to fix compilation when sch_cake is
          built-in and conntrack is a module.
        - Switch the stats output to use nested netlink attributes instead
          of a versioned struct.
        - Remove GPL boilerplate.
        - Fix array initialisation style.
      
      v2:
        - Fix kbuild test bot complaint
        - Clean up the netlink ABI
        - Fix checkpatch complaints
        - A few tweaks to the behaviour of cake based on testing carried out
          while writing the paper.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sch_cake: Conditionally split GSO segments · 0c850344
      Toke Høiland-Jørgensen committed
      At lower bandwidths, the transmission time of a single GSO segment can add
      an unacceptable amount of latency due to HOL blocking. Furthermore, with a
      software shaper, any tuning mechanism employed by the kernel to control the
      maximum size of GSO segments is thrown off by the artificial limit on
      bandwidth. For this reason, we split GSO segments into their individual
      packets iff the shaper is active and configured to a bandwidth <= 1 Gbps.
      Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
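To make the latency argument concrete, here is a rough sketch of the arithmetic and of the splitting rule as stated in the message. It is not the kernel code: the function names are hypothetical, and treating a rate of 0 as "shaper inactive" is an assumption of this sketch.

```python
GBPS = 1_000_000_000  # bits per second


def serialization_delay_ms(size_bytes: int, rate_bps: int) -> float:
    """Time needed to put size_bytes on the wire at rate_bps, in milliseconds."""
    return size_bytes * 8 / rate_bps * 1000.0


def should_split_gso(rate_bps: int) -> bool:
    """Split GSO segments iff the shaper is active and set to <= 1 Gbps.

    Assumption of this sketch: rate_bps == 0 models an inactive
    (unlimited) shaper.
    """
    return 0 < rate_bps <= GBPS


# A 64 KiB GSO segment blocks a 10 Mbps link for roughly 52 ms,
# but a 1 Gbps link for only about half a millisecond.
slow = serialization_delay_ms(65536, 10_000_000)
fast = serialization_delay_ms(65536, GBPS)
```

Every packet queued behind an unsplit segment waits at least that long, which is the head-of-line blocking the commit refers to.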
    • sch_cake: Add overhead compensation support to the rate shaper · a729b7f0
      Toke Høiland-Jørgensen committed
      This commit adds configurable overhead compensation support to the rate
      shaper. With this feature, userspace can configure the actual bottleneck
      link overhead and encapsulation mode used, which will be used by the shaper
      to calculate the precise duration of each packet on the wire.
      
      This feature is needed because CAKE is often deployed one or two hops
      upstream of the actual bottleneck (which can be, e.g., inside a DSL or
      cable modem). In this case, the link layer characteristics and overhead
      reported by the kernel do not match the actual bottleneck. Being able to
      set the actual values in use makes it possible to configure the shaper rate
      much closer to the actual bottleneck rate (our experience shows it is
      possible to get within 0.1% of the actual physical bottleneck rate), thus
      keeping latency low without sacrificing bandwidth.
      keeping latency low without sacrificing bandwidth.
      
      The overhead compensation has three tunables: A fixed per-packet overhead
      size (which, if set, will be accounted from the IP packet header), a
      minimum packet size (MPU) and a framing mode supporting either ATM or PTM
      framing. We include a set of common keywords in TC to help users configure
      the right parameters. If no overhead value is set, the value reported by
      the kernel is used.
      Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
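The three tunables can be combined as in the sketch below. This is an illustration under stated assumptions, not the sch_cake implementation: `wire_length` and its parameters are hypothetical names, and the framing arithmetic is the generic accounting for ATM (48 payload bytes per 53-byte cell, last cell padded) and PTM (64b/65b encoding, one extra byte per 64 bytes), which the commit message itself does not spell out.

```python
def wire_length(ip_len: int, overhead: int = 0, mpu: int = 0,
                framing: str = "none") -> int:
    """Estimate the on-wire size of a packet, in bytes (hypothetical helper).

    ip_len   -- packet length counted from the IP header
    overhead -- fixed per-packet overhead added on top of ip_len
    mpu      -- minimum packet size; shorter packets are rounded up to it
    framing  -- "none", "atm" or "ptm"
    """
    length = max(ip_len + overhead, mpu)
    if framing == "atm":
        # Round up to whole cells: 48 payload bytes travel in 53 bytes.
        length = -(-length // 48) * 53
    elif framing == "ptm":
        # 64b/65b encoding: one extra byte per started 64-byte block.
        length += -(-length // 64)
    elif framing != "none":
        raise ValueError(f"unknown framing mode: {framing}")
    return length


# A 1500-byte IP packet with 44 bytes of overhead needs 33 ATM cells.
atm_size = wire_length(1500, overhead=44, framing="atm")  # 33 * 53 = 1749
```

The shaper would then charge each packet for this adjusted size instead of the size the kernel reports, which is what allows the configured rate to sit close to the real bottleneck rate.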