1. 13 October 2016, 1 commit
    • cfg80211: support virtual interfaces with different beacon intervals · 0c317a02
      Committed by Purushottam Kushwaha
      This commit provides a mechanism for host drivers to advertise support
      for different beacon intervals among the respective interface
      combinations in a group, through NL80211_IFACE_COMB_BI_MIN_GCD (u32).
      
      This value will be compared against GCD of all beaconing interfaces of
      matching combinations.
      
      If the driver doesn't advertise this value, the old behaviour where
      all beacon intervals must be identical is retained.
      
      If it is specified, then every beacon interval of an interface in the
      interface combination, as well as the GCD of all active beacon intervals
      in the combination, must be greater than or equal to this value.
      Signed-off-by: Purushottam Kushwaha <pkushwah@qti.qualcomm.com>
      [change commit message, some variable names, small other things]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      0c317a02
  2. 12 October 2016, 2 commits
  3. 08 October 2016, 10 commits
  4. 04 October 2016, 1 commit
  5. 01 October 2016, 2 commits
  6. 30 September 2016, 7 commits
  7. 28 September 2016, 1 commit
  8. 27 September 2016, 1 commit
  9. 26 September 2016, 4 commits
    • cfg80211: add checks for beacon rate, extend to mesh · 8564e382
      Committed by Johannes Berg
      The previous commit added support for specifying the beacon rate
      for AP mode. Add feature checks for this, and extend it to also
      support the rate configuration for mesh networks. For IBSS it's
      not as simple due to joining etc., so that's not yet supported.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      8564e382
    • netfilter: nft_log: complete NFTA_LOG_FLAGS attr support · ff107d27
      Committed by Liping Zhang
      The NFTA_LOG_FLAGS attribute is already supported, but the related
      NF_LOG_XXX flags are not exposed to userspace. So we cannot
      explicitly enable log flags to log the uid, tcp sequence, ip options
      and so on; that is, a rule such as "nft add rule filter output log uid"
      is not supported yet.
      
      So move the NF_LOG_XXX macro definitions to uapi/../nf_log.h. To
      keep consistent with other modules, change NF_LOG_MASK to refer to
      all supported log flags. On the other hand, add a new
      NF_LOG_DEFAULT_MASK to refer to the original default log flags.
      
      Finally, if the user specifies unsupported log flags, or if
      NFTA_LOG_GROUP and NFTA_LOG_FLAGS are set at the same time, report
      EINVAL to userspace.
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      ff107d27
    • netfilter: nf_tables: add range expression · 0f3cd9b3
      Committed by Pablo Neira Ayuso
      Inverse ranges != [a,b] are not currently possible because rules are
      composites of && operations, and we need to express this:
      
      	data < a || data > b
      
      This patch adds a new range expression. Positive ranges can already
      be expressed through two cmp expressions:
      
      	cmp(sreg, data, >=)
      	cmp(sreg, data, <=)
      
      This new range expression provides an alternative way to express this.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      0f3cd9b3
    • ALSA: control: cage TLV_DB_RANGE_HEAD in kernel land because it was obsoleted · 318824d3
      Committed by Takashi Sakamoto
      In commit bf1d1c9b ("ALSA: tlv: add DECLARE_TLV_DB_RANGE()"), the new
      macro was added so that "dB range information can be specified without
      having to count the items manually for TLV_DB_RANGE_HEAD()". In short,
      the TLV_DB_RANGE_HEAD macro was obsoleted.
      
      In commit 46e860f7 ("ALSA: rename TLV-related macros so that they're
      friendly to user applications"), TLV-related macros were exposed so
      that applications in user land can parse data structured in
      Type/Length/Value shape. That commit exposed as many TLV-related
      macros as possible, and the obsolete TLV_DB_RANGE_HEAD() was included
      in the list of exposed macros.
      
      This situation brings some confusion to application developers,
      because they might assume that every exposed macro has its own
      purpose and is useful for applications.
      
      For this reason, this commit moves the TLV_DB_RANGE_HEAD macro from
      the UAPI header back to a kernel-land header. The above commit was
      done within the same development period for kernel 4.9, and is thus
      not yet published, so this change should bring no confusion to user
      land.
      
      Reference: commit bf1d1c9b ("ALSA: tlv: add DECLARE_TLV_DB_RANGE()")
      Reference: commit 46e860f7 ("ALSA: rename TLV-related macros so that they're friendly to user applications")
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      318824d3
  10. 25 September 2016, 1 commit
  11. 24 September 2016, 1 commit
    • net: Update API for VF vlan protocol 802.1ad support · 79aab093
      Committed by Moshe Shemesh
      Introduce a new rtnl UAPI that exposes a list of vlans per VF, giving
      user-space applications the ability to specify it for the VF, as an
      option to support 802.1ad.
      We adjusted the ip link tool to support this option.
      
      For future use cases, the new UAPI supports multiple vlans. For now we
      limit the list size to a single vlan in the kernel.
      Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
      compatibility with older versions of the ip link tool.
      
      Add a vlan protocol parameter to the ndo_set_vf_vlan callback.
      We kept 802.1Q as the drivers' default vlan protocol.
      Suitable ip link tool command examples:
        Set vf vlan protocol 802.1ad:
          ip link set eth0 vf 1 vlan 100 proto 802.1ad
        Set vf to VST (802.1Q) mode:
          ip link set eth0 vf 1 vlan 100 proto 802.1Q
        Or by omitting the new parameter
          ip link set eth0 vf 1 vlan 100
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      79aab093
  12. 23 September 2016, 5 commits
    • bpf: add helper to invalidate hash · 7a4b28c6
      Committed by Daniel Borkmann
      Add a small helper that complements 36bbef52 ("bpf: direct packet
      write and access for helpers for clsact progs") by invalidating the
      current skb->hash after headers have been mangled via direct packet
      write.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7a4b28c6
    • net_sched: sch_fq: account for schedule/timers drifts · fefa569a
      Committed by Eric Dumazet
      It looks like the following patch can make FQ very precise, even in
      VMs or on stressed hosts. It matters at high pacing rates.
      
      We take into account the difference between the time that was
      programmed when the last packet was sent and the current time (a
      drift of tens of usecs is often observed).
      
      Add an EWMA of the unthrottle latency to help diagnostics.
      
      This latency is the difference between the current time and the
      oldest packet in the delayed RB-tree. It accounts for the high
      resolution timer latency, but can differ under stress, as
      fq_check_throttled() can opportunistically be called from a dequeue()
      invoked after an enqueue() for a different flow.
      
      Tested:
      // Start a 10Gbit flow
      $ netperf --google-pacing-rate 1250000000 -H lpaa24 -l 10000 -- -K bbr &
      
      Before patch :
      $ sar -n DEV 10 5 | grep eth0 | grep Average
      Average:         eth0  17106.04 756876.84   1102.75 1119049.02      0.00      0.00      0.52
      
      After patch :
      $ sar -n DEV 10 5 | grep eth0 | grep Average
      Average:         eth0  17867.00 800245.90   1151.77 1183172.12      0.00      0.00      0.52
      
      A new iproute2 tc can output the 'unthrottle latency' :
      
      $ tc -s qd sh dev eth0 | grep latency
        0 gc, 0 highprio, 32490767 throttled, 2382 ns latency
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fefa569a
    • netfilter: nft_queue: add _SREG_QNUM attr to select the queue number · 8061bb54
      Committed by Liping Zhang
      Currently, the user can specify the queue numbers via the _QUEUE_NUM
      and _QUEUE_TOTAL attributes, which is enough in most situations.
      
      But actually, it is not very flexible. For example:
        tcp dport 80 mapped to queue0
        tcp dport 81 mapped to queue1
        tcp dport 82 mapped to queue2
      To do this, we must add 3 nft rules, and more mappings mean
      more rules ...
      
      So take one register to select the queue number; then we can add one
      simple rule to map the queues, perhaps like this:
        queue num tcp dport map { 80:0, 81:1, 82:2 ... }
      
      Florian Westphal also proposed wider usage scenarios:
        queue num jhash ip saddr . ip daddr mod ...
        queue num meta cpu ...
        queue num meta mark ...
      
      The last point is how to load the queue number from the source
      register. We could use *(u16 *)&regs->data[reg] to load it, just as
      the nat expression does for its l4 port.
      
      But we will cooperate with the hash, meta cpu, and meta mark
      expressions, which all store their results as u32. Casting that to a
      u16 pointer and dereferencing it would produce the wrong result on
      big-endian systems.
      
      So just keep it simple: treat the queue number as u32, although u16
      would already be enough.
      Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      8061bb54
    • nsfs: add ioctl to get a parent namespace · a7306ed8
      Committed by Andrey Vagin
      Pid and user namespaces are hierarchical, but there is currently no
      way to discover their parent-child relationships.
      
      In the future we will use this interface to dump and restore nested
      namespaces.
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: Andrei Vagin <avagin@openvz.org>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      a7306ed8
    • nsfs: add ioctl to get an owning user namespace for ns file descriptor · 6786741d
      Committed by Andrey Vagin
      Each namespace has an owning user namespace, and there is currently
      no way to discover these relationships.
      
      Understanding namespace relationships makes it possible to answer
      the question: what capability does process X have to perform
      operations on a resource governed by namespace Y?
      
      After a long discussion, Eric W. Biederman proposed to use ioctl-s for
      this purpose.
      
      The NS_GET_USERNS ioctl returns a file descriptor referring to the
      owning user namespace.
      It returns EPERM if the target namespace is outside of the current
      user namespace.
      
      v2: rename parent to relative
      
      v3: Add a missing mntput when returning -EAGAIN --EWB
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Link: https://lkml.org/lkml/2016/7/6/158
      Signed-off-by: Andrei Vagin <avagin@openvz.org>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      6786741d
  13. 22 September 2016, 3 commits
  14. 21 September 2016, 1 commit
    • tcp_bbr: add BBR congestion control · 0f8782ea
      Committed by Neal Cardwell
      This commit implements a new TCP congestion control algorithm: BBR
      (Bottleneck Bandwidth and RTT). A detailed description of BBR will be
      published in ACM Queue, Vol. 14 No. 5, September-October 2016, as
      "BBR: Congestion-Based Congestion Control".
      
      BBR has significantly increased throughput and reduced latency for
      connections on Google's internal backbone networks and google.com and
      YouTube Web servers.
      
      BBR requires only changes on the sender side, not in the network or
      the receiver side. Thus it can be incrementally deployed on today's
      Internet, or in datacenters.
      
      The Internet has predominantly used loss-based congestion control
      (largely Reno or CUBIC) since the 1980s, relying on packet loss as the
      signal to slow down. While this worked well for many years, loss-based
      congestion control is unfortunately out-dated in today's networks. On
      today's Internet, loss-based congestion control causes the infamous
      bufferbloat problem, often causing seconds of needless queuing delay,
      since it fills the bloated buffers in many last-mile links. On today's
      high-speed long-haul links using commodity switches with shallow
      buffers, loss-based congestion control has abysmal throughput because
      it over-reacts to losses caused by transient traffic bursts.
      
      In 1981 Kleinrock and Gale showed that the optimal operating point for
      a network maximizes delivered bandwidth while minimizing delay and
      loss, not only for single connections but for the network as a
      whole. Finding that optimal operating point has been elusive, since
      any single network measurement is ambiguous: network measurements are
      the result of both bandwidth and propagation delay, and those two
      cannot be measured simultaneously.
      
      While it is impossible to disambiguate any single bandwidth or RTT
      measurement, a connection's behavior over time tells a clearer
      story. BBR uses a measurement strategy designed to resolve this
      ambiguity. It combines these measurements with a robust servo loop
      using recent control systems advances to implement a distributed
      congestion control algorithm that reacts to actual congestion, not
      packet loss or transient queue delay, and is designed to converge with
      high probability to a point near the optimal operating point.
      
      In a nutshell, BBR creates an explicit model of the network pipe by
      sequentially probing the bottleneck bandwidth and RTT. On the arrival
      of each ACK, BBR derives the current delivery rate of the last round
      trip, and feeds it through a windowed max-filter to estimate the
      bottleneck bandwidth. Conversely it uses a windowed min-filter to
      estimate the round trip propagation delay. The max-filtered bandwidth
      and min-filtered RTT estimates form BBR's model of the network pipe.
      
      Using its model, BBR sets control parameters to govern sending
      behavior. The primary control is the pacing rate: BBR applies a gain
      multiplier to transmit faster or slower than the observed bottleneck
      bandwidth. The conventional congestion window (cwnd) is now the
      secondary control; the cwnd is set to a small multiple of the
      estimated BDP (bandwidth-delay product) in order to allow full
      utilization and bandwidth probing while bounding the potential amount
      of queue at the bottleneck.
      
      When a BBR connection starts, it enters STARTUP mode and applies a
      high gain to perform an exponential search to quickly probe the
      bottleneck bandwidth (doubling its sending rate each round trip, like
      slow start). However, instead of continuing until it fills up the
      buffer (i.e. a loss), or until delay or ACK spacing reaches some
      threshold (like Hystart), it uses its model of the pipe to estimate
      when that pipe is full: it estimates the pipe is full when it notices
      the estimated bandwidth has stopped growing. At that point it exits
      STARTUP and enters DRAIN mode, where it reduces its pacing rate to
      drain the queue it estimates it has created.
      
      Then BBR enters steady state. In steady state, PROBE_BW mode cycles
      between first pacing faster to probe for more bandwidth, then pacing
      slower to drain any queue that was created if no more bandwidth was
      available, and then cruising at the estimated bandwidth to utilize the
      pipe without creating excess queue. Occasionally, on an as-needed
      basis, it sends significantly slower to probe for RTT (PROBE_RTT
      mode).
      
      BBR has been fully deployed on Google's wide-area backbone networks
      and we're experimenting with BBR on Google.com and YouTube on a global
      scale.  Replacing CUBIC with BBR has resulted in significant
      improvements in network latency and application (RPC, browser, and
      video) metrics. For more details please refer to our upcoming ACM
      Queue publication.
      
      Example performance results, to illustrate the difference between BBR
      and CUBIC:
      
      Resilience to random loss (e.g. from shallow buffers):
        Consider a netperf TCP_STREAM test lasting 30 secs on an emulated
        path with a 10Gbps bottleneck, 100ms RTT, and 1% packet loss
        rate. CUBIC gets 3.27 Mbps, and BBR gets 9150 Mbps (2798x higher).
      
      Low latency with the bloated buffers common in today's last-mile links:
        Consider a netperf TCP_STREAM test lasting 120 secs on an emulated
        path with a 10Mbps bottleneck, 40ms RTT, and 1000-packet bottleneck
        buffer. Both fully utilize the bottleneck bandwidth, but BBR
        achieves this with a median RTT 25x lower (43 ms instead of 1.09
        secs).
      
      Our long-term goal is to improve the congestion control algorithms
      used on the Internet. We are hopeful that BBR can help advance the
      efforts toward this goal, and motivate the community to do further
      research.
      
      Test results, performance evaluations, feedback, and BBR-related
      discussions are very welcome in the public e-mail list for BBR:
      
        https://groups.google.com/forum/#!forum/bbr-dev
      
      NOTE: BBR *must* be used with the fq qdisc ("man tc-fq") with pacing
      enabled, since pacing is integral to the BBR design and
      implementation. BBR without pacing would not function properly, and
      may incur unnecessarily high packet loss rates.
      Signed-off-by: Van Jacobson <vanj@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0f8782ea