1. 06 Aug 2014, 5 commits
    • net-timestamp: SCHED timestamp on entering packet scheduler · e7fd2885
      Committed by Willem de Bruijn
      Kernel transmit latency is often incurred in the packet scheduler.
      Introduce a new timestamp on transmission just before entering the
      scheduler. When data travels through multiple devices (bonding,
      tunneling, ...) each device will export an individual timestamp.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e7fd2885
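      The new stamp is requested with the SOF_TIMESTAMPING_TX_SCHED flag added
      by this series. Below is a minimal userspace sketch (not part of the
      commit, and assuming headers new enough to carry the flag) of a sender
      enabling it alongside the existing software tx timestamp; the fallback
      #define is only a guard for older libc headers:

      #include <stdio.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      #include <linux/net_tstamp.h>

      #ifndef SO_TIMESTAMPING
      #define SO_TIMESTAMPING 37      /* value on most architectures */
      #endif

      int main(void)
      {
              int fd = socket(AF_INET, SOCK_DGRAM, 0);
              unsigned int val = SOF_TIMESTAMPING_TX_SCHED |  /* new: stamp on entering the qdisc */
                      SOF_TIMESTAMPING_TX_SOFTWARE |          /* existing: stamp at driver transmit */
                      SOF_TIMESTAMPING_SOFTWARE;              /* report software stamps */

              if (fd < 0)
                      return 1;
              if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val)))
                      perror("setsockopt SO_TIMESTAMPING");

              /* send() as usual; the stamps come back on the socket error queue */
              return 0;
      }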
    • net-timestamp: add key to disambiguate concurrent datagrams · 09c2d251
      Committed by Willem de Bruijn
      Datagrams timestamped on transmission can coexist in the kernel stack
      and be reordered in packet scheduling. When reading looped datagrams
      from the socket error queue it is not always possible to uniquely
      correlate looped data with the original send() call (for application-level
      retransmits). Even if possible, it may be expensive and complex,
      requiring packet inspection.
      
      Introduce a data-independent ID mechanism to associate timestamps with
      send calls. Pass an ID alongside the timestamp in field ee_data of
      sock_extended_err.
      
      The ID is a simple 32 bit unsigned int that is associated with the
      socket and incremented on each send() call for which software tx
      timestamp generation is enabled.
      
      The feature is enabled only if SOF_TIMESTAMPING_OPT_ID is set, to
      avoid changing ee_data for existing applications that expect it to be 0.
      The counter is reset each time the flag is reenabled. Reenabling
      does not change the ID of already submitted data. It is possible
      to receive out of order IDs if the timestamp stream is not quiesced
      first.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09c2d251
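      A sketch (not from the patch) of how an application that set
      SOF_TIMESTAMPING_OPT_ID might read the ID back. Each software tx
      timestamp is delivered on the socket error queue as a sock_extended_err
      with origin SO_EE_ORIGIN_TIMESTAMPING, and ee_data carries the counter
      for the matching send() call. The fragment assumes an IPv4 socket that
      already has SO_TIMESTAMPING enabled and trims error handling:

      #include <stdio.h>
      #include <string.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      #include <linux/errqueue.h>

      static void read_one_tx_timestamp_id(int fd)
      {
              char data[256], ctrl[512];
              struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
              struct msghdr msg = {
                      .msg_iov = &iov, .msg_iovlen = 1,
                      .msg_control = ctrl, .msg_controllen = sizeof(ctrl),
              };
              struct sock_extended_err serr;
              struct cmsghdr *cm;

              if (recvmsg(fd, &msg, MSG_ERRQUEUE) < 0)
                      return;

              for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) {
                      if (cm->cmsg_level != SOL_IP || cm->cmsg_type != IP_RECVERR)
                              continue;

                      memcpy(&serr, CMSG_DATA(cm), sizeof(serr));
                      if (serr.ee_origin == SO_EE_ORIGIN_TIMESTAMPING)
                              printf("tx timestamp for send #%u\n", serr.ee_data);
              }
      }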
    • net-timestamp: move timestamp flags out of sk_flags · b9f40e21
      Committed by Willem de Bruijn
      sk_flags is reaching its limit. New timestamping options will not fit.
      Move all of them into a new field sk->sk_tsflags.
      
      An added benefit is that this removes boilerplate code to convert between
      SOF_TIMESTAMPING_.. and SOCK_TIMESTAMPING_.. in getsockopt/setsockopt.
      
      SOCK_TIMESTAMPING_RX_SOFTWARE is also used to toggle the receive
      timestamp logic (netstamp_needed). That can be simplified and this
      last key removed, but I will leave that for a separate patch.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      
      ----
      
      The u16 in sock can be moved into a 16-bit hole below sk_gso_max_segs,
      though that scatters tstamp fields throughout the struct.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b9f40e21
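      Conceptually (a simplified before/after fragment, not the actual diff),
      the setsockopt path no longer has to translate every SOF_TIMESTAMPING_*
      bit into a matching SOCK_* flag inside sk_flags; it can keep the
      user-supplied word in the new field:

              /* before: one sk_flags bit per option, toggled individually */
              if (val & SOF_TIMESTAMPING_TX_SOFTWARE)
                      sock_set_flag(sk, SOCK_TIMESTAMPING_TX_SOFTWARE);
              else
                      sock_reset_flag(sk, SOCK_TIMESTAMPING_TX_SOFTWARE);
              /* ... repeated for every SOF_TIMESTAMPING_* flag ... */

              /* after: the whole flag word lives in its own field */
              sk->sk_tsflags = val;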
    • net-timestamp: extend SCM_TIMESTAMPING ancillary data struct · f24b9be5
      Committed by Willem de Bruijn
      Applications that request kernel tx timestamps with SO_TIMESTAMPING
      read timestamps as recvmsg() ancillary data. The response is defined
      implicitly as timespec[3].
      
      1) define struct scm_timestamping explicitly and
      
      2) add support for new tstamp types. On tx, scm_timestamping always
         accompanies a sock_extended_err. Define previously unused field
         ee_info to signal the type of ts[0]. Introduce SCM_TSTAMP_SND to
         define the existing behavior.
      
      The reception path is not modified. On rx, no struct similar to
      sock_extended_err is passed along with SCM_TIMESTAMPING.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f24b9be5
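      A userspace sketch of consuming the now-explicit struct from recvmsg()
      ancillary data. struct scm_timestamping comes from <linux/errqueue.h>
      on kernels with this change, and the fallback defines are only guards
      for older socket headers; the software stamp sits in ts[0] and the
      hardware stamp in ts[2]:

      #include <stdio.h>
      #include <string.h>
      #include <sys/socket.h>
      #include <linux/errqueue.h>

      #ifndef SO_TIMESTAMPING
      #define SO_TIMESTAMPING 37      /* value on most architectures */
      #endif
      #ifndef SCM_TIMESTAMPING
      #define SCM_TIMESTAMPING SO_TIMESTAMPING
      #endif

      static void print_timestamps(struct msghdr *msg)
      {
              struct scm_timestamping tss;
              struct cmsghdr *cm;

              for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
                      if (cm->cmsg_level != SOL_SOCKET ||
                          cm->cmsg_type != SCM_TIMESTAMPING)
                              continue;

                      memcpy(&tss, CMSG_DATA(cm), sizeof(tss));
                      /* ts[0]: software, ts[1]: legacy, ts[2]: hardware */
                      printf("sw %lld.%09ld hw %lld.%09ld\n",
                             (long long)tss.ts[0].tv_sec, tss.ts[0].tv_nsec,
                             (long long)tss.ts[2].tv_sec, tss.ts[2].tv_nsec);
              }
      }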
    • tcp: reduce spurious retransmits due to transient SACK reneging · 5ae344c9
      Committed by Neal Cardwell
      This commit reduces spurious retransmits due to apparent SACK reneging
      by only reacting to SACK reneging that persists for a short delay.
      
      When a sequence space hole at snd_una is filled, some TCP receivers
      send a series of ACKs as they apparently scan their out-of-order queue
      and cumulatively ACK all the packets that have now been consecutively
      received. This is essentially misbehavior B in "Misbehaviors in TCP
      SACK generation", ACM SIGCOMM Computer Communication Review, April
      2011, so we suspect that this is from several common OSes (Windows
      2000, Windows Server 2003, Windows XP). However, this issue has also
      been seen in other cases, e.g. the netdev thread "TCP being hoodwinked
      into spurious retransmissions by lack of timestamps?" from March 2014,
      where the receiver was thought to be a BSD box.
      
      Since snd_una would temporarily be adjacent to a previously SACKed
      range in these scenarios, this receiver behavior triggered the Linux
      SACK reneging code path in the sender. This led the sender to clear
      the SACK scoreboard, enter CA_Loss, and spuriously retransmit
      (potentially) every packet from the entire write queue at line rate
      just a few milliseconds before the ACK for each packet arrives at the
      sender.
      
      To avoid such situations, now when a sender sees apparent reneging it
      does not yet retransmit, but rather adjusts the RTO timer to give the
      receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs
      that will restore sanity to the SACK scoreboard. If the reneging
      persists until this RTO then, as before, we clear the SACK scoreboard
      and enter CA_Loss.
      
      A 10ms delay tolerates a receiver sending such a stream of ACKs at
      56Kbit/sec. And to allow for receivers with slower or more congested
      paths, we wait for at least RTT/2.
      
      We validated the resulting max(RTT/2, 10ms) delay formula with a mix
      of North American and South American Google web server traffic, and
      found that for ACKs displaying transient reneging:
      
       (1) 90% of inter-ACK delays were less than 10ms
       (2) 99% of inter-ACK delays were less than RTT/2
      
      In tests on Google web servers this commit reduced reneging events by
      75%-90% (as measured by the TcpExtTCPSACKReneging counter), without
      any measurable impact on latency for user HTTP and SPDY requests.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5ae344c9
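      The delay formula is simple enough to show as a worked example
      (illustrative C, not the kernel code): half the smoothed RTT with a
      10 ms floor.

      #include <stdio.h>

      static unsigned int reneging_delay_ms(unsigned int rtt_ms)
      {
              unsigned int half_rtt = rtt_ms / 2;

              return half_rtt > 10 ? half_rtt : 10;   /* max(RTT/2, 10ms) */
      }

      int main(void)
      {
              /* a 4 ms LAN RTT still gets the 10 ms floor, while a 300 ms
                 intercontinental path waits 150 ms for the ACK stream */
              printf("%u %u\n", reneging_delay_ms(4), reneging_delay_ms(300));
              return 0;
      }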
  2. 05 Aug 2014, 3 commits
  3. 03 Aug 2014, 13 commits
  4. 01 Aug 2014, 7 commits
  5. 31 Jul 2014, 12 commits
    • net: filter: don't release unattached filter through call_rcu() · 34c5bd66
      Committed by Pablo Neira
      sk_unattached_filter_destroy() does not always need to release the
      filter object via rcu. Since this filter is never attached to the
      socket, the caller should be responsible for releasing the filter
      in a safe way, which may not necessarily imply rcu.
      
      This is a short summary of clients of this function:
      
      1) xt_bpf.c and cls_bpf.c use the bpf matchers from rules; these rules
         are removed from the packet path before the filter is released. Thus,
         the framework makes sure the filter is safely removed.
      
      2) In the ppp driver, the ppp_lock ensures serialization between the
         xmit and filter attachment/detachment path. This doesn't use rcu
         so deferred release via rcu makes no sense.
      
      3) In the isdn/ppp driver, it is called from isdn_ppp_release() and
         isdn_ppp_ioctl(). This driver uses mutexes and spinlocks, no rcu.
         Thus, deferred rcu makes no sense to me either; the deferred releases
         may just be masking the effects of a wrong locking strategy, which
         should be fixed in the driver itself.
      
      4) In the team driver, this is the only place where the rcu
         synchronization with an unattached filter is used. Therefore, this
         patch introduces synchronize_rcu() which is called from the
         genetlink path to make sure the filter doesn't go away while packets
         are still walking over it. I think we can revisit this once struct
         bpf_prog (that only wraps specific bpf code bits) is in place, then
         add some specific struct rcu_head in the scope of the team driver if
         Jiri thinks this is needed.
      
      Deferred rcu release for unattached filters was originally introduced
      in 302d6637 ("filter: Allow to create sk-unattached filters").
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      34c5bd66
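      The underlying pattern, shown as a conceptual kernel-style sketch with
      a hypothetical struct (not the filter code itself): call_rcu() is only
      needed when RCU readers may still hold a pointer to the object; one
      that was never published to readers, or whose readers were flushed
      with synchronize_rcu(), can be freed directly.

      #include <linux/kernel.h>
      #include <linux/slab.h>
      #include <linux/rcupdate.h>

      struct my_prog {
              struct rcu_head rcu;
              /* ... program state ... */
      };

      static void my_prog_free_rcu(struct rcu_head *head)
      {
              kfree(container_of(head, struct my_prog, rcu));
      }

      /* Attached case: readers under rcu_read_lock() may still see the
         object, so defer the free past the grace period. */
      static void my_prog_release_deferred(struct my_prog *p)
      {
              call_rcu(&p->rcu, my_prog_free_rcu);
      }

      /* Unattached case: never visible to RCU readers, so the caller may
         free directly, optionally after synchronize_rcu() if packets might
         still be walking the program (the team driver situation). */
      static void my_prog_release_direct(struct my_prog *p)
      {
              kfree(p);
      }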
    • net: Remove unlikely() for WARN_ON() conditions · 80019d31
      Committed by Thomas Graf
      No need for the unlikely(); WARN_ON() and BUG_ON() internally use
      unlikely() on the condition.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      80019d31
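      For illustration, the before/after shape of the cleanup with a
      hypothetical condition; WARN_ON() already wraps its argument in
      unlikely(), so the outer annotation adds nothing:

              /* before */
              if (unlikely(WARN_ON(!skb)))
                      return -EINVAL;

              /* after: WARN_ON() itself contains unlikely() */
              if (WARN_ON(!skb))
                      return -EINVAL;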
    • dcbnl : Fix misleading dcb_app->priority explanation · 16eecd9b
      Committed by Anish Bhatt
      The current explanation of dcb_app->priority is wrong. It says priority is
      expected to be a 3-bit unsigned integer, which is only true when working
      with DCBx-IEEE. Use of dcb_app->priority by DCBx-CEE expects it to be an
      802.1p user priority bitmap. Updated accordingly.
      
      This affects the cxgb4 driver, but I will post those changes as part of a
      larger changeset shortly.
      
      Fixes: 3e29027a ("dcbnl: add support for ieee8021Qaz attributes")
      Signed-off-by: Anish Bhatt <anish@chelsio.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      16eecd9b
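      A small illustrative helper (hypothetical, not part of the patch) that
      makes the distinction concrete: under DCBx-IEEE the field holds a
      priority number 0-7, while DCBx-CEE treats it as an 802.1p user
      priority bitmap, so converting from the former to the latter is a
      single shift.

      #include <stdio.h>
      #include <stdint.h>

      /* Map an IEEE-style priority value (0-7) to the CEE-style
         802.1p user priority bitmap representation. */
      static uint8_t prio_to_cee_bitmap(uint8_t prio)
      {
              return (uint8_t)(1u << (prio & 0x7));
      }

      int main(void)
      {
              /* priority 5 under IEEE corresponds to bitmap 0x20 under CEE */
              printf("0x%02x\n", prio_to_cee_bitmap(5));
              return 0;
      }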
    • Bluetooth: Always use non-bonding requirement when not bondable · 82c295b1
      Committed by Johan Hedberg
      When we're not bondable we should never send any other SSP
      authentication requirement besides one of the non-bonding ones.
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      82c295b1
    • Bluetooth: Rename pairable mgmt setting to bondable · b2939475
      Committed by Johan Hedberg
      This setting maps to the HCI_BONDABLE flag which tracks whether we're
      bondable or not. Therefore, rename the mgmt setting and respective
      command accordingly.
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      b2939475
    • Bluetooth: Rename HCI_PAIRABLE to HCI_BONDABLE · b6ae8457
      Committed by Johan Hedberg
      The HCI_PAIRABLE flag isn't actually controlling whether we're pairable
      but whether we're bondable. Therefore, rename it accordingly.
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      b6ae8457
    • Bluetooth: Fix sparse warning from HID new leds handling · bdb94346
      Committed by Marcel Holtmann
      The new leds bit handling produces this sparse warning.
      
        CHECK   net/bluetooth/hidp/core.c
      net/bluetooth/hidp/core.c:156:60: warning: dubious: x | !y
      
      Just fix it by doing an explicit x << 0 shift operation.
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      bdb94346
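      The warning class, illustrated with hypothetical variables rather than
      the hidp code: sparse flags a logical ! used as an operand of a bitwise
      | because it usually hints at a missing shift. Writing the bit-0 shift
      explicitly keeps the value identical and silences the check.

              /* sparse: warning: dubious: x | !y */
              leds = (capslock << 1) | !!numlock;

              /* same value, explicit shift for bit 0, no warning */
              leds = (capslock << 1) | (!!numlock << 0);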
    • Bluetooth: Fix check for connected state when pairing · 6f78fd4b
      Committed by Johan Hedberg
      Both BT_CONNECTED and BT_CONFIG state mean that we have a baseband link
      available. We should therefore check for either of these when pairing
      and deciding whether to call hci_conn_security() directly.
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      6f78fd4b
    • 6lowpan: iphc: Fix parenthesis alignments which are off by one · 3fa71fe0
      Committed by Marcel Holtmann
      CHECK: Alignment should match open parenthesis
      +	if (((hdr->flow_lbl[0] & 0x0F) == 0) &&
      +	     (hdr->flow_lbl[1] == 0) && (hdr->flow_lbl[2] == 0)) {
      
      CHECK: Alignment should match open parenthesis
      +		if ((hdr->priority == 0) &&
      +		   ((hdr->flow_lbl[0] & 0xF0) == 0)) {
      
      CHECK: Alignment should match open parenthesis
      +		if ((hdr->priority == 0) &&
      +		   ((hdr->flow_lbl[0] & 0xF0) == 0)) {
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      3fa71fe0
    • 6lowpan: iphc: Fix missing braces for if statement · 9ab9bb00
      Committed by Marcel Holtmann
      CHECK: braces {} should be used on all arms of this statement
      +	if ((iphc0 & 0x03) != LOWPAN_IPHC_TTL_I)
      [...]
      +	else {
      [...]
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      9ab9bb00
    • 6lowpan: iphc: Fix missing blank line after variable declarations · 26fff593
      Committed by Marcel Holtmann
      WARNING: Missing a blank line after declarations
      +		struct sk_buff *new;
      +		if (uncompress_udp_header(skb, &uh))
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      26fff593
    • 6lowpan: iphc: Fix issues with alignment matching open parenthesis · 7fc4cfda
      Committed by Marcel Holtmann
      This patch fixes all the issues with alignment matching of open
      parenthesis found by checkpatch.pl and makes them follow the
      network coding style now.
      
      CHECK: Alignment should match open parenthesis
      +static int uncompress_addr(struct sk_buff *skb,
      +				struct in6_addr *ipaddr, const u8 address_mode,
      
      CHECK: Alignment should match open parenthesis
      +static int uncompress_context_based_src_addr(struct sk_buff *skb,
      +						struct in6_addr *ipaddr,
      
      CHECK: Alignment should match open parenthesis
      +static int skb_deliver(struct sk_buff *skb, struct ipv6hdr *hdr,
      +		struct net_device *dev, skb_delivery_cb deliver_skb)
      
      CHECK: Alignment should match open parenthesis
      +	new = skb_copy_expand(skb, sizeof(struct ipv6hdr), skb_tailroom(skb),
      +								GFP_ATOMIC);
      
      CHECK: Alignment should match open parenthesis
      +	raw_dump_table(__func__, "raw skb data dump before receiving",
      +			new->data, new->len);
      
      CHECK: Alignment should match open parenthesis
      +lowpan_uncompress_multicast_daddr(struct sk_buff *skb,
      +		struct in6_addr *ipaddr,
      
      CHECK: Alignment should match open parenthesis
      +	raw_dump_inline(NULL, "Reconstructed ipv6 multicast addr is",
      +				ipaddr->s6_addr, 16);
      
      CHECK: Alignment should match open parenthesis
      +int lowpan_process_data(struct sk_buff *skb, struct net_device *dev,
      +		const u8 *saddr, const u8 saddr_type, const u8 saddr_len,
      
      CHECK: Alignment should match open parenthesis
      +	raw_dump_table(__func__, "raw skb data dump uncompressed",
      +				skb->data, skb->len);
      
      CHECK: Alignment should match open parenthesis
      +		err = uncompress_addr(skb, &hdr.saddr, tmp, saddr,
      +					saddr_type, saddr_len);
      
      CHECK: Alignment should match open parenthesis
      +		err = uncompress_addr(skb, &hdr.daddr, tmp, daddr,
      +					daddr_type, daddr_len);
      
      CHECK: Alignment should match open parenthesis
      +		pr_debug("dest: stateless compression mode %d dest %pI6c\n",
      +			tmp, &hdr.daddr);
      
      CHECK: Alignment should match open parenthesis
      +		raw_dump_table(__func__, "raw UDP header dump",
      +				      (u8 *)&uh, sizeof(uh));
      
      CHECK: Alignment should match open parenthesis
      +	raw_dump_table(__func__, "raw header dump", (u8 *)&hdr,
      +							sizeof(hdr));
      
      CHECK: Alignment should match open parenthesis
      +int lowpan_header_compress(struct sk_buff *skb, struct net_device *dev,
      +			unsigned short type, const void *_daddr,
      
      CHECK: Alignment should match open parenthesis
      +	raw_dump_table(__func__, "raw skb network header dump",
      +		skb_network_header(skb), sizeof(struct ipv6hdr));
      
      CHECK: Alignment should match open parenthesis
      +	raw_dump_table(__func__,
      +			"sending raw skb network uncompressed packet",
      
      CHECK: Alignment should match open parenthesis
      +	if (((hdr->flow_lbl[0] & 0x0F) == 0) &&
      +	     (hdr->flow_lbl[1] == 0) && (hdr->flow_lbl[2] == 0)) {
      
      WARNING: quoted string split across lines
      +			pr_debug("dest address unicast link-local %pI6c "
      +				"iphc1 0x%02x\n", &hdr->daddr, iphc1);
      
      CHECK: Alignment should match open parenthesis
      +	raw_dump_table(__func__, "raw skb data dump compressed",
      +				skb->data, skb->len);
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      7fc4cfda