1. 31 Jul, 2012 1 commit
    • mac80211: add PS flag to bss_conf · ab095877
      Eliad Peller committed
      Currently, ps mode is indicated per device (rather than
      per interface), which doesn't make a lot of sense.
      
      Moreover, there are subtle bugs caused by the inability
      to indicate ps change along with other changes
      (e.g. when the AP deauths us, we'd like to indicate
      CHANGED_PS | CHANGED_ASSOC, as changing PS before
      notifying about disassociation will result in null-packets
      being sent (if IEEE80211_HW_SUPPORTS_DYNAMIC_PS) while
      the sta is already disconnected.)
      
      Keep the current per-device notifications, and add
      parallel per-vif notifications.
      
      In order to keep it simple, the per-device ps and
      the per-vif ps are orthogonal - the per-vif ps
      configuration is determined only by the user
      configuration (enable/disable) and the connection
      state, and is not affected by other vifs' state and
      (temporary) dynamic_ps/offchannel operations
      (unlike per-device ps).
      Signed-off-by: Eliad Peller <eliad@wizery.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  2. 27 Jul, 2012 2 commits
  3. 24 Jul, 2012 2 commits
    • ipv4: Change rt->rt_iif encoding. · 13378cad
      David S. Miller committed
      On input packet processing, rt->rt_iif will be zero if we should
      use skb->dev->ifindex.
      
      Since we access rt->rt_iif consistently via inet_iif(), that is
      the only spot whose interpretation has to be adjusted.
      Signed-off-by: David S. Miller <davem@davemloft.net>
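The encoding described in this commit can be modeled in a few lines. This is a minimal sketch, not the kernel code: the `NetDevice`, `Route`, and `Skb` classes are hypothetical stand-ins for the kernel structures, and only the rule stated in the message (a zero `rt_iif` means "use `skb->dev->ifindex`") is taken from the source.

```python
from dataclasses import dataclass


@dataclass
class NetDevice:
    ifindex: int  # index of the device the packet arrived on


@dataclass
class Route:
    rt_iif: int  # 0 means "use the SKB device index"


@dataclass
class Skb:
    dev: NetDevice
    rt: Route


def inet_iif(skb: Skb) -> int:
    """Return the input interface index under the new rt_iif encoding."""
    if skb.rt.rt_iif != 0:
        # An explicit input interface was recorded on the route.
        return skb.rt.rt_iif
    # Zero is the "use the SKB device index" marker.
    return skb.dev.ifindex


cached = Skb(dev=NetDevice(ifindex=3), rt=Route(rt_iif=0))
explicit = Skb(dev=NetDevice(ifindex=3), rt=Route(rt_iif=7))
```

Because every reader goes through one accessor, a route with `rt_iif == 0` can be shared by packets arriving on different devices, which is what makes the more aggressive caching mentioned in the preparatory commit possible.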
    • ipv4: Prepare for change of rt->rt_iif encoding. · 92101b3b
      David S. Miller committed
      Use inet_iif() consistently, and for TCP record the input interface of
      cached RX dst in inet sock.
      
      rt->rt_iif is going to be encoded differently, so that we can
      legitimately cache input routes in the FIB info more aggressively.
      
      When the input interface is "use SKB device index" the rt->rt_iif will
      be set to zero.
      
      This forces us to move the TCP RX dst cache installation into the
      ipv4-specific code, which is where it belongs anyway: caching the route
      for ipv6 is pointless at the moment, since it is not yet inspected in
      the ipv6 input paths.
      
      Also, remove the unlikely() on dst->obsolete: all ipv4 dsts have
      obsolete set to a non-zero value to force invocation of the check
      callback.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 23 Jul, 2012 4 commits
    • tcp: dont drop MTU reduction indications · 563d34d0
      Eric Dumazet committed
      ICMP messages generated in the output path when the frame length exceeds
      the MTU are actually lost, because the socket is owned by the user (doing
      the xmit).
      
      One example is the ipgre_tunnel_xmit() calling
      icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
      
      We had a similar case fixed in commit a34a101e (ipv6: disable GSO on
      sockets hitting dst_allfrag).
      
      The problem with that fix is that it relied on retransmit timers, so short
      TCP sessions paid too large a latency penalty.
      
      This patch uses the tcp_release_cb() infrastructure so that MTU
      reduction messages (ICMP messages) are not lost, and no extra delay
      is added in TCP transmits.
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Diagnosed-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Tore Anderson <tore@fud.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
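The deferral idea in this commit can be illustrated with a toy model. This is only a sketch under assumptions: the `Sock` class and its method names are hypothetical, and it does not reproduce the kernel's tcp_release_cb() code. It shows just the principle that an MTU reduction arriving while the socket is owned by the user is remembered and applied at release time instead of being dropped.

```python
class Sock:
    """Toy socket: MTU reductions are deferred while the user owns the lock."""

    def __init__(self, mtu: int):
        self.mtu = mtu
        self.owned_by_user = False
        self.pending_mtu = None  # deferred MTU reduction, if any

    def lock(self) -> None:
        self.owned_by_user = True

    def on_frag_needed(self, new_mtu: int) -> None:
        """Handle an ICMP 'fragmentation needed' indication."""
        if self.owned_by_user:
            # Cannot touch socket state now: remember the smallest value seen.
            if self.pending_mtu is None or new_mtu < self.pending_mtu:
                self.pending_mtu = new_mtu
        else:
            self._reduce_mtu(new_mtu)

    def release(self) -> None:
        """Drop ownership and run the deferred work (the release callback)."""
        self.owned_by_user = False
        if self.pending_mtu is not None:
            new_mtu, self.pending_mtu = self.pending_mtu, None
            self._reduce_mtu(new_mtu)

    def _reduce_mtu(self, new_mtu: int) -> None:
        # An MTU indication may only lower the current value.
        self.mtu = min(self.mtu, new_mtu)
```

In this model the reduction takes effect as soon as the sender releases the socket, so no retransmit timer has to fire before the new MTU is used.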
    • get rid of ->scm_work_list · 6120d3db
      Al Viro committed
      recursion in __scm_destroy() will be cut by delaying final fput()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • net: netprio_cgroup: rework update socket logic · 406a3c63
      John Fastabend committed
      Instead of updating the sk_cgrp_prioidx struct field on every send,
      this patch only updates the field when a task is moved via the cgroup
      infrastructure.
      
      This allows sockets that may be used by a kernel worker thread
      to be managed. For example, in the iscsi case today a user can
      put iscsid in a netprio cgroup, and control traffic will be sent
      with the correct sk_cgrp_prioidx value set. But as soon as data
      is sent, the kernel worker thread issues a send and sk_cgrp_prioidx
      is overwritten with the kernel worker thread's value, which is the
      default.
      
      It seems more correct to only update the field when the user
      explicitly sets it via control group infrastructure. This allows
      the users to manage sockets that may be used with other threads.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: Implement quick failover draft from tsvwg · 5aa93bcf
      Neil Horman committed
      I've seen several recent attempts to do quick failover of sctp transports
      by reducing various retransmit timers and counters.  While it's possible to
      implement a faster failover on multihomed sctp associations that way, it's not
      particularly robust, in that it can lead to unneeded retransmits, as well as
      false connection failures due to intermittent latency on a network.
      
      Instead, let's implement the new IETF quick failover draft found here:
      http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05
      
      This will let the sctp stack quickly identify transports that have had a small
      number of errors, and avoid using them until their reliability can be
      re-established.  I've tested this out on two virt guests connected via multiple
      isolated virt networks and believe it's in compliance with the above draft and
      works well.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: Sridhar Samudrala <sri@us.ibm.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: linux-sctp@vger.kernel.org
      CC: joe@perches.com
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 21 Jul, 2012 20 commits
  6. 20 Jul, 2012 9 commits
    • net-tcp: Fast Open client - cookie-less mode · 67da22d2
      Yuchung Cheng committed
      In trusted networks, e.g., intranets and data centers, the client does not
      need to use a Fast Open cookie to mitigate DoS attacks. In cookie-less
      mode, sendmsg() with the MSG_FASTOPEN flag will send SYN-data regardless
      of cookie availability.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-tcp: Fast Open client - detecting SYN-data drops · aab48743
      Yuchung Cheng committed
      On paths with firewalls dropping SYNs with data or experimental TCP options,
      Fast Open connections will experience SYN timeouts and bad performance.
      The solution is to track such incidents in the cookie cache and disable
      Fast Open temporarily.
      
      Since only the original SYN includes data and/or the Fast Open option, the
      SYN-ACK carries tell-tale signs (see tcp_rcv_fastopen_synack()) that reveal
      such drops. If a path has recurring Fast Open SYN drops, Fast Open is
      disabled for 2^(recurring_losses) minutes, starting from four minutes and
      going up to roughly one and a half days. sendmsg() with the MSG_FASTOPEN flag
      will still succeed, but it behaves as connect() followed by write().
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
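The backoff arithmetic in this commit message can be worked through as follows. This is a sketch, not the kernel logic: the function name is hypothetical, the lower bound of two losses is inferred from "starting from four minutes", and the cap at an exponent of 11 is an assumption chosen so that the maximum (2048 minutes, about 1.4 days) matches "roughly one and a half days".

```python
MIN_LOSSES = 2     # assumption: 2**2 = 4 minutes is the first backoff step
MAX_EXPONENT = 11  # assumption: 2**11 = 2048 minutes, about 1.4 days


def fastopen_disabled_minutes(recurring_losses: int) -> int:
    """Minutes for which Fast Open stays disabled on a path."""
    if recurring_losses < MIN_LOSSES:
        # Not enough recurring SYN-data drops to disable Fast Open.
        return 0
    return 2 ** min(recurring_losses, MAX_EXPONENT)


# Backoff schedule for the first few recurring losses.
schedule = [fastopen_disabled_minutes(n) for n in range(0, 6)]
```

Under these assumptions the schedule doubles with each further loss (4, 8, 16, ... minutes) until it reaches the cap.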
    • net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN) · cf60af03
      Yuchung Cheng committed
      sendmsg() (or sendto()) with MSG_FASTOPEN is a combo of connect(2)
      and write(2). The application should replace connect() with it to
      send data in the opening SYN packet.
      
      For a blocking socket, sendmsg() blocks until all the data is buffered
      locally and the handshake is completed, like a connect() call. It
      returns an errno similar to connect() if the TCP handshake fails.
      
      For a non-blocking socket, it returns the number of bytes queued (and
      transmitted in the SYN-data packet) if a cookie is available. If a cookie
      is not available, it transmits a data-less SYN packet with the Fast Open
      cookie request option and returns -EINPROGRESS like connect().
      
      Using MSG_FASTOPEN on a connecting or connected socket will result in
      an errno similar to that of repeated connect() calls. Therefore the
      application should only use this flag on new sockets.
      
      The buffer size of sendmsg() is independent of the MSS of the connection.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
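From user space, the replacement of connect() described above might look like the following sketch. It assumes a Linux host with client-side Fast Open enabled. The helper name `fastopen_send` is hypothetical, and the fallback value 0x20000000 is the Linux definition of MSG_FASTOPEN, used only if the Python build does not expose the constant.

```python
import socket

# Linux flag value; used as a fallback if this Python build lacks the constant.
MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)


def fastopen_send(host: str, port: int, payload: bytes):
    """Open a TCP connection and queue payload in one call.

    Replaces connect() + send(): sendto() with MSG_FASTOPEN on a new,
    unconnected socket performs the handshake and buffers the data.
    Returns the socket and the number of bytes queued.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Blocking socket: returns after the data is buffered and the
        # handshake is complete, or raises OSError as connect() would.
        sent = sock.sendto(payload, MSG_FASTOPEN, (host, port))
    except OSError:
        sock.close()
        raise
    return sock, sent
```

A caller would use it as `sock, sent = fastopen_send("192.0.2.1", 80, request)` (the address is a documentation placeholder) and then continue with ordinary `recv()` and `send()` calls on the returned socket. As the commit message notes, the flag should only be used on a new socket.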
    • net-tcp: Fast Open client - sending SYN-data · 783237e8
      Yuchung Cheng committed
      This patch implements sending SYN-data in tcp_connect(). The data is
      from tcp_sendmsg() with flag MSG_FASTOPEN (implemented in a later patch).
      
      The length of the cookie in tcp_fastopen_req, init'd to 0, controls the
      type of the SYN. If the cookie is not cached (len == 0), the host sends a
      data-less SYN with the Fast Open cookie request option to solicit a cookie
      from the remote. If the cookie is available (len > 0), the host sends
      a SYN-data with the Fast Open cookie option. If the cookie length is negative,
      the SYN will not include any Fast Open option (for fallback operations).
      
      To deal with middleboxes that may drop SYN with data or experimental TCP
      option, the SYN-data is only sent once. SYN retransmits do not include
      data or Fast Open options. The connection will fall back to regular TCP
      handshake.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
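The three cases keyed on the cookie length can be summarized as a small decision function. This is an illustrative sketch only: the function name and the returned labels are hypothetical, and the mapping follows the natural reading of the message, in which a positive length means a cached cookie is available.

```python
def syn_kind(cookie_len: int) -> str:
    """Classify the SYN to send from the cached cookie length."""
    if cookie_len < 0:
        # Fallback mode: behave like a regular TCP handshake.
        return "plain SYN, no Fast Open option"
    if cookie_len == 0:
        # No cookie cached yet: ask the remote for one, send no data.
        return "data-less SYN with cookie request option"
    # A cookie is cached: data may ride along in the SYN.
    return "SYN-data with cookie option"
```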
    • net-tcp: Fast Open client - cookie cache · 1fe4c481
      Yuchung Cheng committed
      With help from Eric Dumazet, add Fast Open metrics in tcp metrics cache.
      The basic ones are the MSS and the cookies. A later patch will cache more
      to handle unfriendly middleboxes.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-tcp: Fast Open base · 2100c8d2
      Yuchung Cheng committed
      This patch implements the common code for both the client and server.
      
      1. TCP Fast Open option processing. Since Fast Open does not have an
         option number assigned by IANA yet, it shares the experimental option
         code 254 by implementing draft-ietf-tcpm-experimental-options
         with a 16-bit magic number 0xF989. This enables global experiments
         without clashing over the two scarce experimental options available for TCP.
      
         When the draft status becomes standard (maybe), the client should
         switch to the newly assigned option number while the server supports
         both numbers for the transition.
      
      2. The new sysctl tcp_fastopen
      
      3. A place holder init function
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
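The shared experimental option from item 1 can be sketched as a byte layout. The option code 254 and the magic number 0xF989 come from the commit message. The field order (kind, length, 16-bit magic, then cookie bytes) is an assumption based on the usual TCP option format and the experimental-options draft, and the function and constant names are hypothetical.

```python
import struct

TCPOPT_EXP = 254         # shared experimental option code (from the message)
FASTOPEN_MAGIC = 0xF989  # 16-bit magic number (from the message)
BASE_LEN = 4             # kind + length + 2-byte magic (assumed layout)


def encode_fastopen_option(cookie: bytes = b"") -> bytes:
    """Build the option; an empty cookie stands for a cookie request."""
    length = BASE_LEN + len(cookie)
    return struct.pack("!BBH", TCPOPT_EXP, length, FASTOPEN_MAGIC) + cookie


def decode_fastopen_option(option: bytes):
    """Return the cookie bytes, or None if this is not a Fast Open option."""
    if len(option) < BASE_LEN:
        return None
    kind, length, magic = struct.unpack("!BBH", option[:BASE_LEN])
    if kind != TCPOPT_EXP or magic != FASTOPEN_MAGIC or length != len(option):
        # Another experiment sharing code 254, or a malformed option.
        return None
    return option[BASE_LEN:]
```

Checking the magic number is what lets several experiments share option code 254: an option with a different identifier is simply not treated as Fast Open.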
    • net: Fix warnings in dst_ops.h · d8f1641b
      David S. Miller committed
      include/net/dst_ops.h:28:20: warning: ‘struct sock’ declared inside parameter list
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: tcp: remove per net tcp_sock · be9f4a44
      Eric Dumazet committed
      tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
      per network namespace.
      
      This leads to bad behavior on multiqueue NICs, because many cpus
      contend for the socket lock and, once the socket lock is acquired, extra
      false sharing on various socket fields slows down the operations.
      
      To better resist attacks, we use a percpu socket. Each cpu can
      run without contention, using appropriate memory (local node).
      
      Additional features:
      
      1) We also mirror the queue_mapping of the incoming skb, so that
      answers use the same queue if possible.
      
      2) Setting the SOCK_USE_WRITE_QUEUE socket flag speeds up sock_wfree()
      
      3) We now limit the number of in-flight RST/ACK [1] packets
      per cpu, instead of per namespace, and we honor the sysctl_wmem_default
      limit dynamically. (Prior to this patch, sysctl_wmem_default value was
      copied at boot time, so any further change would not affect tcp_sock
      limit)
      
      [1] These packets are only generated when no socket was matched for
      the incoming packet.
      Reported-by: Bill Sommerfeld <wsommerfeld@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: use seqlock for nh_exceptions · aee06da6
      Julian Anastasov committed
      Use global seqlock for the nh_exceptions. Call
      fnhe_oldest with the right hash chain. Correct the diff
      value for dst_set_expires.
      
      v2: after suggestions from Eric Dumazet:
      * get rid of spin lock fnhe_lock, rearrange update_or_create_fnhe
      * continue daddr search in rt_bind_exception
      
      v3:
      * remove the daddr check before seqlock in rt_bind_exception
      * restart lookup in rt_bind_exception on detected seqlock change,
      as suggested by David Miller
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
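The "restart lookup on detected seqlock change" pattern from the v3 notes can be sketched with a sequence counter. This is a single-threaded model for illustration only: the `SeqCount` class, the `lookup` helper, and the dictionary standing in for the exception hash chain are all hypothetical and ignore the memory-ordering concerns that a real seqlock handles.

```python
class SeqCount:
    """Sequence counter: odd while a write is in progress, even otherwise."""

    def __init__(self):
        self.sequence = 0

    def write_begin(self) -> None:
        self.sequence += 1  # becomes odd

    def write_end(self) -> None:
        self.sequence += 1  # becomes even again

    def read_begin(self) -> int:
        return self.sequence

    def read_retry(self, start: int) -> bool:
        # Retry if a write was in progress at the start or happened since.
        return bool(start & 1) or self.sequence != start


def lookup(lock: SeqCount, table: dict, daddr: str, max_tries: int = 100):
    """Read table[daddr], restarting whenever a concurrent write is detected."""
    for _ in range(max_tries):
        start = lock.read_begin()
        value = table.get(daddr)
        if not lock.read_retry(start):
            return value
    raise RuntimeError("too many concurrent updates")
```

Readers take no lock here: they pay only for two counter reads and occasionally repeat the lookup, while writers bump the counter around each update.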
  7. 19 Jul, 2012 2 commits