1. 04 9月, 2012 1 次提交
    • Y
      tcp: use PRR to reduce cwin in CWR state · 684bad11
      Yuchung Cheng 提交于
      Use proportional rate reduction (PRR) algorithm to reduce cwnd in CWR state,
      in addition to Recovery state. Retire the current rate-halving in CWR.
      When losses are detected via ACKs in CWR state, the sender enters Recovery
      state but the cwnd reduction continues and does not restart.
      
      Rename and refactor cwnd reduction functions since both CWR and Recovery
      use the same algorithm:
      tcp_init_cwnd_reduction() is new and initiates reduction state variables.
      tcp_cwnd_reduction() is previously tcp_update_cwnd_in_recovery().
      tcp_ends_cwnd_reduction() is previously  tcp_complete_cwr().
      
      The rate halving functions and logic such as tcp_cwnd_down(), tcp_min_cwnd(),
      and the cwnd moderation inside tcp_enter_cwr() are removed. The unused
      parameter, flag, in tcp_cwnd_reduction() is also removed.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      684bad11
  2. 03 9月, 2012 1 次提交
  3. 01 9月, 2012 3 次提交
    • J
      tcp: TCP Fast Open Server - support TFO listeners · 8336886f
      Jerry Chu 提交于
      This patch builds on top of the previous patch to add the support
      for TFO listeners. This includes -
      
      1. allocating, properly initializing, and managing the per listener
      fastopen_queue structure when TFO is enabled
      
      2. changes to the inet_csk_accept code to support TFO. E.g., the
      request_sock can no longer be freed upon accept(), not until 3WHS
      finishes
      
      3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg()
      if it's a TFO socket
      
      4. properly closing a TFO listener, and a TFO socket before 3WHS
      finishes
      
      5. supporting TCP_FASTOPEN socket option
      
      6. modifying tcp_check_req() to use to check a TFO socket as well
      as request_sock
      
      7. supporting TCP's TFO cookie option
      
      8. adding a new SYN-ACK retransmit handler to use the timer directly
      off the TFO socket rather than the listener socket. Note that TFO
      server side will not retransmit anything other than SYN-ACK until
      the 3WHS is completed.
      
      The patch also contains an important function
      "reqsk_fastopen_remove()" to manage the somewhat complex relation
      between a listener, its request_sock, and the corresponding child
      socket. See the comment above the function for the detail.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8336886f
    • J
      tcp: TCP Fast Open Server - header & support functions · 10467163
      Jerry Chu 提交于
      This patch adds all the necessary data structure and support
      functions to implement TFO server side. It also documents a number
      of flags for the sysctl_tcp_fastopen knob, and adds a few Linux
      extension MIBs.
      
      In addition, it includes the following:
      
      1. a new TCP_FASTOPEN socket option an application must call to
      supply a max backlog allowed in order to enable TFO on its listener.
      
      2. A number of key data structures:
      "fastopen_rsk" in tcp_sock - for a big socket to access its
      request_sock for retransmission and ack processing purpose. It is
      non-NULL iff 3WHS not completed.
      
      "fastopenq" in request_sock_queue - points to a per Fast Open
      listener data structure "fastopen_queue" to keep track of qlen (# of
      outstanding Fast Open requests) and max_qlen, among other things.
      
      "listener" in tcp_request_sock - to point to the original listener
      for book-keeping purpose, i.e., to maintain qlen against max_qlen
      as part of defense against IP spoofing attack.
      
      3. various data structure and functions, many in tcp_fastopen.c, to
      support server side Fast Open cookie operations, including
      /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10467163
    • A
      tcp: Increase timeout for SYN segments · 6c9ff979
      Alex Bergmann 提交于
      Commit 9ad7c049 ("tcp: RFC2988bis + taking RTT sample from 3WHS for
      the passive open side") changed the initRTO from 3secs to 1sec in
      accordance to RFC6298 (former RFC2988bis). This reduced the time till
      the last SYN retransmission packet gets sent from 93secs to 31secs.
      
      RFC1122 is stating that the retransmission should be done for at least 3
      minutes, but this seems to be quite high.
      
        "However, the values of R1 and R2 may be different for SYN
        and data segments.  In particular, R2 for a SYN segment MUST
        be set large enough to provide retransmission of the segment
        for at least 3 minutes.  The application can close the
        connection (i.e., give up on the open attempt) sooner, of
        course."
      
      This patch increases the value of TCP_SYN_RETRIES to the value of 6,
      providing a retransmission window of 63secs.
      
      The comments for SYN and SYNACK retries have also been updated to
      describe the current settings. The same goes for the documentation file
      "Documentation/networking/ip-sysctl.txt".
      Signed-off-by: NAlexander Bergmann <alex@linlab.net>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6c9ff979
  4. 31 8月, 2012 1 次提交
  5. 30 8月, 2012 5 次提交
  6. 27 8月, 2012 1 次提交
  7. 24 8月, 2012 1 次提交
    • R
      packet: fix broken build. · f63c45e0
      Rami Rosen 提交于
      This patch fixes a broken build due to a missing header:
      ...
        CC      net/ipv4/proc.o
      In file included from include/net/net_namespace.h:15,
                       from net/ipv4/proc.c:35:
      include/net/netns/packet.h:11: error: field 'sklist_lock' has incomplete type
      ...
      
      The lock of netns_packet has been replaced by a recent patch to be a mutex instead of a spinlock,
      but we need to replace the header file to be linux/mutex.h instead of linux/spinlock.h as well.
      
      See commit 0fa7fa98:
      packet: Protect packet sk list with mutex (v2) patch,
      Signed-off-by: NRami Rosen <rosenr@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f63c45e0
  8. 23 8月, 2012 1 次提交
    • P
      packet: Protect packet sk list with mutex (v2) · 0fa7fa98
      Pavel Emelyanov 提交于
      Change since v1:
      
      * Fixed inuse counters access spotted by Eric
      
      In patch eea68e2f (packet: Report socket mclist info via diag module) I've
      introduced a "scheduling in atomic" problem in packet diag module -- the
      socket list is traversed under rcu_read_lock() while performed under it sk
      mclist access requires rtnl lock (i.e. -- mutex) to be taken.
      
      [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002
      [152363.820573] 4 locks held by crtools/12517:
      [152363.820581]  #0:  (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e
      [152363.820613]  #1:  (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a
      [152363.820644]  #2:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab
      [152363.820693]  #3:  (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af
      
      Similar thing was then re-introduced by further packet diag patches (fanount
      mutex and pgvec mutex for rings) :(
      
      Apart from being terribly sorry for the above, I propose to change the packet
      sk list protection from spinlock to mutex. This lock currently protects two
      modifications:
      
      * sklist
      * prot inuse counters
      
      The sklist modifications can be just reprotected with mutex since they already
      occur in a sleeping context. The inuse counters modifications are trickier -- the
      __this_cpu_-s are used inside, thus requiring the caller to handle the potential
      issues with contexts himself. Since packet sockets' counters are modified in two
      places only (packet_create and packet_release) we only need to protect the context
      from being preempted. BH disabling is not required in this case.
      Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0fa7fa98
  9. 22 8月, 2012 1 次提交
    • E
      af_netlink: force credentials passing [CVE-2012-3520] · e0e3cea4
      Eric Dumazet 提交于
      Pablo Neira Ayuso discovered that avahi and
      potentially NetworkManager accept spoofed Netlink messages because of a
      kernel bug.  The kernel passes all-zero SCM_CREDENTIALS ancillary data
      to the receiver if the sender did not provide such data, instead of not
      including any such data at all or including the correct data from the
      peer (as it is the case with AF_UNIX).
      
      This bug was introduced in commit 16e57262
      (af_unix: dont send SCM_CREDENTIALS by default)
      
      This patch forces passing credentials for netlink, as
      before the regression.
      
      Another fix would be to not add SCM_CREDENTIALS in
      netlink messages if not provided by the sender, but it
      might break some programs.
      
      With help from Florian Weimer & Petr Matousek
      
      This issue is designated as CVE-2012-3520
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Petr Matousek <pmatouse@redhat.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0e3cea4
  10. 20 8月, 2012 6 次提交
    • J
      mac80211: add IEEE80211_HW_P2P_DEV_ADDR_FOR_INTF · 6d71117a
      Johannes Berg 提交于
      Some devices like the current iwlwifi implementation
      require that the P2P interface address match the P2P
      Device address (only one P2P interface is supported.)
      Add the HW flag IEEE80211_HW_P2P_DEV_ADDR_FOR_INTF
      that allows drivers to request that P2P Interfaces
      added while a P2P Device is active get the same MAC
      address by default.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      6d71117a
    • J
      cfg80211: add P2P Device abstraction · 98104fde
      Johannes Berg 提交于
      In order to support using a different MAC address
      for the P2P Device address we must first have a
      P2P Device abstraction that can be assigned a MAC
      address.
      
      This abstraction will also be useful to support
      offloading P2P operations to the device, e.g.
      periodic listen for discoverability.
      
      Currently, the driver is responsible for assigning
      a MAC address to the P2P Device, but this could be
      changed by allowing a MAC address to be given to
      the NEW_INTERFACE command.
      
      As it has no associated netdev, a P2P Device can
      only be identified by its wdev identifier but the
      previous patches allowed using the wdev identifier
      in various APIs, e.g. remain-on-channel.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      98104fde
    • J
      mac80211: support A-MPDU status reporting · 4c298677
      Johannes Berg 提交于
      Support getting A-MPDU status information from the
      drivers and reporting it to userspace via radiotap
      in the standard fields.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      4c298677
    • J
      wireless: add radiotap A-MPDU status field · 48613ece
      Johannes Berg 提交于
      Define the A-MPDU status field in radiotap, also
      update the radiotap parser for it and the MCS field
      that was apparently missed last time.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      48613ece
    • A
      mac80211: add supported rates change notification in IBSS · e687f61e
      Antonio Quartulli 提交于
      In IBSS it is possible that the supported rates set for a station changes over
      time (e.g. it gets first initialised as an empty set because of no available
      information about rates and updated later). In this case the driver has to be
      notified about the change in order to update its internal table accordingly (if
      needed).
      
      This behaviour is needed by all those drivers that handle rc internally but
      leave stations management to mac80211
      Reported-by: NGui Iribarren <gui@altermundi.net>
      Signed-off-by: NAntonio Quartulli <ordex@autistici.org>
      [Johannes - add docs, validate IBSS mode only, fix compilation]
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      e687f61e
    • P
      net: ipv6: fix oops in inet_putpeer() · 9d7b0fc1
      Patrick McHardy 提交于
      Commit 97bab73f (inet: Hide route peer accesses behind helpers.) introduced
      a bug in xfrm6_policy_destroy(). The xfrm_dst's _rt6i_peer member is not
      initialized, causing a false positive result from inetpeer_ptr_is_peer(),
      which in turn causes a NULL pointer dereference in inet_putpeer().
      
      Pid: 314, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #17 To Be Filled By O.E.M. To Be Filled By O.E.M./P4S800D-X
      EIP: 0060:[<c03abf93>] EFLAGS: 00010246 CPU: 0
      EIP is at inet_putpeer+0xe/0x16
      EAX: 00000000 EBX: f3481700 ECX: 00000000 EDX: 000dd641
      ESI: f3481700 EDI: c05e949c EBP: f551def4 ESP: f551def4
       DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
      CR0: 8005003b CR2: 00000070 CR3: 3243d000 CR4: 00000750
      DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
      DR6: ffff0ff0 DR7: 00000400
       f551df04 c0423de1 00000000 f3481700 f551df18 c038d5f7 f254b9f8 f551df28
       f34f85d8 f551df20 c03ef48d f551df3c c0396870 f30697e8 f24e1738 c05e98f4
       f5509540 c05cd2b4 f551df7c c0142d2b c043feb5 f5509540 00000000 c05cd2e8
       [<c0423de1>] xfrm6_dst_destroy+0x42/0xdb
       [<c038d5f7>] dst_destroy+0x1d/0xa4
       [<c03ef48d>] xfrm_bundle_flo_delete+0x2b/0x36
       [<c0396870>] flow_cache_gc_task+0x85/0x9f
       [<c0142d2b>] process_one_work+0x122/0x441
       [<c043feb5>] ? apic_timer_interrupt+0x31/0x38
       [<c03967eb>] ? flow_cache_new_hashrnd+0x2b/0x2b
       [<c0143e2d>] worker_thread+0x113/0x3cc
      
      Fix by adding a init_dst() callback to struct xfrm_policy_afinfo to
      properly initialize the dst's peer pointer.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9d7b0fc1
  11. 16 8月, 2012 1 次提交
  12. 15 8月, 2012 18 次提交