1. 22 Mar, 2013 25 commits
  2. 21 Mar, 2013 15 commits
    • Z
      net: remove redundant ifdef CONFIG_CGROUPS · 4021db9a
      Committed by Zefan Li
      The cgroup code has been surrounded by ifdef CONFIG_NET_CLS_CGROUP
      and CONFIG_NETPRIO_CGROUP.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4021db9a
    • Y
      tcp: implement RFC5682 F-RTO · e33099f9
      Committed by Yuchung Cheng
      This patch implements F-RTO (forward RTO recovery):
      
      When the first retransmission after timeout is acknowledged, F-RTO
      sends new data instead of old data. If the next ACK acknowledges
      some never-retransmitted data, then the timeout was spurious and the
      congestion state is reverted.  Otherwise if the next ACK selectively
      acknowledges the new data, then the timeout was genuine and the
      loss recovery continues. This idea applies to recurring timeouts
      as well. While F-RTO sends different data during timeout recovery,
      it does not (and should not) change the congestion control.
      
      The implementation follows the three steps of the SACK-enhanced algorithm
      (section 3) in RFC5682. Step 1 is in tcp_enter_loss(). Steps 2 and
      3 are in tcp_process_loss().  The basic version is not supported
      because the SACK-enhanced version also works for non-SACK connections.
      
      The new implementation is functionally in parity with the old F-RTO
      implementation except the one case where it increases undo events:
      In addition to the RFC algorithm, a spurious timeout may be detected
      without sending data in step 2, as long as the SACK confirms not
      all the original data are dropped. When this happens, the sender
      will undo the cwnd and perhaps enter fast recovery instead. This
      additional check increases the F-RTO undo events by 5x compared
      to the prior implementation on Google Web servers, since the sender
      often does not have new data to send for HTTP.
      
      Note F-RTO may detect spurious timeout before Eifel with timestamps
      does so.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e33099f9
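The decision flow described in the F-RTO commit message above can be sketched as a small state function. This is a simplified illustration and not the kernel code: the enum names, the function `frto_on_ack()`, and its boolean parameters are all hypothetical, and any ACK that does not match the cases named in the commit message is treated here as a genuine timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; none of them are kernel identifiers. */
enum frto_stage {
    FRTO_AWAIT_RETRANS_ACK, /* RTO fired and the first segment was retransmitted */
    FRTO_AWAIT_PROBE_ACK    /* new data was sent; waiting for the next ACK */
};

enum frto_action {
    FRTO_SEND_NEW_DATA,   /* send never-sent data instead of old data */
    FRTO_UNDO,            /* timeout was spurious: revert congestion state */
    FRTO_KEEP_RECOVERING  /* timeout treated as genuine: continue loss recovery */
};

/*
 * Decide what to do for one incoming ACK.
 * acks_retransmission: the ACK acknowledges the first retransmission.
 * acks_original_unretransmitted_data: the ACK acknowledges original data
 * that was never retransmitted.
 */
enum frto_action frto_on_ack(enum frto_stage stage,
                             bool acks_retransmission,
                             bool acks_original_unretransmitted_data)
{
    assert(stage == FRTO_AWAIT_RETRANS_ACK || stage == FRTO_AWAIT_PROBE_ACK);

    if (stage == FRTO_AWAIT_RETRANS_ACK)
        return acks_retransmission ? FRTO_SEND_NEW_DATA
                                   : FRTO_KEEP_RECOVERING;

    /* Simplification: everything that is not proof of a spurious
     * timeout is handled as a genuine loss. */
    return acks_original_unretransmitted_data ? FRTO_UNDO
                                              : FRTO_KEEP_RECOVERING;
}
```

Note that the sketch only decides what to send and whether to undo; consistent with the commit message, it leaves the amount of data to send to congestion control.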
    • Y
      tcp: refactor CA_Loss state processing · ab42d9ee
      Committed by Yuchung Cheng
      Consolidate all of TCP CA_Loss state processing in
      tcp_fastretrans_alert() into a new function called tcp_process_loss().
      This is to prepare the new F-RTO implementation in the next patch.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab42d9ee
    • Y
      tcp: refactor F-RTO · 9b44190d
      Committed by Yuchung Cheng
      This patch series refactors the F-RTO feature (RFC4138/5682).
      
      This is to simplify the loss recovery processing. Existing F-RTO
      was developed during the experimental stage (RFC4138) and has
      many experimental features.  It takes a separate code path from
      the traditional timeout processing by overloading CA_Disorder
      instead of using CA_Loss state. This complicates CA_Disorder state
      handling because it's also used for handling dubious ACKs and undos.
      While the algorithm in the RFC does not change the congestion control,
      the implementation intercepts congestion control in various places
      (e.g., frto_cwnd in tcp_ack()).
      
      The new code implements newer F-RTO RFC5682 using CA_Loss processing
      path.  F-RTO becomes a small extension in the timeout processing
      and interfaces with congestion control and Eifel undo modules.
      It lets the congestion control module determine how much to send
      independently.  F-RTO only chooses what to send in order to detect
      spurious retransmission. If the timeout is found to be spurious, it invokes
      existing Eifel undo algorithms such as DSACK or TCP timestamp based
      detection.
      
      The first patch removes all F-RTO code except sysctl_tcp_frto, which is
      left for the new implementation.  Since CA_EVENT_FRTO is removed, TCP
      Westwood now computes ssthresh on the regular timeout CA_EVENT_LOSS event.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b44190d
    • D
      filter: add minimal BPF JIT image disassembler · e306e2c1
      Committed by Daniel Borkmann
      This is a minimal stand-alone user space helper, that allows for debugging or
      verification of emitted BPF JIT images. This is in particular useful for
      emitted opcode debugging, since minor bugs in the JIT compiler can be fatal.
      The disassembler is architecture generic and uses libopcodes and libbfd.
      
      How to get to the disassembly, example:
      
        1) `echo 2 > /proc/sys/net/core/bpf_jit_enable`
        2) Load a BPF filter (e.g. `tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24`)
        3) Run e.g. `bpf_jit_disasm -o` to disassemble the most recent JIT code output
      
      `bpf_jit_disasm -o` will display the related opcodes to a particular instruction
      as well. Example for x86_64:
      
      $ ./bpf_jit_disasm
      94 bytes emitted from JIT compiler (pass:3, flen:9)
      ffffffffa0356000 + <x>:
         0:	push   %rbp
         1:	mov    %rsp,%rbp
         4:	sub    $0x60,%rsp
         8:	mov    %rbx,-0x8(%rbp)
         c:	mov    0x68(%rdi),%r9d
        10:	sub    0x6c(%rdi),%r9d
        14:	mov    0xe0(%rdi),%r8
        1b:	mov    $0xc,%esi
        20:	callq  0xffffffffe0d01b71
        25:	cmp    $0x86dd,%eax
        2a:	jne    0x000000000000003d
        2c:	mov    $0x14,%esi
        31:	callq  0xffffffffe0d01b8d
        36:	cmp    $0x6,%eax
      [...]
        5c:	leaveq
        5d:	retq
      
      $ ./bpf_jit_disasm -o
      94 bytes emitted from JIT compiler (pass:3, flen:9)
      ffffffffa0356000 + <x>:
         0:	push   %rbp
      	55
         1:	mov    %rsp,%rbp
      	48 89 e5
         4:	sub    $0x60,%rsp
      	48 83 ec 60
         8:	mov    %rbx,-0x8(%rbp)
      	48 89 5d f8
         c:	mov    0x68(%rdi),%r9d
      	44 8b 4f 68
        10:	sub    0x6c(%rdi),%r9d
      	44 2b 4f 6c
      [...]
        5c:	leaveq
      	c9
        5d:	retq
      	c3
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e306e2c1
    • D
      Merge branch 'for-davem' of... · b34870fc
      Committed by David S. Miller
      Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into wireless
      
      John W. Linville says:
      
      ====================
      This is a big pull request for new features intended for the 3.10
      stream...
      
      Regarding mac80211, Johannes says:
      
      "First, I merged mac80211/master to avoid some conflicts. This brings in
      a bunch of fixes you're already familiar with. For real -next material,
      I have a whole bunch of minstrel work, minstrel_ht from Felix and legacy
      minstrel from Thomas (Huehn). The other Thomas (Pedersen) did a number
      of changes in mesh to allow userspace peering management even when the
      mesh isn't secured. Stanislaw changes suspend/resume to always
      disconnect the networks. This is typically already done by
      network-manager so won't make a huge difference for most users, but
      fixes a number of problems, particularly with USB drivers that can easily
      disconnect while suspended. Ilan has a small change to allow mac80211
      drivers to differentiate remain-on-channel reasons, and Jouni extends
      nl80211 to allow fast roaming with full-MAC devices. I have a fairly
      large number of patches as well, many of them fairly simple cleanups,
      but also allowing split wiphy dumps and adding back the full wiphy
      information in nl80211, station entry change checking and more VHT work
      including VHT capability overrides (mostly for testing purposes)."
      
      And for iwlwifi, Johannes says:
      
      "Here, I also merged iwlwifi-fixes to avoid conflicts, and otherwise have
      various cleanups and improvements on the MVM driver, along with a few
      throughout the driver. Other than Bluetooth Coexistence from Emmanuel
      there's no over-arching theme, so listing them would pretty much
      reproduce the shortlog."
      
      Regarding NFC, Samuel says:
      
      "The 2 features we have with this one are:
      
      - An LLCP Service Name Lookup (SNL) netlink interface for querying LLCP
        service availability from user space.
        Along the way, Thierry also improved the existing SNL interface for
        aggregating SNL responses.
      
      - An initial LLCP socket options implementation, for setting the Receive
        Window (RW) and the Maximum Information Unit Extension (MIUX) per socket.
        This is needed for the LLCP validation tests.
      
      We also have a microread MEI build failure here: I am not sending this one to
      3.9 because the MEI bus code is not there yet, so it won't break for anyone
      else than me."
      
      And for ath6kl, Kalle says:
      
      "I added tracing support to ath6kl, along with a new Kconfig option. Now
      there's also a workaround to reset USB devices when the firmware upload
      fails; this happened when the host was warm rebooted. There are also quite a
      few small fixes and cleanups."
      
      On top of all that, there is the usual bundle of driver updates
      with new features, new hardware support and the like mixed-in.
      The ath9k, b43, brcmfmac, mwifiex, rt2800, and wil6210 drivers
      are all well-represented, and a few other drivers are hit as well.
      I also pulled-in the wireless fixes tree in order to resolve some
      pending merge conflicts.
      
      Please let me know if there are problems!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b34870fc
    • J
      Merge branch 'master' of... · 5470b462
      Committed by John W. Linville
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
      5470b462
    • S
      chelsio: use netdev_alloc_skb_ip_align · e76d120b
      Committed by Stephen Hemminger
      Use netdev_alloc_skb_ip_align in the case where the packet is copied.
      This handles the case where NET_IP_ALIGN == 0 as well as adding the
      required header padding.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e76d120b
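The role of NET_IP_ALIGN in the commit above comes down to simple offset arithmetic, sketched here in user space. The helper names and the `SKETCH_ETH_HLEN` macro are hypothetical; the sketch assumes a 14-byte Ethernet header and a receive buffer that starts on a 4-byte boundary. It only shows why a 2-byte pad aligns the IP header. On architectures that define NET_IP_ALIGN as 0, no pad is reserved because unaligned access is acceptable there.

```c
#include <assert.h>
#include <stddef.h>

/* Length of an Ethernet header in bytes (assumed, no VLAN tag). */
#define SKETCH_ETH_HLEN 14

/* Offset of the IP header from a 4-byte aligned buffer start,
 * given the number of pad bytes reserved in front of the frame. */
size_t sketch_ip_header_offset(size_t ip_align)
{
    assert(ip_align < 4);
    return ip_align + SKETCH_ETH_HLEN;
}

/* Nonzero when the IP header lands on a 4-byte boundary. */
int sketch_ip_header_aligned(size_t ip_align)
{
    return sketch_ip_header_offset(ip_align) % 4 == 0;
}
```

With a pad of 2 the IP header starts at offset 16, which is 4-byte aligned; with a pad of 0 it starts at offset 14, which is not.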
    • D
      4c1d8d06
    • E
      chelsio: add headroom in RX path · 70386d40
      Committed by Eric Dumazet
      Drivers should reserve some headroom in skb used in receive path,
      to avoid future head reallocation.
      
      One possible way to do that is to use dev_alloc_skb() instead
      of alloc_skb(), so that NET_SKB_PAD bytes are reserved.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      70386d40
    • C
      dynticks: avoid flow_cache_flush() interrupting every core · 8fdc929f
      Committed by Chris Metcalf
      Previously, if you did an "ifconfig down" or similar on one core, and
      the kernel had CONFIG_XFRM enabled, every core would be interrupted to
      check its percpu flow list for items that could be garbage collected.
      
      With this change, we generate a mask of cores that actually have any
      percpu items, and only interrupt those cores.  When we are trying to
      isolate a set of cpus from interrupts, this is important to do.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8fdc929f
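The idea in the commit above, interrupting only cores that actually hold per-CPU items, can be illustrated with a plain bitmask in user space. This is a sketch under stated assumptions and not the kernel change itself: `cpus_needing_flush()`, `SKETCH_MAX_CPUS`, and the array of per-CPU counts are hypothetical stand-ins for the kernel's cpumask and per-CPU flow cache.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit so the mask fits in one 64-bit word. */
#define SKETCH_MAX_CPUS 64

/* Return a bitmask with bit i set when CPU i holds at least one
 * cached item, so that only those CPUs need to be interrupted. */
uint64_t cpus_needing_flush(const unsigned int *percpu_count,
                            unsigned int ncpus)
{
    uint64_t mask = 0;

    assert(ncpus <= SKETCH_MAX_CPUS);
    for (unsigned int cpu = 0; cpu < ncpus; cpu++) {
        if (percpu_count[cpu] != 0)
            mask |= UINT64_C(1) << cpu;
    }
    return mask;
}
```

For example, with counts `{0, 3, 0, 1}` only CPUs 1 and 3 are selected, and a system with no cached items yields an empty mask, so no core is interrupted at all.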
    • Y
      bnx2x: AER revised · 7fa6f340
      Committed by Yuval Mintz
      Revised bnx2x implementation of PCI Express Advanced Error Recovery -
      stop and free driver resources according to the AER flow (instead of the
      currently implemented `hope-for-the-best' release approach), and do not make
      any assumptions on the HW state after slot reset.
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7fa6f340
    • W
      net: fec: make local function fec_poll_controller() static · 47a5247f
      Committed by Wei Yongjun
      fec_poll_controller() was not declared. It should be static.
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      47a5247f