Commits · 98e821a2a927b6dc0f7adc4b64ad29bec1b6ff89 · openanolis / cloud-kernel

22 Mar, 2013 6 commits

net: fix psock_fanout on sparc64 · 98e821a2

Authored by Willem de Bruijn on Mar 21, 2013

The packet socket fanout test uses a packet ring. Use TPACKET_V2
instead of TPACKET_V1 to work around a known 32/64 bit issue in
the older ring that manifests on sparc64.
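
For context, one 32/64 bit difference in the V1 ring is that `struct tpacket_hdr` begins with an `unsigned long tp_status`, whose width depends on the ABI, whereas `struct tpacket2_hdr` uses fixed-width fields. The sketch below models the V1 header with Python's `struct` module. Whether this is the exact issue hit on sparc64 is an assumption; the commit message does not say.

```python
import struct

# Fields of struct tpacket_hdr (TPACKET_V1) that follow tp_status:
# tp_len, tp_snaplen (unsigned int), tp_mac, tp_net (unsigned short),
# tp_sec, tp_usec (unsigned int).
V1_TAIL = "IIHHII"

# "=" selects standard sizes with no padding, so the widths are explicit.
# tp_status is an unsigned long: modelled as 4 bytes for a 32-bit ABI
# and 8 bytes for a 64-bit ABI.
v1_size_32 = struct.calcsize("=I" + V1_TAIL)
v1_size_64 = struct.calcsize("=Q" + V1_TAIL)

# Offset of tp_len, the first field after tp_status, under each ABI.
tp_len_offset_32 = struct.calcsize("=I")
tp_len_offset_64 = struct.calcsize("=Q")

print(v1_size_32, v1_size_64)              # 24 28
print(tp_len_offset_32, tp_len_offset_64)  # 4 8
```

A 32-bit process and a 64-bit kernel that disagree on these offsets would read each other's ring frames incorrectly, which is the kind of mismatch a fixed-width header avoids.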

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: Diag core and basic socket info dumping (v2) · eaaa3139

Authored by Andrey Vagin on Mar 21, 2013

netlink_diag can be built as a module, just as is done for
unix sockets.

The core dumping message carries the basic info about netlink sockets:
family, type and protocol, portid, dst_group, dst_portid, state.

Groups can be received as an optional parameter NETLINK_DIAG_GROUPS.

Netlink sockets can be filtered by protocol.

The socket inode number and cookie are reserved for future per-socket info
retrieval. Per-protocol filtering is also reserved for future use by
requiring sdiag_protocol to be zero.

The file /proc/net/netlink doesn't provide enough information for
dumping netlink sockets. It doesn't provide dst_group, dst_portid,
or groups above 32.

v2: fix NETLINK_DIAG_MAX. Now it's equal to the last constant.

Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: prepare netlink code for netlink diag · 0f29c768

Authored by Andrey Vagin on Mar 21, 2013

Move a few declarations into a header.

Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: Remove superfluous kernel_dropped local counter · 953d2768

Authored by Claudiu Manoil on Mar 21, 2013

The GRO_DROP return code is handled by the core network layer.
The current kernel approach is to consolidate this kind of statistic in
the upper layers instead of having every driver maintain it.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: Cleanup dead code and minor formatting · c6e1160e

Authored by Claudiu Manoil on Mar 21, 2013

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: Remove 'maybe-uninitialized' compile warning · 39c0a0d5

Authored by Claudiu Manoil on Mar 21, 2013

Warning message:
warning: 'budget_per_q' may be used uninitialized in this function

budget_per_q won't be used uninitialized since the only time
it doesn't get initialized is when entering gfar_poll with
num_act_queues == 0, meaning rstat_rxf == 0, in which case
budget_per_q is not used (as it has no meaning).
Initialize budget_per_q to 0 anyway to suppress this compile
warning.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Mar, 2013 34 commits

net: sh-eth: Use pr_err instead of printk · 14c3326a

Authored by Nobuhiro Iwamatsu on Mar 20, 2013

Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: remove redundant ifdef CONFIG_CGROUPS · 4021db9a

Authored by Zefan Li on Mar 20, 2013

The cgroup code is already guarded by ifdef CONFIG_NET_CLS_CGROUP
and CONFIG_NETPRIO_CGROUP.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: implement RFC5682 F-RTO · e33099f9

Authored by Yuchung Cheng on Mar 20, 2013

This patch implements F-RTO (forward RTO recovery):

When the first retransmission after timeout is acknowledged, F-RTO
sends new data instead of old data. If the next ACK acknowledges
some never-retransmitted data, then the timeout was spurious and the
congestion state is reverted.  Otherwise if the next ACK selectively
acknowledges the new data, then the timeout was genuine and the
loss recovery continues. This idea applies to recurring timeouts
as well. While F-RTO sends different data during timeout recovery,
it does not (and should not) change the congestion control.
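
The decision described above can be sketched as a small classifier for the ACK that follows the new data. This is a simplified model for illustration only, not the kernel code: the names `Verdict` and `frto_verdict` and the two boolean inputs are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    SPURIOUS = "timeout was spurious: revert congestion state"
    GENUINE = "timeout was genuine: continue loss recovery"
    UNDECIDED = "no decision from this ACK"


def frto_verdict(acks_never_retransmitted: bool, sacks_new_data: bool) -> Verdict:
    """Classify the next ACK after F-RTO has sent new data.

    acks_never_retransmitted: the ACK covers data that was never retransmitted.
    sacks_new_data: the ACK selectively acknowledges the newly sent data.
    """
    if acks_never_retransmitted:
        # Original transmissions arrived, so the timeout fired too early.
        return Verdict.SPURIOUS
    if sacks_new_data:
        # Only the new data got through; the original data was really lost.
        return Verdict.GENUINE
    return Verdict.UNDECIDED


print(frto_verdict(True, False).name)   # SPURIOUS
print(frto_verdict(False, True).name)   # GENUINE
```

The function only decides what the ACK implies; how much to send is left to congestion control, matching the separation the commit message describes.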

The implementation follows the three steps of the SACK-enhanced algorithm
(section 3) in RFC5682. Step 1 is in tcp_enter_loss(). Steps 2 and
3 are in tcp_process_loss().  The basic version is not supported
because the SACK-enhanced version also works for non-SACK connections.

The new implementation is functionally in parity with the old F-RTO
implementation except the one case where it increases undo events:
In addition to the RFC algorithm, a spurious timeout may be detected
without sending data in step 2, as long as the SACK confirms not
all the original data are dropped. When this happens, the sender
will undo the cwnd and perhaps enter fast recovery instead. This
additional check increases the F-RTO undo events by 5x compared
to the prior implementation on Google Web servers, since the sender
often does not have new data to send for HTTP.

Note F-RTO may detect spurious timeout before Eifel with timestamps
does so.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: refactor CA_Loss state processing · ab42d9ee

Authored by Yuchung Cheng on Mar 20, 2013

Consolidate all of TCP CA_Loss state processing in
tcp_fastretrans_alert() into a new function called tcp_process_loss().
This is to prepare the new F-RTO implementation in the next patch.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: refactor F-RTO · 9b44190d

Authored by Yuchung Cheng on Mar 20, 2013

This patch series refactors the F-RTO feature (RFC4138/5682).

This is to simplify the loss recovery processing. Existing F-RTO
was developed during the experimental stage (RFC4138) and has
many experimental features.  It takes a separate code path from
the traditional timeout processing by overloading CA_Disorder
instead of using CA_Loss state. This complicates CA_Disorder state
handling because it's also used for handling dubious ACKs and undos.
While the algorithm in the RFC does not change the congestion control,
the implementation intercepts congestion control in various places
(e.g., frto_cwnd in tcp_ack()).

The new code implements the newer F-RTO RFC5682 using the CA_Loss
processing path.  F-RTO becomes a small extension in the timeout
processing and interfaces with the congestion control and Eifel undo
modules. It lets the congestion control module determine independently
how much to send.  F-RTO only chooses what to send in order to detect
spurious retransmission. If the timeout is found to be spurious, it
invokes existing Eifel undo algorithms such as DSACK or TCP timestamp
based detection.

The first patch removes all F-RTO code except sysctl_tcp_frto, which is
left for the new implementation.  Since CA_EVENT_FRTO is removed, TCP
westwood now computes ssthresh on the regular timeout CA_EVENT_LOSS event.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

filter: add minimal BPF JIT image disassembler · e306e2c1

Authored by Daniel Borkmann on Mar 20, 2013

This is a minimal stand-alone user space helper, that allows for debugging or
verification of emitted BPF JIT images. This is in particular useful for
emitted opcode debugging, since minor bugs in the JIT compiler can be fatal.
The disassembler is architecture generic and uses libopcodes and libbfd.

How to get to the disassembly, example:

  1) `echo 2 > /proc/sys/net/core/bpf_jit_enable`
  2) Load a BPF filter (e.g. `tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24`)
  3) Run e.g. `bpf_jit_disasm -o` to disassemble the most recent JIT code output

`bpf_jit_disasm -o` will display the related opcodes to a particular instruction
as well. Example for x86_64:

$ ./bpf_jit_disasm
94 bytes emitted from JIT compiler (pass:3, flen:9)
ffffffffa0356000 + <x>:
   0:	push   %rbp
   1:	mov    %rsp,%rbp
   4:	sub    $0x60,%rsp
   8:	mov    %rbx,-0x8(%rbp)
   c:	mov    0x68(%rdi),%r9d
  10:	sub    0x6c(%rdi),%r9d
  14:	mov    0xe0(%rdi),%r8
  1b:	mov    $0xc,%esi
  20:	callq  0xffffffffe0d01b71
  25:	cmp    $0x86dd,%eax
  2a:	jne    0x000000000000003d
  2c:	mov    $0x14,%esi
  31:	callq  0xffffffffe0d01b8d
  36:	cmp    $0x6,%eax
[...]
  5c:	leaveq
  5d:	retq

$ ./bpf_jit_disasm -o
94 bytes emitted from JIT compiler (pass:3, flen:9)
ffffffffa0356000 + <x>:
   0:	push   %rbp
	55
   1:	mov    %rsp,%rbp
	48 89 e5
   4:	sub    $0x60,%rsp
	48 83 ec 60
   8:	mov    %rbx,-0x8(%rbp)
	48 89 5d f8
   c:	mov    0x68(%rdi),%r9d
	44 8b 4f 68
  10:	sub    0x6c(%rdi),%r9d
	44 2b 4f 6c
[...]
  5c:	leaveq
	c9
  5d:	retq
	c3

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of... · b34870fc

Authored by David S. Miller on Mar 21, 2013

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into wireless

John W. Linville says:

====================
This is a big pull request for new features intended for the 3.10
stream...

Regarding mac80211, Johannes says:

"First, I merged mac80211/master to avoid some conflicts. This brings in
a bunch of fixes you're already familiar with. For real -next material,
I have a whole bunch of minstrel work, minstrel_ht from Felix and legacy
minstrel from Thomas (Huehn). The other Thomas (Pedersen) did a number
of changes in mesh to allow userspace peering management even when the
mesh isn't secured. Stanislaw changes suspend/resume to always
disconnect the networks. This is typically already done by
network-manager so won't make a huge difference for most users, but
fixes a number of problems, particularly with USB drivers that can easily
disconnect while suspended. Ilan has a small change to allow mac80211
drivers to differentiate remain-on-channel reasons, and Jouni extends
nl80211 to allow fast roaming with full-MAC devices. I have a fairly
large number of patches as well, many of them fairly simple cleanups,
but also allowing split wiphy dumps and adding back the full wiphy
information in nl80211, station entry change checking and more VHT work
including VHT capability overrides (mostly for testing purposes)."

And for iwlwifi, Johannes says:

"Here, I also merged iwlwifi-fixes to avoid conflicts, and otherwise have
various cleanups and improvements on the MVM driver, along with a few
throughout the driver. Other than Bluetooth Coexistence from Emmanuel
there's no over-arching theme, so listing them would pretty much
reproduce the shortlog."

Regarding NFC, Samuel says:

"The 2 features we have with this one are:

- An LLCP Service Name Lookup (SNL) netlink interface for querying LLCP
  service availability from user space.
  Along the way, Thierry also improved the existing SNL interface for
  aggregating SNL responses.

- An initial LLCP socket options implementation, for setting the Receive
  Window (RW) and the Maximum Information Unit Extension (MIUX) per socket.
  This is needed for the LLCP validation tests.

We also have a microread MEI build failure here: I am not sending this one to
3.9 because the MEI bus code is not there yet, so it won't break for anyone
else than me."

And for ath6kl, Kalle says:

"I added tracing support to ath6kl, along with a new Kconfig option. Now
there's also a workaround to reset USB devices when the firmware upload
fails, which happened when the host was warm rebooted. There are also
quite a few small fixes and cleanups."

On top of all that, there is the usual bundle of driver updates
with new features, new hardware support and the like mixed-in.
The ath9k, b43, brcmfmac, mwifiex, rt2800, and wil6210 drivers
are all well-represented, and a few other drivers are hit as well.
I also pulled-in the wireless fixes tree in order to resolve some
pending merge conflicts.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of... · 5470b462

Authored by John W. Linville on Mar 20, 2013

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem

chelsio: use netdev_alloc_skb_ip_align · e76d120b

Authored by stephen hemminger on Mar 20, 2013

Use netdev_alloc_skb_ip_align in the case where the packet is copied.
This handles the case where NET_IP_ALIGN == 0 as well as adding the
required header padding.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Move selftests to common net/ subdirectory. · a6f68034

Authored by David S. Miller on Mar 20, 2013

Suggested-by: Daniel Baluta <daniel.baluta@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix psock_fanout selftest bind error message · 4c1d8d06

Authored by Daniel Baluta on Mar 20, 2013

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

chelsio: add headroom in RX path · 70386d40

Authored by Eric Dumazet on Mar 20, 2013

Drivers should reserve some headroom in skbs used in the receive path,
to avoid future head reallocation.

One possible way to do that is to use dev_alloc_skb() instead
of alloc_skb(), so that NET_SKB_PAD bytes are reserved.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dynticks: avoid flow_cache_flush() interrupting every core · 8fdc929f

Authored by Chris Metcalf on Mar 19, 2013

Previously, if you did an "ifconfig down" or similar on one core, and
the kernel had CONFIG_XFRM enabled, every core would be interrupted to
check its percpu flow list for items that could be garbage collected.

With this change, we generate a mask of cores that actually have any
percpu items, and only interrupt those cores. When we are trying to
isolate a set of cpus from interrupts, this is important to do.
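
The idea can be sketched as follows, assuming each core's percpu flow list is summarized by an item count. This is an illustrative model rather than the kernel implementation; `cpus_to_interrupt` and its input list are hypothetical names.

```python
def cpus_to_interrupt(percpu_counts):
    """Return a bitmask with bit n set when CPU n has percpu flow items."""
    mask = 0
    for cpu, count in enumerate(percpu_counts):
        if count > 0:
            # Only cores that actually hold items need to be interrupted.
            mask |= 1 << cpu
    return mask


# Four cores; only cores 1 and 3 hold items, so only their bits are set.
print(bin(cpus_to_interrupt([0, 3, 0, 1])))  # 0b1010
```

With an all-zero input the mask is 0, so no core is interrupted at all, which is the case that matters for cores isolated from interrupts.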

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: AER revised · 7fa6f340

Authored by Yuval Mintz on Mar 20, 2013

Revise the bnx2x implementation of PCI Express Advanced Error Recovery:
stop and free driver resources according to the AER flow (instead of the
currently implemented "hope-for-the-best" release approach), and do not make
any assumptions about the HW state after slot reset.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: make local function fec_poll_controller() static · 47a5247f

Authored by Wei Yongjun on Mar 20, 2013

fec_poll_controller() was not declared. It should be static.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: davinci_emac: make local function emac_poll_controller() static · e052a589

Authored by Wei Yongjun on Mar 20, 2013

emac_poll_controller() was not declared. It should be static.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdio-octeon: Use module_platform_driver() · 9fad0c94