Commits · 87f20c26f9c0bedc39ff7d6682b2f3772da6e25b · openanolis / cloud-kernel

30 Oct, 2013 5 commits

net/benet: Make lancer_wait_ready() static · 87f20c26

Committed by Gavin Shan on Oct 29, 2013

The function needn't to be public, so to make it as static.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

87f20c26

net/benet: Remove interface type · 547e2dae

Committed by Gavin Shan on Oct 29, 2013

The interface type, which is being traced by "struct be_adapter::
if_type", isn't used currently. So we can remove that safely
according to Sathya's comments.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

547e2dae

netconsole: Convert to pr_<level> · 22ded577

Committed by Joe Perches on Oct 28, 2013

Use a more current logging style.

Convert printks to pr_<level>.

Consolidate multiple printks into a single printk to avoid
any possible dmesg interleaving.  Add a default "event" msg
in case the listed types are ever expanded.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22ded577

net: sched: cls_bpf: add BPF-based classifier · 7d1d65cb

Committed by Daniel Borkmann on Oct 28, 2013

This work contains a lightweight BPF-based traffic classifier that can
serve as a flexible alternative to ematch-based tree classification, i.e.
now that BPF filter engine can also be JITed in the kernel. Naturally, tc
actions and policies are supported as well with cls_bpf. Multiple BPF
programs/filter can be attached for a class, or they can just as well be
written within a single BPF program, that's really up to the user how he
wishes to run/optimize the code, e.g. also for inversion of verdicts etc.
The notion of a BPF program's return/exit codes is being kept as follows:

     0: No match
    -1: Select classid given in "tc filter ..." command
  else: flowid, overwrite the default one

As a minimal usage example with iproute2, we use a 3 band prio root qdisc
on a router with sfq each as leave, and assign ssh and icmp bpf-based
filters to band 1, http traffic to band 2 and the rest to band 3. For the
first two bands we load the bytecode from a file, in the 2nd we load it
inline as an example:

echo 1 > /proc/sys/net/core/bpf_jit_enable

tc qdisc del dev em1 root
tc qdisc add dev em1 root handle 1: prio bands 3 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

tc qdisc add dev em1 parent 1:1 sfq perturb 16
tc qdisc add dev em1 parent 1:2 sfq perturb 16
tc qdisc add dev em1 parent 1:3 sfq perturb 16

tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/ssh.bpf flowid 1:1
tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/icmp.bpf flowid 1:1
tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/http.bpf flowid 1:2
tc filter add dev em1 parent 1: bpf run bytecode "`bpfc -f tc -i misc.ops`" flowid 1:3

BPF programs can be easily created and passed to tc, either as inline
'bytecode' or 'bytecode-file'. There are a couple of front-ends that can
compile opcodes, for example:

1) People familiar with tcpdump-like filters:

   tcpdump -iem1 -ddd port 22 | tr '\n' ',' > /etc/tc/ssh.bpf

2) People that want to low-level program their filters or use BPF
   extensions that lack support by libpcap's compiler:

   bpfc -f tc -i ssh.ops > /etc/tc/ssh.bpf

   ssh.ops example code:
   ldh [12]
   jne #0x800, drop
   ldb [23]
   jneq #6, drop
   ldh [20]
   jset #0x1fff, drop
   ldxb 4 * ([14] & 0xf)
   ldh [%x + 14]
   jeq #0x16, pass
   ldh [%x + 16]
   jne #0x16, drop
   pass: ret #-1
   drop: ret #0

It was chosen to load bytecode into tc, since the reverse operation,
tc filter list dev em1, is then able to show the exact commands again.
Possible follow-up work could also include a small expression compiler
for iproute2. Tested with the help of bmon. This idea came up during
the Netfilter Workshop 2013 in Copenhagen. Also thanks to feedback from
Eric Dumazet!
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

7d1d65cb
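
The return-code convention and the ssh.ops listing in the cls_bpf commit above can be illustrated outside the kernel. What follows is a minimal Python sketch, not the kernel implementation: `ssh_filter`, `resolve_flowid` and `build_tcp_frame` are hypothetical names, the frame is assumed to be untagged Ethernet carrying IPv4, and flowids are represented as plain integers (for example 0x00010001 standing for 1:1).

```python
NO_MATCH = 0        # verdict: packet does not match
USE_DEFAULT = -1    # verdict: use the classid given in "tc filter ..."


def _load(frame: bytes, offset: int, size: int) -> int:
    """Big-endian load; raises IndexError when the load is out of bounds."""
    if offset < 0 or offset + size > len(frame):
        raise IndexError(offset)
    return int.from_bytes(frame[offset:offset + size], "big")


def ssh_filter(frame: bytes) -> int:
    """Mirror the ssh.ops listing instruction by instruction."""
    try:
        if _load(frame, 12, 2) != 0x800:      # ldh [12]; jne #0x800, drop
            return NO_MATCH
        if _load(frame, 23, 1) != 6:          # ldb [23]; jneq #6, drop
            return NO_MATCH
        if _load(frame, 20, 2) & 0x1FFF:      # ldh [20]; jset #0x1fff, drop
            return NO_MATCH
        x = 4 * (_load(frame, 14, 1) & 0xF)   # ldxb 4 * ([14] & 0xf)
        if _load(frame, x + 14, 2) == 0x16:   # ldh [%x + 14]; jeq #0x16, pass
            return USE_DEFAULT
        if _load(frame, x + 16, 2) != 0x16:   # ldh [%x + 16]; jne #0x16, drop
            return NO_MATCH
        return USE_DEFAULT                    # pass: ret #-1
    except IndexError:
        return NO_MATCH                       # out-of-bounds load: treated as no match


def resolve_flowid(verdict: int, default_flowid: int):
    """Apply the 0 / -1 / else convention from the commit message."""
    if verdict == NO_MATCH:
        return None                # no match: this filter selects nothing
    if verdict == USE_DEFAULT:
        return default_flowid      # -1: classid from the tc command line
    return verdict                 # anything else overrides the default flowid


def build_tcp_frame(src_port: int, dst_port: int, frag_off: int = 0) -> bytes:
    """Hypothetical helper: smallest frame the filter can inspect."""
    eth = bytes(12) + (0x800).to_bytes(2, "big")   # MACs zeroed, EtherType IPv4
    ip = bytearray(20)
    ip[0] = 0x45                                   # version 4, header length 5 words
    ip[6:8] = frag_off.to_bytes(2, "big")          # flags and fragment offset
    ip[9] = 6                                      # protocol: TCP
    tcp = src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big")
    return eth + bytes(ip) + tcp
```

With these helpers, `resolve_flowid(ssh_filter(build_tcp_frame(40000, 22)), 0x00010001)` selects the default flowid, while a frame addressed to port 80 yields no match.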

bgmac: separate RX descriptor setup code into a new function · d549c76b

Committed by Rafał Miłecki on Oct 28, 2013

This cleans code a bit and will be useful when allocating buffers in
other places (like RX path, to avoid skb_copy_from_linear_data_offset).
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d549c76b

29 Oct, 2013 13 commits

net, mc: fix the incorrect comments in two mc-related functions · cdfb97bc

Committed by Zhi Yong Wu on Oct 28, 2013

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdfb97bc

net, iovec: fix the incorrect comment in memcpy_fromiovecend() · ab1a2d77

Committed by Zhi Yong Wu on Oct 28, 2013

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab1a2d77

net, datagram: fix the incorrect comment in zerocopy_sg_from_iovec() · c4e819d1

Committed by Zhi Yong Wu on Oct 28, 2013

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c4e819d1

vxlan: silence one build warning · 39deb2c7

Committed by Zhi Yong Wu on Oct 28, 2013

drivers/net/vxlan.c: In function ‘vxlan_sock_add’:
drivers/net/vxlan.c:2298:11: warning: ‘sock’ may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/net/vxlan.c:2275:17: note: ‘sock’ was declared here
LD drivers/net/built-in.o
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

39deb2c7

ipv4: fix DO and PROBE pmtu mode regarding local fragmentation with UFO/CORK · daba287b

Committed by Hannes Frederic Sowa on Oct 27, 2013

UFO as well as UDP_CORK do not respect IP_PMTUDISC_DO and
IP_PMTUDISC_PROBE well enough.

UFO enabled packet delivery just appends all frags to the cork and hands
it over to the network card. So we just deliver non-DF udp fragments
(DF-flag may get overwritten by hardware or virtual UFO enabled
interface).

UDP_CORK does enqueue the data until the cork is disengaged. At this
point it sets the correct IP_DF and local_df flags and hands it over to
ip_fragment which in this case will generate an icmp error which gets
appended to the error socket queue. This is not reflected in the syscall
error (of course, if UFO is enabled this also won't happen).

Improve this by checking the pmtudisc flags before appending data to the
socket and if we still can fit all data in one packet when IP_PMTUDISC_DO
or IP_PMTUDISC_PROBE is set, only then proceed.

We use (mtu-fragheaderlen) to check for the maximum length because we
ensure not to generate a fragment and non-fragmented data does not need
to have its length aligned on 64 bit boundaries. Also the passed in
ip_options are already aligned correctly.

Maybe, we can relax some other checks around ip_fragment. This needs
more research.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

daba287b
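
The length check described in the commit above can be sketched as follows. This is a hedged illustration rather than the kernel code: `check_append` and its parameters are hypothetical names, returning `EMSGSIZE` is an assumption about the reported error, and applying no limit in the other modes is a simplification. The `IP_PMTUDISC_*` values are the ones used by the Linux socket API.

```python
import errno

# Values of the IP_MTU_DISCOVER socket option on Linux.
IP_PMTUDISC_DONT = 0
IP_PMTUDISC_WANT = 1
IP_PMTUDISC_DO = 2
IP_PMTUDISC_PROBE = 3


def check_append(pmtudisc: int, queued_len: int, new_len: int,
                 mtu: int, fragheaderlen: int) -> int:
    """Return 0 if new_len bytes may be appended to the cork, else an errno.

    With DO or PROBE set, all corked data must still fit into a single
    packet, so the limit is (mtu - fragheaderlen), as the commit describes.
    """
    if pmtudisc in (IP_PMTUDISC_DO, IP_PMTUDISC_PROBE):
        max_len = mtu - fragheaderlen
        if queued_len + new_len > max_len:
            return errno.EMSGSIZE   # assumption: error surfaced to the syscall
    return 0                        # simplification: no limit in other modes
```

For an MTU of 1500 and a 20-byte header, appending 481 bytes to 1000 already queued bytes is rejected under `IP_PMTUDISC_DO`, while 480 bytes are accepted.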

virtio_net: migrate mergeable rx buffers to page frag allocators · 2613af0e

Committed by Michael Dalton on Oct 28, 2013

The virtio_net driver's mergeable receive buffer allocator
uses 4KB packet buffers. For MTU-sized traffic, SKB truesize
is > 4KB but only ~1500 bytes of the buffer is used to store
packet data, reducing the effective TCP window size
substantially. This patch addresses the performance concerns
with mergeable receive buffers by allocating MTU-sized packet
buffers using page frag allocators. If more than MAX_SKB_FRAGS
buffers are needed, the SKB frag_list is used.
Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2613af0e
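
The effect of truesize on the usable receive window, as described in the commit above, can be illustrated with some rough arithmetic. This is only a sketch under stated assumptions: `payload_capacity` is a hypothetical helper, the 1 MiB receive budget and the 1536-byte buffer size are made-up numbers, and sk_buff overhead is ignored, so truesize is approximated by the buffer size.

```python
def payload_capacity(rcvbuf_bytes: int, truesize: int, payload: int) -> int:
    """Payload bytes that fit when every packet is charged `truesize` bytes."""
    packets = rcvbuf_bytes // truesize   # whole packets the budget can hold
    return packets * payload


RCVBUF = 1 << 20        # hypothetical 1 MiB receive budget
PAYLOAD = 1500          # MTU-sized packet data

# 4KB buffers: most of each buffer is charged to the socket but unused.
with_4k_buffers = payload_capacity(RCVBUF, 4096, PAYLOAD)

# Hypothetical MTU-sized buffers carved out of page frags.
with_mtu_buffers = payload_capacity(RCVBUF, 1536, PAYLOAD)
```

Under these assumptions the same budget holds 384000 payload bytes with 4KB buffers and 1023000 payload bytes with MTU-sized buffers, which is the kind of gap the commit message refers to.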

ipv6: Remove privacy config option. · 5d9efa7e

Committed by David S. Miller on Oct 28, 2013

The code for privacy extentions is very mature, and making it
configurable only gives marginal memory/code savings in exchange
for obfuscation and hard to read code via CPP ifdef'ery.
Signed-off-by: David S. Miller <davem@davemloft.net>

5d9efa7e

Merge branch '6lowpan' · d5d45d42

Committed by David S. Miller on Oct 28, 2013

Alexander Aring says:

====================
6lowpan: trivial changes

This patch series includes some trivial changes to prepare the 6lowpan stack
for upcomming patch-series which mainly fix fragmentation according to rfc4944
and udp handling(which is currently broken).

Changes since v3:
  - really fix intendation in patch 3/5

Changes since v2:
  - change intendation in patch 3/5
  - fix typo in 5/5 unecessary -> unnecessary
  - add missing 6lowpan tag in cover-letter
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d5d45d42

6lowpan: remove unnecessary break · 8ef007fd

Committed by Alexander Aring on Oct 28, 2013

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ef007fd

6lowpan: remove skb->dev assignment · b236b954

Committed by Alexander Aring on Oct 28, 2013

This patch removes the assignment of skb->dev. We don't need it here because
we use the netdev_alloc_skb_ip_align function which already sets the
skb->dev.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

b236b954

6lowpan: use netdev_alloc_skb instead dev_alloc_skb · b614442f

Committed by Alexander Aring on Oct 28, 2013

This patch uses the netdev_alloc_skb instead dev_alloc_skb function and
drops the seperate assignment to skb->dev.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

b614442f

6lowpan: remove unnecessary check on err >= 0 · 53cb5717

Committed by Alexander Aring on Oct 28, 2013

The err variable can only be zero in this case.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

53cb5717

6lowpan: remove unnecessary ret variable · 545f3613

Committed by Alexander Aring on Oct 28, 2013

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

545f3613

28 Oct, 2013 13 commits

sctp: merge two if statements to one · 747edc0f

Committed by wangweidong on Oct 26, 2013

Two if statements do the same work, we can merge them to
one. And fix some typos. There is just code simplification,
no functional changes.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

747edc0f

sctp: remove the repeat initialize with 0 · 3dc0a548

Committed by wangweidong on Oct 26, 2013

kmem_cache_zalloc had set the allocated memory to zero. I think no need
to initialize with 0. And move the comments to the function begin.
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3dc0a548

sctp: fix some comments in chunk.c and associola.c · 2bccbadf

Committed by wangweidong on Oct 26, 2013

fix some typos
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2bccbadf

veth: extend features to support tunneling · 82d81898

Committed by Eric Dumazet on Oct 25, 2013

While investigating on a recent vxlan regression, I found veth
was using a zero features set for vxlan tunnels.

We have to segment GSO frames, copy the payload, and do the checksum.

This patch brings a ~200% performance increase

We probably have to add hw_enc_features support
on other virtual devices.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

82d81898

inet: restore gso for vxlan · 8c3a897b

Committed by Eric Dumazet on Oct 27, 2013

Alexei reported a performance regression on vxlan, caused
by commit 3347c960 "ipv4: gso: make inet_gso_segment() stackable"

GSO vxlan packets were not properly segmented, adding IP fragments
while they were not expected.

Rename 'bool tunnel' to 'bool encap', and add a new boolean
to express the fact that UDP should be fragmented.
This fragmentation is triggered by skb->encapsulation being set.

Remove a "skb->encapsulation = 1" added in above commit,
as its not needed, as frags inherit skb->frag from original
GSO skb.
Reported-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8c3a897b

Revert "Merge branch 'bonding_monitor_locking'" · 1f2cd845

Committed by David S. Miller on Oct 28, 2013

This reverts commit 4d961a10, reversing
changes made to a00f6fcc.

Revert bond locking changes, they cause regressions and Veaceslav Falico
doesn't like how the commit messages were done at all.
Signed-off-by: David S. Miller <davem@davemloft.net>

1f2cd845

be2net: add support for ndo_busy_poll · 6384a4d0

Committed by Sathya Perla on Oct 25, 2013

Includes:
- ndo_busy_poll implementation
- Locking between napi and busy_poll
- Fix rx_post_starvation (replenish rx-queues in out-of-mememory scenario)
  logic to accomodate busy_poll.

v2 changes:
[Eric D.'s comment] call alloc_pages() with GFP_ATOMIC even in ndo_busy_poll
context as it is not allowed to sleep.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6384a4d0

Merge branch 'bonding_monitor_locking' · 4d961a10

Committed by David S. Miller on Oct 27, 2013

Ding Tianhong says:

====================
bonding: patchset for rcu use in bonding

The slave list will add and del by bond_master_upper_dev_link() and
bond_upper_dev_unlink(), which will call call_netdevice_notifiers(),
even it is safe to call it in write bond lock now, but we can't sure
that whether it is safe later, because other drivers may deal
NETDEV_CHANGEUPPER in sleep way, so I didn't admit move the
bond_upper_dev_unlink() in write bond lock.

now the bond_for_each_slave only protect by rtnl_lock(), maybe use
bond_for_each_slave_rcu is a good way to protect slave list for bond,
but as a system slow path, it is no need to transform
bond_for_each_slave() to bond_for_each_slave_rcu() in slow path, so in
the patchset, I will remove the unused read bond lock for monitor
function, maybe it is a better way, I will wait to accept any relay
for it.

Thanks for the Veaceslav Falico opinion.

v2: add and modify commit for patchset and patch, it will be the first
step for the whole patchset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4d961a10

bonding: remove bond read lock for bond_3ad_state_machine_handler() · 5cc172c6